CN108981698B - Visual positioning method based on multi-mode data - Google Patents

Visual positioning method based on multi-mode data

Info

Publication number
CN108981698B
Authority
CN
China
Prior art keywords
features
gist
bow
color
data
Prior art date
Legal status
Active
Application number
CN201810534761.3A
Other languages
Chinese (zh)
Other versions
CN108981698A (en)
Inventor
程瑞琦
林书妃
杨恺伦
汪凯巍
于红雷
Current Assignee
Hangzhou Kr Vision Technology Co ltd
Original Assignee
Hangzhou Kr Vision Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Kr Vision Technology Co ltd
Priority to CN201810534761.3A
Publication of CN108981698A
Application granted
Publication of CN108981698B
Legal status: Active

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features

Landscapes

  • Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Navigation (AREA)

Abstract

The invention discloses a visual positioning method based on multi-modal data. The method uses multi-modal data acquired by a GNSS module and a camera, processes the acquired data on a small processor, and outputs a positioning result. It can be used for positioning under different illumination conditions such as day and night, with a low false-detection rate, a low miss rate, good real-time performance, and good cross-platform performance, and it can well meet the application requirement of accurate positioning for the visually impaired.

Description

Visual positioning method based on multi-mode data
Technical Field
The invention belongs to the technical fields of image processing, signal processing, and computer vision, and relates to a visual positioning method based on multi-modal data.
Background
Visual information is the most important source of information for humans to perceive the surrounding environment; about 80% of the information humans obtain comes through the visual system. According to statistics of the World Health Organization, there are 285 million visually impaired people in the world. Visually impaired people lack normal vision and have difficulty perceiving color and shape. Many of them currently use white canes or guide dogs to assist their daily lives. White canes are not sufficient to solve all the difficulties encountered while traveling. Guide dogs can lead visually impaired people away from danger when walking on the road, but they are not available to all visually impaired people because of the great cost of training them. Traditional tools such as canes and guide dogs therefore cannot provide sufficient assistance for travel. Since their development, various Electronic Travel Aid (ETA) devices have been considered an effective means of assisting visually impaired people to travel under various conditions. To help users find their way, many assistance systems deploy depth cameras to detect accessible paths and obstacles. However, few of these systems integrate accurate positioning. Owing to their limited vision, visually impaired people cannot position themselves accurately when traveling outdoors, and the positioning accuracy of GNSS devices typically ranges from several meters to more than ten meters. Accurate positioning is therefore essential for the outdoor travel of visually impaired people.
Much work has been devoted to the visual localization problem. However, most current solutions target the automated navigation of robots or unmanned vehicles. In automatic navigation the camera is static relative to its carrier and the shooting direction is fixed; blind-assistance applications are quite different, since the images captured by a handheld camera are unstable and the camera may point in any direction. A visual localization method for visually impaired people must therefore be robust enough to cope with a wide variety of environments. As an aid for visually impaired people, the method should also have a very low false-alarm rate, which is critical for user safety. In addition, real-time operation is a requirement of the algorithm: because it must run on portable platforms, the limited system resources demand an efficient algorithm to maintain a moderate frame rate.
Disclosure of Invention
The invention aims to provide a visual positioning method based on multi-modal data that overcomes the defects of the prior art.
The purpose of the invention is achieved by the following technical solution: a visual positioning method based on multi-modal data, comprising the following steps:
(1) Establish a position P–feature database W, where the feature data W comprise longitude data Lon, latitude data Lat, and three features GIST, LDB, and BoW. The three features are extracted from image information comprising a Color image Color, a Depth image Depth, and an infrared image IR collected at position P. The extraction method is: extract the GIST feature from Color; extract LDB features from Color, Depth, and IR respectively and splice the three LDB features into one LDB feature; and extract the BoW feature from Color;
(2) When positioning is needed, acquire a Color image Color', a Depth image Depth', and an infrared image IR' at the position to be located, together with longitude data Lon' and latitude data Lat'; extract the GIST' feature from Color'; extract LDB' features from Color', Depth', and IR' respectively and splice the three into one LDB' feature; and extract the BoW' feature from Color';
(3) Screen out approximate positions P according to Lon' and Lat', satisfying the following conditions:

|Lon' - Lon| < t and |Lat' - Lat| < t

where t is an approximate threshold with 0 < t < 1, and Lon and Lat are respectively the longitude data Lon and latitude data Lat of a position P stored in the database.
(4) Among the approximate positions screened out in step (3), search for the nearest position according to each of the three features GIST', LDB', and BoW', obtaining three nearest positions P_GIST, P_LDB, and P_BoW respectively, where the distance between the GIST features of two positions is calculated using the Euclidean distance, the distance between two LDB features using the Hamming distance, and the distance between two BoW features using the L1 distance;
(5) If P_GIST, P_LDB, and P_BoW coincide, the final positioning result is P_0 = P_GIST = P_LDB = P_BoW; if the three points do not coincide, the final positioning result is the center point of the three nearest positions.
Further, the three LDB features are spliced into one LDB feature as follows: LDB features are extracted from Color, Depth, and IR respectively, denoted LDBc, LDBd, and LDBi, and the three are concatenated end to end into one LDB feature, as in the sketch below.
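The splicing can be illustrated with a minimal Python sketch (our illustration, not the patent's implementation; it assumes each LDB descriptor has already been extracted as a one-dimensional bit-packed uint8 NumPy array, and the function and variable names are hypothetical):

    import numpy as np

    def splice_ldb(ldb_c, ldb_d, ldb_i):
        # Concatenate the LDB descriptors extracted from the Color, Depth
        # and IR images end to end into a single descriptor.
        return np.concatenate([ldb_c, ldb_d, ldb_i])

Because the concatenation is end to end, the Hamming distance between two spliced descriptors is simply the sum of the Hamming distances of their color, depth, and infrared parts.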
Further, the center point of the three nearest positions is obtained by (a short sketch follows):

Lon_0 = (Lon_GIST + Lon_LDB + Lon_BoW) / 3
Lat_0 = (Lat_GIST + Lat_LDB + Lat_BoW) / 3

where (Lon_0, Lat_0) are the coordinates of the positioning result P_0, (Lon_GIST, Lat_GIST) those of P_GIST, (Lon_LDB, Lat_LDB) those of P_LDB, and (Lon_BoW, Lat_BoW) those of P_BoW.
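Step (5) together with the center-point formula above can be sketched as follows (a hedged illustration under the assumption that positions are (Lon, Lat) tuples; the function name is ours):

    def fuse_positions(p_gist, p_ldb, p_bow):
        # If the three nearest-neighbour positions coincide, return that
        # position directly as the final result P_0.
        if p_gist == p_ldb == p_bow:
            return p_gist
        # Otherwise return the center point (arithmetic mean) of the three.
        lon0 = (p_gist[0] + p_ldb[0] + p_bow[0]) / 3.0
        lat0 = (p_gist[1] + p_ldb[1] + p_bow[1]) / 3.0
        return (lon0, lat0)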
Compared with conventional positioning methods, the present method has the following advantages:
1. High positioning accuracy: compared with positioning by a GNSS module alone, vision-assisted positioning improves the accuracy.
2. Environmental adaptability: the method can perform positioning using visual image information under different illumination conditions such as strong light and weak light.
3. Good real-time performance: the method can run in real time, without delay, on a mobile platform (such as a mobile phone) under various conditions.
4. Good portability: the core of the method requires only a camera, a processor, and an earphone for audio feedback, so it can be conveniently ported to smart devices such as mobile phones and tablets.
Drawings
FIG. 1 is a flow chart of the visual positioning method for visually impaired people based on multi-modal data;
FIG. 2 is a schematic diagram of an original image acquired by the camera.
Detailed Description
A visual positioning method based on multi-modal data comprises the following steps:
(1) Establish a position P–feature database W, where the feature data W comprise longitude data Lon, latitude data Lat, and three features GIST, LDB, and BoW. The longitude and latitude are stored in the format specified by the NMEA-0183 protocol, as shown in the table below. The three features are extracted from image information comprising a Color image Color, a Depth image Depth, and an infrared image IR collected at position P. Under normal circumstances the database stores a number of common key positions;
(Table: NMEA-0183 format of the stored longitude and latitude fields; the original table is not reproduced here.)
(2) When positioning is needed, acquire a Color image Color', a Depth image Depth', and an infrared image IR' at the position to be located, together with longitude data Lon' and latitude data Lat'; extract the GIST' feature from Color'; extract LDB' features from Color', Depth', and IR' respectively and splice the three into one LDB' feature; and extract the BoW' feature from Color';
(3) Screen out approximate positions P according to Lon' and Lat', satisfying the following conditions:

|Lon' - Lon| < t and |Lat' - Lat| < t

where t is an approximate threshold with 0 < t < 1, chosen according to the positioning accuracy of the GNSS module: a larger value is selected when GNSS accuracy is degraded by building occlusion, rainy or foggy weather, and the like, and a smaller value otherwise. Lon and Lat are respectively the longitude data Lon and latitude data Lat of a position P stored in the database.
(4) Among the approximate positions screened out in step (3), search for the nearest position according to each of the three features GIST', LDB', and BoW', obtaining three nearest positions P_GIST, P_LDB, and P_BoW respectively, where the distance between the GIST features of two positions is calculated using the Euclidean distance, the distance between two LDB features using the Hamming distance, and the distance between two BoW features using the L1 distance (a code sketch of steps (3)-(5) follows this section);
(5) If P_GIST, P_LDB, and P_BoW coincide, the final positioning result is P_0 = P_GIST = P_LDB = P_BoW; if the three points do not coincide, the final positioning result is the center point of the three nearest positions, namely:

Lon_0 = (Lon_GIST + Lon_LDB + Lon_BoW) / 3
Lat_0 = (Lat_GIST + Lat_LDB + Lat_BoW) / 3

where (Lon_0, Lat_0) are the coordinates of the positioning result P_0, (Lon_GIST, Lat_GIST) those of P_GIST, (Lon_LDB, Lat_LDB) those of P_LDB, and (Lon_BoW, Lat_BoW) those of P_BoW.
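For concreteness, the sketch below strings steps (3)-(5) together. It is an illustration only: the database schema (a list of dicts with keys "lon", "lat", "gist", "ldb", "bow"), the function names, and the feature representations (GIST as a float vector, LDB as a bit-packed uint8 array, BoW as a histogram vector) are assumptions, not part of the patent text.

    import numpy as np

    def screen_by_gnss(db, lon_q, lat_q, t):
        # Step (3): keep only database entries whose stored longitude and
        # latitude lie within the approximate threshold t of the GNSS reading.
        return [e for e in db
                if abs(e["lon"] - lon_q) < t and abs(e["lat"] - lat_q) < t]

    def nearest(candidates, query, key, dist):
        # Nearest-neighbour search over a single feature under a given metric.
        best = min(candidates, key=lambda e: dist(e[key], query))
        return (best["lon"], best["lat"])

    # Step (4): one search per feature, each with its own distance metric.
    def euclidean(a, b):   # GIST features (float vectors)
        return float(np.linalg.norm(a - b))

    def hamming(a, b):     # LDB features (bit-packed uint8 arrays)
        return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

    def l1(a, b):          # BoW features (histogram vectors)
        return float(np.abs(a - b).sum())

    def localize(db, lon_q, lat_q, t, gist_q, ldb_q, bow_q):
        cand = screen_by_gnss(db, lon_q, lat_q, t)
        p_gist = nearest(cand, gist_q, "gist", euclidean)
        p_ldb = nearest(cand, ldb_q, "ldb", hamming)
        p_bow = nearest(cand, bow_q, "bow", l1)
        # Step (5): fuse the three candidates (see fuse_positions above).
        if p_gist == p_ldb == p_bow:
            return p_gist
        return ((p_gist[0] + p_ldb[0] + p_bow[0]) / 3.0,
                (p_gist[1] + p_ldb[1] + p_bow[1]) / 3.0)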

Claims (3)

1. A visual positioning method based on multi-modal data is characterized by comprising the following steps:
(1) establishing a position P–feature data W database, wherein the feature data W comprise longitude data Lon, latitude data Lat, and three features GIST, LDB, and BoW; the three features GIST, LDB, and BoW are extracted from image information, the image information comprising a Color image Color, a Depth image Depth, and an infrared image IR collected at position P; the extraction method comprises: extracting the GIST feature from Color, extracting LDB features from Color, Depth, and IR respectively, and splicing the three LDB features into one LDB feature;
(2) when positioning is needed, acquiring a Color image Color', a Depth image Depth', and an infrared image IR' at the position to be located, together with longitude data Lon' and latitude data Lat'; extracting the GIST' feature from Color'; extracting LDB' features from Color', Depth', and IR' respectively and splicing the three LDB' features into one LDB' feature; and extracting the BoW' feature from Color';
(3) screening out approximate positions according to Lon' and Lat', satisfying the following conditions:

|Lon' - Lon| < t and |Lat' - Lat| < t

where t is an approximate threshold with 0 < t < 1, and Lon and Lat are respectively the longitude data Lon and latitude data Lat of a position P stored in the database;
(4) among the approximate positions screened out in step (3), searching for the nearest position according to each of the three features GIST', LDB', and BoW', obtaining three nearest positions P_GIST, P_LDB, and P_BoW respectively, wherein the distance between GIST and GIST' features is calculated using the Euclidean distance, the distance between LDB and LDB' features using the Hamming distance, and the distance between BoW and BoW' features using the L1 distance;
(5) if P_GIST, P_LDB, and P_BoW coincide, the final positioning result is P_0 = P_GIST = P_LDB = P_BoW; if the three points do not coincide, the final positioning result is the center point of the three nearest positions.
2. The method of claim 1, wherein the three LDB features are spliced into one LDB feature by extracting LDB features from Color, Depth, and IR respectively, denoting them LDBc, LDBd, and LDBi, and concatenating the three end to end into one LDB feature.
3. The method of claim 1, wherein the center point of the three nearest positions is obtained by:

Lon_0 = (Lon_GIST + Lon_LDB + Lon_BoW) / 3
Lat_0 = (Lat_GIST + Lat_LDB + Lat_BoW) / 3

where (Lon_0, Lat_0) are the coordinates of the positioning result P_0, (Lon_GIST, Lat_GIST) those of P_GIST, (Lon_LDB, Lat_LDB) those of P_LDB, and (Lon_BoW, Lat_BoW) those of P_BoW.
CN201810534761.3A 2018-05-29 2018-05-29 Visual positioning method based on multi-mode data Active CN108981698B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810534761.3A CN108981698B (en) 2018-05-29 2018-05-29 Visual positioning method based on multi-mode data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810534761.3A CN108981698B (en) 2018-05-29 2018-05-29 Visual positioning method based on multi-mode data

Publications (2)

Publication Number Publication Date
CN108981698A CN108981698A (en) 2018-12-11
CN108981698B true CN108981698B (en) 2020-07-14

Family

ID=64542769

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810534761.3A Active CN108981698B (en) 2018-05-29 2018-05-29 Visual positioning method based on multi-mode data

Country Status (1)

Country Link
CN (1) CN108981698B (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070257903A1 (en) * 2006-05-04 2007-11-08 Harris Corporation Geographic information system (gis) for displaying 3d geospatial images with reference markers and related methods

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104820718A (en) * 2015-05-22 2015-08-05 哈尔滨工业大学 Image classification and searching method based on geographic position characteristics and overall situation vision characteristics
CN105716609A (en) * 2016-01-15 2016-06-29 浙江梧斯源通信科技股份有限公司 Indoor robot vision positioning method
CN106920250A (en) * 2017-02-14 2017-07-04 华中科技大学 Robot target identification and localization method and system based on RGB D videos
CN107451593A (en) * 2017-07-07 2017-12-08 西安交通大学 A kind of high-precision GPS localization method based on image characteristic point
CN107609565A (en) * 2017-09-21 2018-01-19 哈尔滨工业大学 A kind of indoor vision positioning method based on image overall feature principal component linear regression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Visual localization across seasons using sequence matching based on multi-feature combination";Yongliang Qiao 等,;《Sensors》;20171025;1-22页 *

Also Published As

Publication number Publication date
CN108981698A (en) 2018-12-11

Similar Documents

Publication Publication Date Title
US11604076B2 (en) Vision augmented navigation
KR102266830B1 (en) Lane determination method, device and storage medium
US7230538B2 (en) Apparatus and method for identifying surrounding environment by means of image processing and for outputting the results
Angin et al. A mobile-cloud collaborative traffic lights detector for blind navigation
KR100533033B1 (en) Position tracing system and method using digital video process technic
Jie et al. A new traffic light detection and recognition algorithm for electronic travel aid
Tian et al. Dynamic crosswalk scene understanding for the visually impaired
US20200288532A1 (en) Intelligent disaster prevention system and intelligent disaster prevention method
CN110164164B (en) Method for enhancing accuracy of mobile phone navigation software for identifying complex road by utilizing camera shooting function
Yusro et al. SEES: Concept and design of a smart environment explorer stick
Parikh et al. Android smartphone based visual object recognition for visually impaired using deep learning
CN106372610A (en) Foreground information prompt method based on intelligent glasses, and intelligent glasses
CN111767831B (en) Method, apparatus, device and storage medium for processing image
CN103312899A (en) Smart phone with blind guide function
CN108721069B (en) Blind person auxiliary glasses based on multi-mode data for visual positioning
CN115272949A (en) Pedestrian tracking method and system based on geographic spatial information
CN108981698B (en) Visual positioning method based on multi-mode data
CN112932910A (en) Wearable intelligent sensing blind guiding system
CN202350794U (en) Navigation data acquisition device
TWI451990B (en) System and method for lane localization and markings
Kamasaka et al. Image based location estimation for walking out of visual impaired person
CN114283278A (en) Infectious disease prevention and control device and method
CN106874945B (en) Sidewalk traffic light detection system and method for visually impaired people
CN114283279A (en) Travel card identification method applied to infectious disease prevention and control and infectious disease prevention and control method
CN114323013A (en) Method for determining position information of a device in a scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant