CN110109457A - A kind of intelligent sound blind-guidance robot control method and control system - Google Patents

A kind of intelligent sound blind-guidance robot control method and control system Download PDF

Info

Publication number
CN110109457A
Authority
CN
China
Prior art keywords
image
barrier
control unit
robot
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910355337.7A
Other languages
Chinese (zh)
Inventor
马行
王政博
黄全进
穆春阳
张春涛
陈建宇
杨玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North Minzu University
Original Assignee
North Minzu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North Minzu University filed Critical North Minzu University
Priority to CN201910355337.7A priority Critical patent/CN110109457A/en
Publication of CN110109457A publication Critical patent/CN110109457A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02Control of position or course in two dimensions
    • G05D1/021Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0231Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
    • G05D1/0246Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an intelligent voice blind-guidance robot control method and control system. A YOLO neural network detects and identifies objects to obtain obstacle category information; binocular calibration and stereo matching of a binocular camera realise the ranging function; espeak speech synthesis broadcasts the obtained obstacle category information together with the distance and bearing between the obstacle and the robot, realising obstacle broadcasting. Near-infrared sensors detect obstacles in the field of view, a travelling mechanism performs walking and obstacle avoidance, and information is conveyed to the user in speech form, enhancing the user's perception of the surrounding environment. The invention has high recognition accuracy and robust performance in complex environments, and realises accurate, intelligent voice broadcasting that prompts the blind user with the categories of objects ahead and gives the distance of the nearest object, so that the user gains an accurate understanding of the environment; it improves quality of life and meets the needs of blind travel.

Description

A kind of intelligent sound blind-guidance robot control method and control system
Technical field
The invention belongs to the field of blind-guidance robots, and in particular relates to an intelligent voice blind-guidance robot control method and control system.
Background technique
According to national statistics, China has about 17.31 million visually impaired people, of whom more than 5 million are blind, and roughly 400,000 more people become blind each year. In daily life, most information is obtained through vision; the physiological defect, combined with a complex social environment, brings great inconvenience to the lives of blind people.
The state has increased its support for blind people: institutions for training blind masseurs have multiplied, blind people have more employment opportunities, and the appearance of white canes, guide dogs and the like has greatly helped their travel. However, there is currently no guide robot for blind people that provides intelligent guidance and distance information according to the categories of surrounding objects and their distance and angle relative to the blind user.
Summary of the invention
The purpose of the present invention is to provide an intelligent voice blind-guidance robot control method and control system that overcome the deficiencies of the prior art.
To achieve the above purpose, the present invention adopts the following technical scheme:
An intelligent voice blind-guidance robot control method, comprising the following steps:
Step 1) acquires images in front of the robot with a binocular camera, then classifies and identifies obstacles in front of the robot in the acquired images with the yolo algorithm to obtain obstacle category information;
Step 2) performs binocular calibration on the binocular camera, and obtains the distance and bearing between an obstacle and the robot with the calibrated binocular camera;
Step 3) broadcasts, by espeak speech synthesis, the obstacle category information obtained in step 1) and the distance and bearing between the obstacle and the robot obtained in step 2);
Step 4) detects, with infrared sensors, on which side of the vehicle an obstacle lies, and controls the vehicle to move in the direction opposite to the side of the vehicle where the obstacle lies.
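Before the further refinements below, the four steps can be pictured as one control loop. The following is a minimal Python sketch; the helper names (detect_obstacles, stereo_distance_bearing, announce, avoid) are hypothetical stand-ins for the YOLO detection, stereo ranging, espeak broadcast and infrared avoidance routines detailed later, not functions from the patent.

    # Minimal sketch of the four-step control loop (steps 1-4 above).
    # All helper functions are hypothetical placeholders, stubbed out
    # so the sketch runs on its own.

    def detect_obstacles(image):
        """Step 1: YOLO classification; returns a list of (label, box)."""
        return [("chair", (120, 80, 60, 90))]      # stub result

    def stereo_distance_bearing(left, right, box):
        """Step 2: stereo matching; returns (distance_m, bearing_deg)."""
        return 1.2, -15.0                          # stub result

    def announce(text):
        """Step 3: speech broadcast (see the espeak sketch later)."""
        print("speak:", text)

    def avoid(ir_signals):
        """Step 4: drive opposite the obstacle (see the Mecanum sketch)."""
        print("avoiding:", ir_signals)

    def control_loop(left_img, right_img, ir_signals):
        for label, box in detect_obstacles(left_img):
            dist, bearing = stereo_distance_bearing(left_img, right_img, box)
            announce(f"{label}, {dist:.1f} metres, {bearing:+.0f} degrees")
        avoid(ir_signals)

    control_loop(None, None, {"front": False, "left": True})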
Further, in step 1), neural-network sample training and testing are performed on the acquired images with the YOLO algorithm, based on the darknet neural network structure.
Further, the image acquired by the binocular camera is divided into an S*S grid of cells, and each cell is responsible for detecting whether the centre of an object to be detected falls within it. Each cell predicts B detection boxes, and each box contains 5 values: x, y, w, h, Cobj, which respectively represent the abscissa of the box centre, the ordinate of the box centre, the width relative to the entire image, the height relative to the entire image, and the confidence of the bounding box. The confidence is shown in formulas (1) and (2):

Cobj = Pr(Object) × IOU(truth, pred)    (1)
IOU(truth, pred) = area(box_pred ∩ box_truth) / area(box_pred ∪ box_truth)    (2)

In formula (1), Pr(Object) is the probability that an object is present in the cell, 1 if present and 0 otherwise, and IOU is the intersection-over-union of the reference box and the detection box, i.e. the ratio of the area of their intersection to the area of their union;
The YOLO algorithm extracts features in the convolutional layers and performs target prediction in the fully connected layer. When Pr(Object)=1, the confidence of a given class in the whole image is shown in formula (3):

Pr(Class_i|Object) × Pr(Object) × IOU(truth, pred) = Pr(Class_i) × IOU(truth, pred)    (3)

In the formula, Pr(Class_i|Object) is the class-conditional probability of the target to be detected and Pr(Class_i) is the probability of predicting the given class. A detection-box threshold is set, bounding boxes whose score is below the threshold are filtered out, and non-maximum suppression is applied to the remaining boxes, yielding the obstacles in the image and their corresponding confidences, i.e. the obstacle category information.
Further, the binocular camera is calibrated: the relationship between an image point p = [x y]T and a space point P = [X Y Z]T is defined as p = sHP
In the formula, s is a scale factor and H is given by the matrix product H = AW; A is the camera intrinsic matrix, (fx, fy) are the scale factors of the image along the x-axis and y-axis, and (cx, cy) are the coordinates of the camera's principal point; W = [R T] is the rotation-translation transformation between the camera plane and the target object plane, i.e. the camera extrinsic matrix.
Further, stereo rectification is performed on the calibrated binocular camera: with the position of a point after correction expressed as (xp, yp) and the position after distortion expressed as (xd, yd), we have:

xd = xp(1 + k1·r² + k2·r⁴ + k3·r⁶) + 2p1·xp·yp + p2(r² + 2xp²)
yd = yp(1 + k1·r² + k2·r⁴ + k3·r⁶) + p1(r² + 2yp²) + 2p2·xp·yp,  where r² = xp² + yp²

In the formulas, k1, k2, k3 are the radial distortion parameters and p1, p2 are the tangential distortion parameters.
Further, the specific ranging formula is:

Z = f·T / (xl − xr)

In the formula, f is the camera focal length, T is the distance between the two camera centres, xl − xr is the disparity, Z is the depth information, and P is the position of the object to be measured.
Further, semi-global SGBM stereo matching is performed on the left and right images acquired by the rectified binocular camera to find the corresponding coordinates of the same object point in the left and right images and the disparity value d of the two corresponding coordinates; the depth of the object is calculated by the following formula:
Q[x y d 1]T = [X Y Z W]T    (4-11)
The three-dimensional coordinate of the space point is then (X/W, Y/W, Z/W).
Further, the x-direction difference between the position of the object's edge in image pixels and the centre pixel of the image, divided by the number of pixels in the x direction and multiplied by the view angle of the entire image, gives the angular offset between the real object and the robot; the view angle of the entire image is the field-of-view angle the image represents.
A blind-guidance robot system, comprising a Mecanum-wheeled vehicle and, mounted on the Mecanum-wheeled vehicle, a control system, a binocular camera, a voice announcer and infrared sensors;
The control system comprises a main control unit and, connected to the main control unit, an infrared ranging unit, a travel control unit and a speech control unit; the main control unit loads the program of the above method;
The infrared ranging unit obtains obstacle information around the Mecanum-wheeled vehicle through the infrared sensors and transmits the obtained obstacle information to the main control unit;
The main control unit generates avoidance route information from the obtained obstacle information and transmits the avoidance route information to the travel control unit, achieving obstacle avoidance for the Mecanum-wheeled vehicle; the travel control unit receives control signals from the main control unit and drives the motors of the Mecanum-wheeled vehicle;
The binocular camera acquires images in front of the Mecanum-wheeled vehicle and transmits the acquired images to the main control unit; the main control unit performs recognition and ranging on the acquired images, then broadcasts the recognition information and ranging information through the speech control unit and the voice announcer.
Further, the travel control unit uses an L298N motor driver module, and the Mecanum wheeled chassis is built with four Mecanum wheels each controlled by a DC motor.
Compared with the prior art, the invention has the following beneficial technical effects:
In the intelligent voice blind-guidance robot control method and control system of the present invention, a YOLO neural network detects and identifies objects to obtain obstacle category information; binocular calibration and stereo matching of the binocular camera realise the ranging function; espeak speech synthesis broadcasts the obtained obstacle category information together with the distance and bearing between the obstacle and the robot, realising obstacle broadcasting; near-infrared sensors detect obstacles in the field of view; a travelling mechanism performs walking and obstacle avoidance; and information is conveyed to the user in speech form, enhancing the user's perception of the surrounding environment. The invention has high recognition accuracy and robust performance in complex environments, and realises accurate, intelligent voice broadcasting that prompts the blind user with the categories of objects ahead and gives the distance of the nearest object, so that the user gains an accurate understanding of the environment; it improves quality of life and meets the needs of blind travel.
Further, performing neural-network sample training and testing on the acquired images with the YOLO algorithm, based on the darknet neural network structure, improves computational accuracy.
Further, calibrating the binocular camera reduces the radial and tangential distortion of the camera.
Detailed description of the invention
Fig. 1 is schematic structural view of the invention.
Fig. 2 is control system block diagram of the present invention.
Fig. 3 is the method for the present invention flow chart.
Fig. 4 is Mecanum wheel wheeled vehicle control flow chart.
Fig. 5 is object detection flow chart.
Fig. 6 is neural metwork training block diagram.
Fig. 7 is predicted position argument structure schematic diagram.
Fig. 8 is transfer layer structure chart.
Fig. 9 is the pinhole camera model diagram of the video camera.
Figure 10 is Stereo matching and ranging basic schematic diagram.
Specific embodiment
The invention is described in further detail below with reference to the accompanying drawings:
As shown in Fig. 1 and Fig. 2, an intelligent voice blind-guidance robot comprises a Mecanum-wheeled vehicle 2 and, mounted on the Mecanum-wheeled vehicle 2, a control system 1, a binocular camera 4, a voice announcer 3 and infrared sensors 5;
The control system comprises a main control unit and, connected to the main control unit, an infrared ranging unit, a travel control unit and a speech control unit;
The infrared ranging unit obtains obstacle information around the Mecanum-wheeled vehicle 2 through the infrared sensors 5 and transmits the obtained obstacle information to the main control unit;
The main control unit generates avoidance route information from the obtained obstacle information and transmits the avoidance route information to the travel control unit, achieving obstacle avoidance for the Mecanum-wheeled vehicle; the travel control unit receives control signals from the main control unit and drives the motors of the Mecanum-wheeled vehicle;
The binocular camera 4 acquires images in front of the Mecanum-wheeled vehicle and transmits the acquired images to the main control unit; the main control unit performs recognition and ranging on the acquired images, then broadcasts the recognition information and ranging information through the speech control unit and the voice announcer 3;
The travel control unit uses an L298N motor driver module; the Mecanum wheeled chassis is built with four Mecanum wheels each controlled by a DC motor, so that the vehicle can rotate 360 degrees in place, optimising the avoidance effect.
Machine vision enables the robot to detect and identify external objects and to perceive their distance and angle relative to the user, so as to find the optimal path suited to the blind user.
An intelligent voice blind-guidance robot control method, comprising the following steps:
Step 1) acquires images in front of the robot with the binocular camera, then classifies and identifies obstacles in front of the robot in the acquired images with the yolo algorithm;
In step 1), based on the darknet neural network structure, neural-network sample training and testing are performed on the acquired images with the YOLO algorithm, which better avoids false detections caused by the image background;
The network samples are trained mainly on common household obstacle categories; desks, chairs, potted plants, computers, cups and more than twenty other objects have been trained. After the camera acquires an image, the yolo algorithm identifies the trained object categories and outlines them, and the output result is written into the voice-broadcast program; the distance to the nearest object detected by binocular vision and its deviation angle from the robot are also written into the voice-broadcast program. The final broadcast content is mainly the categories of the obstacles ahead and the distance and angle of the nearest obstacle, reminding the blind user to dodge in time and to grasp the obstacle information ahead, so as to understand the road conditions in general.
Specifically, the full image acquired by the binocular camera is divided into an S×S grid, and each cell is responsible for the target detection of objects whose centre falls within that cell. The bounding boxes of all cells, their positioning confidences and all class probability vectors are predicted in a single pass. Each bounding box predicts 5 values in total: x, y, w, h and confidence. x and y respectively represent the abscissa and ordinate of the box centre, the P(x, y) coordinate being relative to the centre of the cell; w and h are the estimates of the width and height relative to the entire image; and confidence is the confidence of the bounding box, formed by the product of the probability that the box contains an object and the accuracy of the predicted box: confidence = Pr(Object) × IOU(truth, pred). At test time, the class information predicted by each cell and the confidence predicted by each bounding box are multiplied to obtain the class-specific confidence score of each bounding box:

Pr(Class_i|Object) × Pr(Object) × IOU(truth, pred) = Pr(Class_i) × IOU(truth, pred)    (4-1)

The first term on the left of formula (4-1) is the class information predicted by each cell, and the second and third terms are the confidence predicted by each bounding box. This product encodes both the probability that the predicted box belongs to a certain class and the accuracy of that box. After the class-specific confidence score of each box is obtained, a threshold is set, low-scoring boxes are filtered out, and NMS (non-maximum suppression) is applied to the remaining boxes, yielding the obstacles in the image and their corresponding confidences, i.e. the obstacle category information;
In addition, YOLO is best suited to the requirement here of detecting multiple kinds of obstacles simultaneously, so the object-detection part here uses the YOLOv2 detection system. The YOLO algorithm belongs to the CNN family and is likewise composed of convolutional layers, pooling layers and fully connected layers. What differs is that its output layer is a tensor, and that its training samples are whole images rather than specially cropped patches: the network is trained and tested on entire images, which improves the stability of the system and reduces false detections of the background.
The image acquired by the binocular camera is divided into an S*S grid of cells, and each cell is responsible for detecting whether the centre of an object to be detected falls within it. Each cell predicts B detection boxes, and each box contains 5 values (x, y, w, h, Cobj), respectively representing the abscissa of the box centre, the ordinate of the box centre, the width relative to the entire image, the height relative to the entire image and the confidence of the bounding box. The confidence is shown in formulas (1) and (2):

Cobj = Pr(Object) × IOU(truth, pred)    (1)
IOU(truth, pred) = area(box_pred ∩ box_truth) / area(box_pred ∪ box_truth)    (2)

In formula (1), Pr(Object) is the probability that an object is present in the cell, 1 if present and 0 otherwise, and IOU is the intersection-over-union of the reference box and the detection box;
The YOLO algorithm extracts features in the convolutional layers and performs target prediction in the fully connected layer. When Pr(Object)=1, the confidence of a given class in the whole picture is shown in formula (3).
In the formula, Pr(Class_i|Object) is the class-conditional probability of the target to be detected and Pr(Class_i) is the probability of predicting the given class. A detection-box threshold is set, bounding boxes whose score is below the threshold are filtered out, and non-maximum suppression is applied to the remaining boxes, yielding the obstacles in the image and their corresponding confidences, i.e. the obstacle category information.
Each cell predicts one set of class information, denoted as C classes; thus with an S×S grid, each cell predicting B bounding boxes and C classes, the network outputs an S×S×(5×B+C) tensor.
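The thresholding and non-maximum-suppression step described above can be sketched in Python as follows. This is a minimal illustration of formula (4-1) followed by a plain NMS, using toy numbers; it is not the patent's implementation.

    import numpy as np

    def iou(a, b):
        """IoU of two boxes given as (x1, y1, x2, y2)."""
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter + 1e-9)

    def filter_and_nms(boxes, confidences, class_probs, score_thr=0.3, iou_thr=0.5):
        """boxes: (N,4); confidences: (N,); class_probs: (N,C)."""
        scores = confidences[:, None] * class_probs    # formula (4-1)
        cls = scores.argmax(axis=1)                    # best class per box
        best = scores.max(axis=1)
        keep = best >= score_thr                       # threshold filter
        boxes, cls, best = boxes[keep], cls[keep], best[keep]
        order = best.argsort()[::-1]                   # high score first
        kept = []
        while order.size:
            i = order[0]
            kept.append(i)
            rest = [j for j in order[1:]
                    if cls[j] != cls[i] or iou(boxes[i], boxes[j]) < iou_thr]
            order = np.array(rest, dtype=int)
        return boxes[kept], cls[kept], best[kept]

    boxes = np.array([[10, 10, 60, 80], [12, 12, 62, 82], [100, 40, 150, 120]], float)
    conf = np.array([0.9, 0.8, 0.7])
    probs = np.tile([0.7, 0.2, 0.1], (3, 1))           # C = 3 toy classes
    print(filter_and_nms(boxes, conf, probs))          # the two overlapping boxes collapse to one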
Embodiment
The object-detection flow chart of this embodiment of the application is shown in Fig. 5. This experiment uses 20 classes of objects in total, so each cell predicts only the conditional class probabilities Pr(Class_i|Object), i = 1, 2, …, 20, of the 20 classes; each cell predicts the positions of B bounding boxes, i.e. these B bounding boxes share one set of conditional class probabilities Pr(Class_i|Object), i = 1, 2, …, 20. Based on the computed Pr(Class_i|Object), the class-associated confidence of a bounding box can be calculated at test time as in formula (4-1).
The YOLO neural-network training block diagram is shown in Fig. 6; the YOLO neural-network training process is as follows:
(1) convolve the entire image with 64 convolution kernels of size 7×7;
(2) extract features through a series of 3×3 and 1×1 convolutional layers;
(3) classify and regress with two fully connected layers, ultimately producing a 7×7×30 matrix: 7×7 represents dividing the image into a 7×7 grid, and the result of each cell is represented by a 30-dimensional feature vector; 2×5+20=30, where 2 represents the two boxes predicted by each cell, 5 represents the x, y, width, height and confidence of one box, and 20 represents the 20 object classes.
1.1 Pre-training size
Using the YOLOv2 algorithm for neural-network sample pre-training, pre-training is performed directly with a 448×448 network input, which yields a 3.7% improvement in effect;
1.2 Finer grid division
To improve small-object detection, the YOLOv2 algorithm reduces the number of pooling layers in the network so that the final feature map is larger. For example, with a 416×416 input, the output is 13×13×125, where 13×13 is the final feature map, i.e. the number of cells over the original image, and 125 is the bounding-box structure (5×(classes+5)) within each cell.
1.3 Fully convolutional network
To enable the network to accept input images of various sizes, the YOLOv2 algorithm removes the fully connected layers of the v1 network structure, because a fully connected layer necessarily requires fixed-length input and output feature vectors. Turning the whole network into a fully convolutional network allows detection at inputs of various sizes. At the same time, a fully convolutional network preserves the spatial position information of targets better than fully connected layers.
1.4 New base network
Comparing the computation of different base networks on a classification task, with the abscissa being the number of operations needed for one forward classification pass, darknet-19 (19 convolutional layers in total) is used as the basic pre-training network, enabling rapid computation while maintaining high accuracy.
1.5 Anchor mechanism
To improve precision and recall, the YOLOv2 algorithm adopts the anchor mechanism of Faster R-CNN: k reference anchors are set for each cell; during training, classification and regression losses are computed with the ground-truth anchors as the benchmark; at test time, k anchor boxes are predicted directly on each cell, each anchor box being a refined offset and width/height relative to its reference anchor. In this way, the regression of the full-image position of the bounding boxes in each cell in the original network is converted into a refinement relative to the position of the reference anchors;
The shapes and scales of the targets in the VOC and COCO datasets are computed offline by the k-means algorithm, yielding k=5, and 5 fixed ratio values are chosen so that the shape and scale of the anchors are closest to those of the targets in VOC and COCO.
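The offline clustering mentioned above can be sketched in Python as follows: k-means over labelled box widths and heights, with 1−IOU as the distance measure as in YOLOv2. The box list here is toy data standing in for the VOC/COCO annotations, and the mean-update step is a common implementation choice, not necessarily the patent's.

    import numpy as np

    def iou_wh(wh, centers):
        """IoU of one box against all centers, comparing (w, h) only."""
        w, h = wh
        inter = np.minimum(w, centers[:, 0]) * np.minimum(h, centers[:, 1])
        union = w * h + centers[:, 0] * centers[:, 1] - inter
        return inter / union

    def kmeans_anchors(wh_list, k=5, iters=100, seed=0):
        rng = np.random.default_rng(seed)
        boxes = np.asarray(wh_list, dtype=float)
        centers = boxes[rng.choice(len(boxes), k, replace=False)]
        for _ in range(iters):
            # assign each box to the center with the highest IoU (lowest 1-IoU)
            assign = np.array([iou_wh(b, centers).argmax() for b in boxes])
            new = np.array([boxes[assign == i].mean(axis=0)
                            if np.any(assign == i) else centers[i]
                            for i in range(k)])
            if np.allclose(new, centers):
                break
            centers = new
        return centers

    wh = [(30, 60), (35, 70), (120, 100), (110, 90), (60, 60),
          (300, 200), (280, 220), (15, 30), (200, 350), (210, 340)]
    print(kmeans_anchors(wh, k=5))   # five anchor (w, h) pairs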
1.6 New bounding-box prediction mode
As shown in Fig. 7, a strong-constraint method is used for the predicted position parameters:
The margin of the corresponding cell from the top-left corner is (Cx, Cy); σ is defined as the sigmoid activation function, whose value is constrained to [0, 1] and is used to predict the offset of the box centre relative to the cell (so the centre does not drift out of the cell);
The predetermined anchor (the bounding-box prior described in the text) has width and height (Pw, Ph); the width and height of the predicted location are obtained relative to the anchor by multiplying by a coefficient;
Now, the neural network predicts 5 bounding boxes (the value obtained by clustering) on each cell of the feature map (13×13), and each bounding box predicts 5 values, respectively tx, ty, tw, th, to, of which the first four are coordinates and to is the confidence. With the margin of this cell from the top-left corner of the image being (cx, cy) in formulas (4-4) and (4-5), and the width and height of the box (the bounding-box prior) corresponding to the cell being (pw, ph) in formulas (4-6) and (4-7), the predictions and the predicted value σ(to) of formula (4-8) can be expressed as:

bx = σ(tx) + cx    (4-4)
by = σ(ty) + cy    (4-5)
bw = pw·e^tw    (4-6)
bh = ph·e^th    (4-7)
Pr(object) × IOU(b, object) = σ(to)    (4-8)
1.7 Residual layer merging low-level features
As shown in the transfer-layer structure chart of Fig. 8, by simply adding a transfer layer (passthrough layer), the shallow feature map (resolution 26×26, 4 times the resolution of the bottom layer) is connected to the deep feature map;
That is, this transfer layer makes a single connection between feature maps of two resolutions, and the connection stacks the features into different channels rather than spatial positions. In this way the 26×26×512 feature map is converted into a 13×13×2048 feature map, and this feature map is concatenated with the original deep features.
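The channel-stacking rearrangement described above is, in effect, a 2×2 space-to-depth operation. The NumPy sketch below shows the 26×26×512 to 13×13×2048 conversion; it illustrates the reshaping only and is not the patent's code.

    import numpy as np

    def space_to_depth(x, block=2):
        """Stack each block x block spatial neighbourhood into channels."""
        h, w, c = x.shape
        x = x.reshape(h // block, block, w // block, block, c)
        x = x.transpose(0, 2, 1, 3, 4)      # group the 2x2 neighbours together
        return x.reshape(h // block, w // block, block * block * c)

    shallow = np.zeros((26, 26, 512), dtype=np.float32)
    print(space_to_depth(shallow).shape)    # (13, 13, 2048)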
Step 2) performs binocular calibration on the binocular camera, and obtains the distance and bearing between an obstacle and the robot with the calibrated binocular camera;
Stereo matching and ranging based on binocular vision:
Binocular calibration and stereo-matching ranging are performed on the binocular camera, completing the robot's perception of the distance and bearing of the nearest object;
The calibration objects here are two Logitech C270 cameras fixed in relative position. A 9×6 chessboard plane is used as the calibration template, with grid squares of side length 19 mm. In the experiment, the template plane is placed in the field of view of the two cameras and images of 640×480 pixels are acquired; the angle and position of the template are then adjusted so that the left and right cameras can successfully detect 20 pictures with corner coordinates. The binocular camera is calibrated with the Zhang Zhengyou calibration method, obtaining the matrix parameters between the two cameras and their relative positions, so that the binocular pair can be used for vision computation; the projection matrix, the intrinsic parameter matrix and the extrinsic parameter matrix are solved;
2.1 Camera calibration
The world coordinate system, image coordinate system and camera coordinate system are the coordinate systems needed in binocular vision. As shown in the pinhole model diagram of Fig. 9, the camera is approximated by a pinhole model; the relationship between an image point p = [x y]T and a space point P = [X Y Z]T can be expressed as: p = sHP
In the formula, s is a scale factor and H is given by the product of 2 matrices: H = AW. A is the camera intrinsic matrix, (fx, fy) are the scale factors of the image along the x-axis and y-axis, and (cx, cy) are the coordinates of the camera's principal point. W = [R T] is the rotation-translation transformation between the camera plane and the target object plane, i.e. the camera extrinsic matrix.
An actual camera has radial distortion and tangential distortion, so the camera's distortion parameters must be solved in order to correct the distorted images it produces. Let the position of a point after correction be expressed as (xp, yp) and the position after distortion as (xd, yd); then:

xd = xp(1 + k1·r² + k2·r⁴ + k3·r⁶) + 2p1·xp·yp + p2(r² + 2xp²)
yd = yp(1 + k1·r² + k2·r⁴ + k3·r⁶) + p1(r² + 2yp²) + 2p2·xp·yp,  where r² = xp² + yp²

In the formulas, k1, k2, k3 are the radial distortion parameters and p1, p2 are the tangential distortion parameters.
With the template's angle and position adjusted as described above so that the left and right cameras successfully detect the 20 pictures with corner coordinates, the calibration procedure is executed, and the projection matrix, intrinsic parameter matrix and extrinsic parameter matrix are solved as follows:
Dl = [-4.33424e-002, 2.64939e-001, 1.02745e-002, 6.62989e-003, -2.94308e-001]
Dr = [-6.57543e-002, 2.86837e-001, 5.98181e-003, -8.11034e-004, -5.05326e-001]
T = [-7.268001898891e+002, 2.117953028748e+000, 1.310076437020e+000]
Ml and Mr are the intrinsic parameters of the left and right cameras, R is the rotation matrix between the left and right cameras, T is the translation matrix of the left and right cameras, and Dl and Dr are the distortion parameters of the left and right cameras.
The rotation matrix and translation matrix jointly describe how a point is transformed from the world coordinate system into the camera coordinate system. Rotation matrix: describes the directions of the coordinate axes of the world coordinate system relative to the camera coordinate axes. Translation matrix: describes the position of the spatial origin in the camera coordinate system.
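The calibration just described can be reproduced with OpenCV's modern Python API. The sketch below assumes the patent's 9×6 board with 19 mm squares and 640×480 image pairs in hypothetical left/ and right/ folders; it follows the usual recipe of calibrating each camera first and then fixing the intrinsics for the stereo step.

    import glob
    import cv2
    import numpy as np

    PATTERN = (9, 6)        # inner corners of the 9x6 chessboard
    SQUARE_MM = 19.0        # square side length from the patent

    # 3D corner positions of the board in its own plane (Z = 0)
    objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

    obj_pts, left_pts, right_pts = [], [], []
    for lf, rf in zip(sorted(glob.glob("left/*.png")), sorted(glob.glob("right/*.png"))):
        gl = cv2.imread(lf, cv2.IMREAD_GRAYSCALE)
        gr = cv2.imread(rf, cv2.IMREAD_GRAYSCALE)
        okl, cl = cv2.findChessboardCorners(gl, PATTERN)
        okr, cr = cv2.findChessboardCorners(gr, PATTERN)
        if okl and okr:                     # keep pairs seen by both cameras
            obj_pts.append(objp)
            left_pts.append(cl)
            right_pts.append(cr)

    size = (640, 480)
    _, Ml, Dl, _, _ = cv2.calibrateCamera(obj_pts, left_pts, size, None, None)
    _, Mr, Dr, _, _ = cv2.calibrateCamera(obj_pts, right_pts, size, None, None)
    ret, Ml, Dl, Mr, Dr, R, T, E, F = cv2.stereoCalibrate(
        obj_pts, left_pts, right_pts, Ml, Dl, Mr, Dr, size,
        flags=cv2.CALIB_FIX_INTRINSIC)
    print("Dl =", Dl.ravel(), "\nDr =", Dr.ravel(), "\nT =", T.ravel())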
2.2 Stereo rectification
In real binocular stereo vision, the two camera planes are never perfectly coplanar and row-aligned, so we perform stereo rectification: the two images, which in practice are not coplanar row-aligned, are corrected into coplanar row alignment. OpenCV provides the cvStereoRectify function for us to rectify the left and right image pair.
To compute the disparity a target point forms on the left and right views, the corresponding pixels of that point on the two views must first be matched. However, matching corresponding points over a two-dimensional space is very time-consuming; to reduce the matching search range, we use the epipolar constraint to reduce the matching of corresponding points from a two-dimensional search to a linear search. The effect of stereo rectification is precisely that the two undistorted images correspond strictly row by row, so that the epipolar lines of the two images lie exactly on the same horizontal line; any point on one image and its corresponding point on the other then necessarily have the same row number, and a linear search within that row suffices to match the corresponding point. Computing stereo disparity is simplest when the two image planes are perfectly coplanar and row-aligned. The input parameters of the cvStereoRectify function are the results of camera calibration: the camera intrinsics, the distortion parameters, and the rotation matrix and translation vector between the left and right cameras; the output parameters are the row-alignment rectification rotation matrices between the left and right camera planes, the projection matrices of the left and right cameras, and the reprojection matrix Q. The function cvInitUndistortRectifyMap is then called for the left and right images respectively, returning the mapping matrices needed to rectify the images; finally the function cvRemap is called to obtain the undistorted images.
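In the modern cv2 API the same pipeline looks as follows: a minimal sketch, assuming Ml, Dl, Mr, Dr, R, T come from the calibration step above.

    import cv2

    def rectify_pair(left, right, Ml, Dl, Mr, Dr, R, T, size=(640, 480)):
        # row-alignment rotations, projection matrices, reprojection matrix Q
        R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(Ml, Dl, Mr, Dr, size, R, T)
        map1l, map2l = cv2.initUndistortRectifyMap(Ml, Dl, R1, P1, size, cv2.CV_16SC2)
        map1r, map2r = cv2.initUndistortRectifyMap(Mr, Dr, R2, P2, size, cv2.CV_16SC2)
        left_r = cv2.remap(left, map1l, map2l, cv2.INTER_LINEAR)
        right_r = cv2.remap(right, map1r, map2r, cv2.INTER_LINEAR)
        return left_r, right_r, Q   # Q is reused later by reprojectImageTo3D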
2.3 Stereo matching and ranging
The basic schematic diagram of ranging is shown in Fig. 10; the specific ranging formula is:

Z = f·T / (xl − xr)

In the formula, f is the camera focal length, T is the distance between the two camera centres, xl − xr is the disparity, and Z is the depth information, from which the distance and bearing of the obstacle relative to the robot can be obtained; P is the position of the object under test. Disparity and depth are inversely proportional: when the disparity is close to 0, a small change in disparity leads to a large change in depth, so a camera-based ranging scheme has high precision only for objects that are close.
Semi-global SGBM stereo matching is performed on the left and right images acquired by the rectified binocular camera to find the corresponding coordinates of the same object point in the two images and to obtain the distance between them, the disparity; stereo matching is precisely the matching of identical points of the left-right camera image pair to obtain the disparity map. Block matching is a comparatively outstanding matching algorithm, and OpenCV implements a fast and effective block stereo-matching algorithm, which uses a small window with the 'sum of absolute differences' (SAD) measure to match the identical points between two rectified image pairs. This algorithm can only match strongly textured points between the two images; its key steps are as follows:
(1) first, pre-filtering is performed, normalising image brightness while reinforcing image texture;
(2) then, a matching search is performed by sliding the SAD window along horizontal epipolar lines;
(3) finally, filtering is applied again to remove mismatched points.
In OpenCV, the cvFindStereoCorrespondenceSGBM function can be used to match the undistorted left and right image pair and obtain the disparity map.
If the image point (x, y) and the disparity value d are known, the depth of the object can be calculated using the following formula:
Q[x y d 1]T = [X Y Z W]T    (4-11)
The three-dimensional coordinate of the space point is then (X/W, Y/W, Z/W);
This is implemented with the function cvReprojectImageTo3D; the experimental results are shown in Table 1.
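A cv2 sketch of this matching-plus-reprojection step is given below. The SGBM parameter values are typical choices rather than the patent's, and Q is the reprojection matrix returned by the rectification step above.

    import cv2
    import numpy as np

    def depth_map(left_rect, right_rect, Q):
        sgbm = cv2.StereoSGBM_create(
            minDisparity=0,
            numDisparities=64,          # must be divisible by 16
            blockSize=9,
            P1=8 * 9 * 9,               # smoothness penalties
            P2=32 * 9 * 9,
            uniquenessRatio=10,
            speckleWindowSize=100,
            speckleRange=2)
        # compute() returns fixed-point disparity scaled by 16
        disp = sgbm.compute(left_rect, right_rect).astype(np.float32) / 16.0
        points = cv2.reprojectImageTo3D(disp, Q)   # per-pixel (X/W, Y/W, Z/W)
        return disp, points

    # The inverse relation Z = f*T/d also gives depth directly: with
    # f = 700 px and T = 60 mm (assumed illustrative values), a disparity
    # of d = 35 px gives Z = 700 * 60 / 35 = 1200 mm, i.e. about 1.2 m.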
Table 1
Analysis of the table shows that when the target object to be measured is close to the camera, the precision of the target depth information measured by the system is high, and as the distance grows the measurement precision declines. Absolute accuracy of the distance is not pursued here; the experimental results meet the requirement of broadcasting environmental information to a blind user.
If the selected region is in the left half of the image, the leftmost pixel position of the pixel region carrying the nearest-object information is found, and together with the centre pixel position of the whole image the robot's deviation angle is calculated trigonometrically: the x-direction difference between the position of the object's edge in image pixels and the centre pixel of the image, divided by the number of pixels in the x direction and multiplied by the view angle of the whole image (the field-of-view angle the picture represents), gives the angular offset between the real object and the robot. After the angle is obtained it is transmitted to the robot chassis control, and the motors control the swivel angle of the wheels, i.e. the angle by which the blind user should move to the left; likewise, if the selected region is in the right half of the image, it is the angle by which the blind user should move to the right;
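The pixel-to-angle conversion just described reduces to one line. In the sketch below the 640-pixel width matches the patent's images, while the 60-degree horizontal field of view is an assumed illustrative value, not a figure from the patent.

    def bearing_deg(edge_x, img_width=640, fov_deg=60.0):
        """Signed angle of the object edge relative to straight ahead:
        negative suggests moving left, positive moving right."""
        return (edge_x - img_width / 2) / img_width * fov_deg

    print(bearing_deg(160))   # -15.0 degrees: object left of centre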
Step 3) broadcasts, by espeak speech synthesis, the obstacle category information obtained in steps 1) and 2) together with the distance and bearing between the obstacle and the robot. This experiment uses the espeak speech-synthesis tool under Linux to convert the text information from image acquisition into voice information, so the robot can announce the distance between the current obstacle and the robot and the angle it forms with the area directly ahead of the person;
The espeak speech synthesizer is simple, fast and compact; it can read text from standard input, its voice output can be saved as a .WAV file, and it offers a choice of many voice characteristics. It is a simple and fast speech-synthesis tool.
The voice broadcasting system mainly comprises three parts: obstacle type broadcasting, obstacle distance broadcasting and pedestrian avoidance broadcasting.
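Under Linux, such a broadcast can be issued by invoking espeak as a subprocess: a minimal sketch, using espeak's standard -s (speed), -a (amplitude) and -w (write .WAV) options; the message format is illustrative rather than the patent's exact wording.

    import subprocess

    def broadcast(category, distance_m, angle_deg):
        text = f"{category} ahead, {distance_m:.1f} metres, {angle_deg:+.0f} degrees"
        subprocess.run(["espeak", "-s", "150", "-a", "100", text], check=False)

    def broadcast_to_wav(text, path="alert.wav"):
        # save the announcement as a .WAV file instead of playing it
        subprocess.run(["espeak", "-w", path, text], check=False)

    broadcast("chair", 1.2, -15)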
Robot motion control and sensor-signal receiving system
The robot control program runs on a Raspberry Pi and is programmed in Python.
Step 4) detects, with the infrared sensors, on which side of the vehicle an obstacle lies, and controls the vehicle to move in the direction opposite to the side of the vehicle where the obstacle lies.
Fig. 4 is the control method flow chart of the Mecanum-wheeled vehicle.
The Mecanum-wheeled vehicle is controlled to travel in the direction opposite to where the infrared sensors 5 detect an obstacle, as mapped in the list below and sketched in the code that follows it;
That is: if there is a signal behind the Mecanum-wheeled vehicle (i.e. an obstacle is behind the vehicle), the vehicle is controlled to travel forward;
if there is a signal in front of the Mecanum-wheeled vehicle, the vehicle is controlled to travel backward;
if there is a signal to the left of the Mecanum-wheeled vehicle, the vehicle is controlled to travel to the right;
if there is a signal to the right of the Mecanum-wheeled vehicle, the vehicle is controlled to travel to the left;
if there are signals both in front of and behind the Mecanum-wheeled vehicle, the vehicle is controlled to stop;
if there are signals behind and to the left of the Mecanum-wheeled vehicle, the vehicle is controlled to travel forward and to the right;
if there are signals behind and to the right of the Mecanum-wheeled vehicle, the vehicle is controlled to travel forward and to the left;
if there are signals in front of and to the left of the Mecanum-wheeled vehicle, the vehicle is controlled to travel backward and to the right;
if there are signals in front of and to the right of the Mecanum-wheeled vehicle, the vehicle is controlled to travel backward and to the left;
if there is a signal at the left front of the Mecanum-wheeled vehicle, the vehicle is controlled to travel backward and to the right;
if there is a signal at the right front of the Mecanum-wheeled vehicle, the vehicle is controlled to travel backward and to the left;
if there is a signal at the left rear of the Mecanum-wheeled vehicle, the vehicle is controlled to travel forward and to the right;
if there is a signal at the right rear of the Mecanum-wheeled vehicle, the vehicle is controlled to travel forward and to the left;
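A Python sketch of this signal-to-motion table follows, together with the conventional Mecanum wheel-speed mixing (an assumed standard formula; the patent does not give one). GPIO and L298N output is omitted.

    # Map eight IR signals to a motion opposite the obstacle, then mix
    # into wheel speeds (vx positive to the right, vy positive forward).

    DIRS = ("front", "rear", "left", "right",
            "front_left", "front_right", "rear_left", "rear_right")

    def avoidance_velocity(sig):
        """sig: dict direction -> bool. Returns (vx, vy); (0, 0) stops."""
        if sig.get("front") and sig.get("rear"):
            return 0.0, 0.0                      # boxed in front and rear: stop
        vx = vy = 0.0
        if sig.get("front") or sig.get("front_left") or sig.get("front_right"):
            vy -= 1.0                            # obstacle ahead: back up
        if sig.get("rear") or sig.get("rear_left") or sig.get("rear_right"):
            vy += 1.0                            # obstacle behind: go forward
        if sig.get("left") or sig.get("front_left") or sig.get("rear_left"):
            vx += 1.0                            # obstacle left: go right
        if sig.get("right") or sig.get("front_right") or sig.get("rear_right"):
            vx -= 1.0                            # obstacle right: go left
        return vx, vy

    def mecanum_wheel_speeds(vx, vy, omega=0.0):
        """Front-left, front-right, rear-left, rear-right wheel speeds."""
        return (vy + vx + omega, vy - vx - omega,
                vy - vx + omega, vy + vx - omega)

    sig = {d: False for d in DIRS}
    sig["rear"], sig["left"] = True, True        # -> forward and to the right
    print(mecanum_wheel_speeds(*avoidance_velocity(sig)))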
Analysis of hardware-system experimental results
The design can control the movement speed of the robot's axes and movement in eight directions, and can also realise automatic obstacle avoidance. In this design, obstacle avoidance is realised mainly with 8 near-infrared sensors, mounted respectively in the 8 directions front, rear, left, right, left-front, right-front, left-rear and right-rear, for detecting whether an object ahead is less than 20 cm from the robot.
The intelligent voice blind-guidance robot of the present invention realises accurate, intelligent voice broadcasting, prompting the blind user with the categories of objects ahead and giving the distance of the nearest object, so that the user gains an accurate understanding of the environment and an improved quality of life. Motion control technology, sensor technology and image-processing technology are fused here to complete the design of the whole intelligent voice blind-guidance robot system, including hardware circuit design, the writing of the vehicle control routines and the writing of the image-recognition code. Judging from the experimental results, this guide robot basically realises obstacle avoidance, accurate object recognition, accurate and reasonable ranging, voice broadcasting and other functions.

Claims (10)

1. An intelligent voice blind-guidance robot control method, characterized by comprising the following steps:
step 1) acquiring images in front of the robot with a binocular camera, and classifying and identifying obstacles in front of the robot in the acquired images with the yolo algorithm to obtain obstacle category information;
step 2) performing binocular calibration on the binocular camera, and obtaining the distance and bearing between an obstacle and the robot with the calibrated binocular camera;
step 3) broadcasting, by espeak speech synthesis, the obstacle category information obtained in step 1) and the distance and bearing between the obstacle and the robot obtained in step 2);
step 4) detecting, with infrared sensors, on which side of the vehicle an obstacle lies, and controlling the vehicle to move in the direction opposite to the side of the vehicle where the obstacle lies.
2. The intelligent voice blind-guidance robot control method according to claim 1, characterized in that, in step 1), neural-network sample training and testing are performed on the acquired images with the YOLO algorithm, based on the darknet neural network structure.
3. The intelligent voice blind-guidance robot control method according to claim 2, characterized in that neural-network sample training is performed on the acquired images with the YOLO algorithm: the image acquired by the binocular camera is divided into an S*S grid of cells, and each cell is responsible for detecting whether the centre of an object to be detected falls within it; each cell predicts B detection boxes, and each box contains 5 values: x, y, w, h, Cobj, respectively representing the abscissa of the box centre, the ordinate of the box centre, the width relative to the entire image, the height relative to the entire image and the confidence of the bounding box, the confidence being shown in formulas (1) and (2):

Cobj = Pr(Object) × IOU(truth, pred)    (1)
IOU(truth, pred) = area(box_pred ∩ box_truth) / area(box_pred ∪ box_truth)    (2)

in formula (1), Pr(Object) is the probability that an object is present in the cell, 1 if present and 0 otherwise, and IOU is the intersection-over-union of the reference box and the detection box, i.e. the ratio of the area of their intersection to the area of their union;
the YOLO algorithm extracts features in the convolutional layers and performs target prediction in the fully connected layer; when Pr(Object)=1, the confidence of a given class in the whole picture is shown in formula (3):

Pr(Class_i|Object) × Pr(Object) × IOU(truth, pred) = Pr(Class_i) × IOU(truth, pred)    (3)

in the formula, Pr(Class_i|Object) is the class-conditional probability of the target to be detected and Pr(Class_i) is the probability of predicting the given class; a detection-box threshold is set, bounding boxes whose score is below the threshold are filtered out, and non-maximum suppression is applied to the remaining boxes, yielding the obstacles in the image and their corresponding confidences, i.e. the obstacle category information.
4. The intelligent voice blind-guidance robot control method according to claim 1, characterized in that, in step 2), obtaining the distance and bearing between the obstacle and the robot with the calibrated binocular camera specifically comprises the following steps: the binocular camera is calibrated, and the relationship between an image point p = [x y]T and a space point P = [X Y Z]T is defined as: p = sHP
in the formula, s is a scale factor and H is given by the matrix product H = AW; A is the camera intrinsic matrix, (fx, fy) are the scale factors of the image along the x-axis and y-axis, and (cx, cy) are the coordinates of the camera's principal point; W = [R T] is the rotation-translation transformation between the camera plane and the target object plane, i.e. the camera extrinsic matrix.
5. The intelligent voice blind-guidance robot control method according to claim 4, characterized in that stereo rectification is performed on the calibrated binocular camera: with the position of a point after correction expressed as (xp, yp) and the position after distortion expressed as (xd, yd), we have:

xd = xp(1 + k1·r² + k2·r⁴ + k3·r⁶) + 2p1·xp·yp + p2(r² + 2xp²)
yd = yp(1 + k1·r² + k2·r⁴ + k3·r⁶) + p1(r² + 2yp²) + 2p2·xp·yp,  where r² = xp² + yp²

in the formulas, k1, k2, k3 are the radial distortion parameters and p1, p2 are the tangential distortion parameters.
6. The intelligent voice blind-guidance robot control method according to claim 4, characterized in that the specific ranging formula is:

Z = f·T / (xl − xr)

in the formula, f is the camera focal length, T is the distance between the two camera centres, xl − xr is the disparity, and Z is the depth information, from which the distance and bearing of the obstacle relative to the robot can be obtained; P is the position of the object to be measured.
7. The intelligent voice blind-guidance robot control method according to claim 6, characterized in that semi-global SGBM stereo matching is performed on the left and right images acquired by the rectified binocular camera to find the corresponding coordinates of the same object point in the left and right images and the disparity value d of the two corresponding coordinates; the depth of the object is calculated by the following formula:
Q[x y d 1]T = [X Y Z W]T    (4-11)
the three-dimensional coordinate of the space point is then (X/W, Y/W, Z/W).
8. The intelligent voice blind-guidance robot control method according to claim 7, characterized in that the x-direction difference between the position of the object's edge in image pixels and the centre pixel of the image, divided by the number of pixels in the x direction and multiplied by the view angle of the entire image, gives the angular offset between the real object and the robot, the view angle of the entire image being the field-of-view angle the image represents.
9. A blind-guidance robot system based on the method of claim 1, characterized by comprising a Mecanum-wheeled vehicle (2) and, mounted on the Mecanum-wheeled vehicle (2), a control system (1), a binocular camera (4), a voice announcer (3) and infrared sensors (5);
the control system comprises a main control unit and, connected to the main control unit, an infrared ranging unit, a travel control unit and a speech control unit; the main control unit loads the program of the above method;
the infrared ranging unit obtains obstacle information around the Mecanum-wheeled vehicle (2) through the infrared sensors (5) and transmits the obtained obstacle information to the main control unit;
the main control unit generates avoidance route information from the obtained obstacle information and transmits the avoidance route information to the travel control unit, achieving obstacle avoidance for the Mecanum-wheeled vehicle; the travel control unit receives control signals from the main control unit and drives the motors of the Mecanum-wheeled vehicle;
the binocular camera (4) acquires images in front of the Mecanum-wheeled vehicle and transmits the acquired images to the main control unit; the main control unit performs recognition and ranging on the acquired images, then broadcasts the recognition information and ranging information through the speech control unit and the voice announcer (3).
10. The blind-guidance robot system according to claim 9, characterized in that the travel control unit uses an L298N motor driver module, and the Mecanum wheeled chassis is built with four Mecanum wheels each controlled by a DC motor.
CN201910355337.7A 2019-04-29 2019-04-29 A kind of intelligent sound blind-guidance robot control method and control system Pending CN110109457A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910355337.7A CN110109457A (en) 2019-04-29 2019-04-29 A kind of intelligent sound blind-guidance robot control method and control system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910355337.7A CN110109457A (en) 2019-04-29 2019-04-29 A kind of intelligent sound blind-guidance robot control method and control system

Publications (1)

Publication Number Publication Date
CN110109457A true CN110109457A (en) 2019-08-09

Family

ID=67487527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910355337.7A Pending CN110109457A (en) 2019-04-29 2019-04-29 A kind of intelligent sound blind-guidance robot control method and control system

Country Status (1)

Country Link
CN (1) CN110109457A (en)



Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120087506A (en) * 2011-01-28 2012-08-07 충북대학교 산학협력단 Apparatus and method for estimating location in the mobile robot
CN103126862A (en) * 2013-02-04 2013-06-05 江苏科技大学 Outdoor blind guiding robot based on global position system (GPS), general packet radio service (GPRS) and radio frequency identification devices (RFID) and navigational positioning method
CN103231708A (en) * 2013-04-12 2013-08-07 安徽工业大学 Intelligent vehicle obstacle avoiding method based on binocular vision
CN103955920A (en) * 2014-04-14 2014-07-30 桂林电子科技大学 Binocular vision obstacle detection method based on three-dimensional point cloud segmentation
CN106389078A (en) * 2016-11-24 2017-02-15 贵州大学 Intelligent blind guiding glass system and blind guiding method thereof
CN107390703A (en) * 2017-09-12 2017-11-24 北京创享高科科技有限公司 A kind of intelligent blind-guidance robot and its blind-guiding method
CN109034018A (en) * 2018-07-12 2018-12-18 北京航空航天大学 A kind of low latitude small drone method for barrier perception based on binocular vision
CN109035322A (en) * 2018-07-17 2018-12-18 重庆大学 A kind of detection of obstacles and recognition methods based on binocular vision

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
REDMON, J. ET AL.: "You Only Look Once: Unified, Real-Time Object Detection", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110658916A (en) * 2019-09-18 2020-01-07 中国人民解放军海军航空大学 Target tracking method and system
CN110733039A (en) * 2019-10-10 2020-01-31 南京驭行科技有限公司 Automatic robot driving method based on VFH + and vision auxiliary decision
CN111599129A (en) * 2020-06-17 2020-08-28 东北林业大学 Forest fire identification and ranging method based on three-camera
CN111476225A (en) * 2020-06-28 2020-07-31 平安国际智慧城市科技股份有限公司 In-vehicle human face identification method, device, equipment and medium based on artificial intelligence
CN111476225B (en) * 2020-06-28 2020-10-02 平安国际智慧城市科技股份有限公司 In-vehicle human face identification method, device, equipment and medium based on artificial intelligence
CN111897349A (en) * 2020-07-08 2020-11-06 南京工程学院 Underwater robot autonomous obstacle avoidance method based on binocular vision
CN111986382A (en) * 2020-08-31 2020-11-24 北京京东乾石科技有限公司 Automatic vending system and control method thereof
CN112528959B (en) * 2020-12-29 2024-06-07 上海同温层信息科技有限公司 Obstacle recognition method for cleaning robot
CN112528959A (en) * 2020-12-29 2021-03-19 上海同温层智能科技有限公司 Obstacle recognition method for cleaning robot
CN112932910A (en) * 2021-01-25 2021-06-11 杭州易享优智能科技有限公司 Wearable intelligent sensing blind guiding system
CN112907625A (en) * 2021-02-05 2021-06-04 齐鲁工业大学 Target following method and system applied to four-footed bionic robot
CN113208882A (en) * 2021-03-16 2021-08-06 宁波职业技术学院 Blind person intelligent obstacle avoidance method and system based on deep learning
CN113095230A (en) * 2021-04-14 2021-07-09 北京深睿博联科技有限责任公司 Method and device for helping blind person to search for articles
CN113298029A (en) * 2021-06-15 2021-08-24 广东工业大学 Blind person walking assisting method and system based on deep learning target detection
CN113608532A (en) * 2021-07-29 2021-11-05 深圳市眼科医院 Automatic obstacle avoidance system for blind people going out and scooter thereof
CN113899382A (en) * 2021-09-30 2022-01-07 紫清智行科技(北京)有限公司 Blind guiding vehicle path generation method and device based on navigation system
CN113899382B (en) * 2021-09-30 2022-05-24 紫清智行科技(北京)有限公司 Blind guiding vehicle path generation method and device based on navigation system
CN114046796A (en) * 2021-11-04 2022-02-15 南京理工大学 Intelligent wheelchair autonomous walking algorithm, device and medium
CN114120236A (en) * 2021-11-30 2022-03-01 中航空管***装备有限公司 Method for identifying and positioning low-altitude target

Similar Documents

Publication Publication Date Title
CN110109457A (en) A kind of intelligent sound blind-guidance robot control method and control system
CN108229366B (en) Deep learning vehicle-mounted obstacle detection method based on radar and image data fusion
US20220043449A1 (en) Multi-channel sensor simulation for autonomous control systems
CN104848851B (en) Intelligent Mobile Robot and its method based on Fusion composition
CN111126399B (en) Image detection method, device and equipment and readable storage medium
CN104956400B (en) Mobile object identifier, vehicle and the method for identifying mobile object
CN108596058A (en) Running disorder object distance measuring method based on computer vision
EP2940656B1 (en) Vehicle periphery monitoring device
CN111815717B (en) Multi-sensor fusion external parameter combination semi-autonomous calibration method
CN113657224A (en) Method, device and equipment for determining object state in vehicle-road cooperation
CN107397658B (en) Multi-scale full-convolution network and visual blind guiding method and device
CN109784204A (en) A kind of main carpopodium identification of stacking string class fruit for parallel robot and extracting method
CN115032651A (en) Target detection method based on fusion of laser radar and machine vision
CN109871776A (en) The method for early warning that round-the-clock lane line deviates
CN106162144A (en) A kind of visual pattern processing equipment, system and intelligent machine for overnight sight
CN111027415B (en) Vehicle detection method based on polarization image
CN111461048B (en) Vision-based parking lot drivable area detection and local map construction method
CN113160327A (en) Method and system for realizing point cloud completion
CN113313047B (en) Lane line detection method and system based on lane structure prior
CN112991433B (en) Truck overall dimension measuring method based on binocular depth perception and vehicle position
CN110378210A (en) A kind of vehicle and car plate detection based on lightweight YOLOv3 and long short focus merge distance measuring method
CN110717445A (en) Front vehicle distance tracking system and method for automatic driving
CN109410234A (en) A kind of control method and control system based on binocular vision avoidance
CN114332494A (en) Three-dimensional target detection and identification method based on multi-source fusion under vehicle-road cooperation scene
CN116486287A (en) Target detection method and system based on environment self-adaptive robot vision system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20190809