WO2018042208A1

WO2018042208A1 - Street asset mapping

Info

Publication number: WO2018042208A1
Application number: PCT/GB2017/052576
Authority: WO
Inventors: James Kent; Angel Bueno RODRIGUEZ
Original assignee: Xihelm Limited
Priority date: 2016-09-05
Filing date: 2017-09-05
Publication date: 2018-03-08
Also published as: GB2556328A; GB201615044D0

Abstract

A method of classifying physical elements in a plurality of images for producing a map of physical objects, comprising the steps of: receiving a plurality of input data in respect of a predetermined range of physical locations; and searching the plurality of input data using a learned algorithm to identify physical elements common to the plurality of input data and a likely respective location of said identified physical elements.

Description

STREET ASSET MAPPING

Field

The present invention relates to street asset surveying. More particularly, the present invention relates to a street asset surveying method and tool for infrastructure companies utilising machine learning and computer vision techniques.

Background

Highway management organisations need reliable data on the condition and location of assets for which they are responsible, in order to properly maintain them. Examples of such organisations include utility companies, highway contractors and local authorities. The assets, or networks of assets, for which they are responsible are varied, and can include, for example, manhole covers, drain covers, pavement edges, trees or street lighting. A further example of an asset is road and pavement surfaces.

Conventional asset surveys, such as PAS128 surveys using ground penetrating radar, are generally very expensive, with a high cost per unit area surveyed. Utility companies, local authorities and highway contractors need to perform cost effective asset monitoring at the scale of their network. Therefore, these organisations often rely upon individual reports and updates, which are infrequent, inaccurate and subjective, and often result in fines for contractors.

Alternatively, they use occasional sample-based surveys, which can result in potential lawsuits when assets have not been properly managed. The vast majority of the highway system (245,000 miles in the UK) does not get frequent, effective and complete inspections. Building up accurate asset location, identification and condition monitoring is therefore very important in order to achieve efficient highway management. It is estimated that full conventional mapping of the UK using PAS128 would take approximately 10,500 man years, which is impractical and disruptive to traffic, as well as having a high cost.

Current methods also often require roads to be closed, further adding to the cost.

Summary of Invention

According to a first aspect, there is provided a method of classifying physical elements in a plurality of images for producing a map of physical objects, comprising the steps of: receiving a plurality of input data in respect of a predetermined range of physical locations; and searching the plurality of input data using a learned algorithm to identify physical elements common to the plurality of input data and a likely respective location of said identified physical elements.

Using a learned algorithm to search input data for physical objects and/or assets can lead to an increased efficiency in performing asset surveys, and/or improved accuracy in such asset surveys and/or reduced survey time when compared with traditional asset survey techniques.

Optionally, the method further comprises the step of dividing the predetermined range of physical locations into a grid of regular sized areas, optionally having grid points in the grid assigned Cartesian co-ordinates and/or GPS co-ordinates.

Dividing a physical location into a grid for searching purposes allows the search to be performed more efficiently. Assigning co-ordinates, such as Cartesian and/or GPS co-ordinates to the grid can allow for more accurate mapping of assets.

Optionally, the method further comprises receiving existing map data for at least the predetermined range of physical locations and overlaying the existing map data with the predetermined range of physical locations.

Overlaying the grid onto a pre-existing map can allow for a more efficient search to be performed by indicating areas of likely interest.

Optionally, the method further comprises the step of determining one or more areas within the predetermined range of physical locations that are not able to be searched for physical elements, optionally wherein such areas not able to be searched include any of: private property; housing; and buildings.

Optionally, the step of searching the plurality of input data using a learned algorithm to identify physical elements common to the plurality of input data and a respective location of said identified physical elements is not performed for areas not able to be searched.

By determining areas that are unable to be searched before beginning the asset search, these areas can be eliminated from the search, leading to a reduction in the search time.

Optionally, the plurality of input data includes any of: street view images; photographs; utility maps; utility databases; third party data; or academic or research data.

Using these data sources can provide accurate input data for the learned algorithms to search in order to identify asset locations accurately.

Optionally, the method further comprises the step of performing image processing on at least one of the plurality of input data, optionally wherein the image processing includes any of: edge enhancement; mean shift segmentation; morphological operations; chain approximation and Canny edge detection.

Enhancing input data using image processing prior to searching can allow for more accurate identification of assets by the learned algorithms when compared with unenhanced/raw input data.

Optionally, the learned algorithm includes any of: hybrid neural networks; convolutional neural networks; or recurrent neural networks.

These types of algorithm can provide fast, accurate and trainable asset recognition.

Optionally, the step of searching the plurality of input data using a learned algorithm to identify physical elements common to the plurality of input data and a respective location of said identified physical elements is performed as a search loop.

Performing the search in a loop allows the search to be iterated using multiple sets of input data, potentially increasing the accuracy of the search.

Optionally, the method further comprises the step of performing location corrections for the input data.

Previously identified assets may be incorrectly identified or located in the original input data, so may require correcting with the potentially more accurate asset locations identified by the learned algorithms.

Optionally, the method further comprises the step of generating associated probability and/or confidence data for each identified physical element; optionally including any of a likely identity and/or condition of each identified physical element.

Confidence scores can be used to indicate the likelihood of an asset being present, thereby providing additional data to a user.

Optionally, the physical element is an infrastructure element.

Infrastructure elements can include, but are not limited to: manhole covers; drain covers; lamp posts; pavement edges; trees; and/or road surfaces. Maintaining an accurate and up-to-date map of infrastructure asset locations and their associated conditions can allow organisations, such as highway management organisations, to more accurately tender and mobilise projects.

According to another aspect, there is provided a method substantially as hereinbefore described in relation to the Figures.

According to a further aspect, there is provided an apparatus or system substantially as hereinbefore described in relation to the Figures.

Brief Description of Drawings

Embodiments will now be described, by way of example only and with reference to the accompanying drawings having like-reference numerals, in which:

Figure 1 illustrates a flow diagram of an example of the survey method;

Figure 2A illustrates an example input image after mapping to a grid;

Figure 2B illustrates example asset location probabilities assigned to the grid of Figure 2A;

Figure 3A illustrates an example of a grid overlay for an existing map;

Figure 3B illustrates an example of identified viable search areas in the map of Figure 3A;

Figure 4 illustrates an example of the use of a hybrid model to identify assets in a set of images; and

Figure 5 illustrates an example of a training process for a hybrid model.

Specific Description

Referring to Figure 1, an exemplary embodiment of the asset surveying method will now be described. Figure 1 illustrates a flow diagram of an example of the survey method. At the first stage 2 a grid overlay is constructed at a required granularity, for example a 1 m by 1 m grid. The grid points in the grid overlay are assigned Cartesian co-ordinates and GPS co-ordinates corresponding to the region being surveyed. Given a region of interest and the starting/end point co-ordinates, a grid is generated with the desired granularity. For example, for a small area, a 364 x 151 size grid with 1 x 1 m granularity can be generated.
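By way of non-limiting illustration, the grid construction described above might be sketched in Python as follows. The function name `build_grid`, the origin co-ordinates and the flat-earth conversion from metres to latitude/longitude are assumptions made for this sketch only; the description does not specify how the GPS co-ordinates are derived, nor whether 364 refers to columns or rows (columns are assumed here).

```python
import math

EARTH_RADIUS_M = 6_378_137.0  # WGS84 equatorial radius in metres


def build_grid(origin_lat, origin_lon, width_m, height_m, cell_m=1.0):
    """Return (rows, cols, points); each point carries Cartesian and GPS co-ordinates.

    Assumption: a local flat-earth approximation is used to convert metre
    offsets to degrees, which is only reasonable for small survey regions.
    """
    cols = int(round(width_m / cell_m))
    rows = int(round(height_m / cell_m))
    lat0 = math.radians(origin_lat)
    points = []
    for row in range(rows):
        for col in range(cols):
            x = col * cell_m  # metres east of the origin
            y = row * cell_m  # metres north of the origin
            lat = origin_lat + math.degrees(y / EARTH_RADIUS_M)
            lon = origin_lon + math.degrees(x / (EARTH_RADIUS_M * math.cos(lat0)))
            points.append({"row": row, "col": col, "x": x, "y": y, "lat": lat, "lon": lon})
    return rows, cols, points


# Hypothetical origin; a 364 m by 151 m region at 1 m granularity.
rows, cols, points = build_grid(51.5, -0.12, width_m=364, height_m=151)
```

A production implementation would more likely use a proper map projection for the conversion; the approximation above is chosen only to keep the sketch self-contained.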

If existing maps of the survey region are available, the constructed grid overlay is virtually overlaid on top of the existing map 4. The GPS coordinate data assigned to the grid overlay is used to correctly align the grid with the existing map, creating a gridded map.

The areas of the gridded map are then classified 6 using an algorithm to determine which area class they belong to and whether that area is a viable search area or not. For example, roads and public gardens can be classified as searchable areas, while private property and housing can be classified as non-searchable.
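As an illustration only, the outcome of this classification can be represented as a list of searchable grid squares. In the following Python sketch the class labels, the set `SEARCHABLE_CLASSES` and the function name are hypothetical, and the algorithm that assigns a class to each area is not shown.

```python
# Assumed class labels; the description names roads and public gardens as
# searchable, and private property and housing as non-searchable.
SEARCHABLE_CLASSES = {"road", "public_garden"}


def searchable_squares(class_grid):
    """Return (row, col) indices of the grid squares whose class is searchable."""
    return [
        (r, c)
        for r, row in enumerate(class_grid)
        for c, label in enumerate(row)
        if label in SEARCHABLE_CLASSES
    ]


class_grid = [
    ["housing", "road"],
    ["public_garden", "private_property"],
]
squares = searchable_squares(class_grid)
```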

Once the searchable areas have been classified, the grid squares are searched using known data sources 8, such as images of the search area. If no existing map was available, then the method proceeds to this step from grid construction without creating a gridded map. All searchable grid squares are searched using the available data sources.

During the search, imagery, which can typically be street view imagery, for the search area is retrieved. The retrieved images can undergo automated image processing, such as edge enhancement, mean shift segmentation, morphological operations, chain approximation and Canny edge detection, prior to being used. A search is then performed on the grid to detect assets using trained computer vision and machine learning algorithms 12, for example using hybrid neural networks coupled with a region of interest method. The search can, in some embodiments, be assisted by underlying utility maps or utility databases. The search is performed in a search loop.

In some embodiments, the search is performed by mapping 10 portions of the retrieved images to the grid, and then identifying features in the resulting gridded image using computer vision and machine learning algorithms 12. During the image mapping 10, location corrections can be performed, for example those resulting from GPS inaccuracy problems like, for example, urban canyoning. The algorithms output probabilities and/or confidence 14 of the feature in the grid square being an asset, and the likely identity and condition of the asset. The condition and identity of the asset can also have a probability and/or confidence 14 score associated with them. This process is repeated until all the retrieved images/views have been mapped to the grid and analysed by the computer vision and machine learning algorithms, with the probabilities assigned to the grid being combined into a total score for each grid square. If the probability of an asset being present in a grid square is above a pre-determined threshold 16, the asset is stored in a database along with its location 18. Optionally, the asset type and condition are also stored. If assets had previously been located at a wrong position in an underlying map, then the position can be updated and the error flagged.
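The combination and thresholding steps described above might, purely as a sketch, look as follows in Python. The description does not state how per-image probabilities are combined into a total score, so averaging is an assumption made here; the function names and the threshold value of 0.5 are likewise hypothetical.

```python
from collections import defaultdict


def combine_scores(observations):
    """Average the per-image probabilities recorded for each grid square.

    Assumption: the mean is used as the combination rule; the description
    leaves the rule unspecified.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for square, probability in observations:
        sums[square] += probability
        counts[square] += 1
    return {square: sums[square] / counts[square] for square in sums}


def detect_assets(observations, threshold=0.5):
    """Keep only the grid squares whose combined score exceeds the threshold."""
    totals = combine_scores(observations)
    return {square: score for square, score in totals.items() if score > threshold}


# Each observation is ((row, col), probability) from one pass of the search loop.
observations = [((0, 0), 0.9), ((0, 0), 0.7), ((0, 1), 0.3), ((0, 1), 0.2), ((1, 1), 0.05)]
detected = detect_assets(observations)
```

The squares returned by `detect_assets` are those that would be written to the database together with their locations.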

In other embodiments, the retrieved images/views are analysed directly using the computer vision and machine learning algorithms 12. Features in the images are assigned a probability/confidence score 14 of being an asset, an asset type and an asset condition. The condition and identity of the asset can also have a probability/confidence score associated with them. The identified feature is then mapped onto the grid, along with the associated asset, identity and condition probabilities. This process is repeated for all of the retrieved images, and the results combined into a total score for each identified feature. If the probability of a feature being an asset is above a predetermined threshold 16, the asset and its location, optionally along with its identity and its condition, are stored in a database 18. If assets had previously been located at a wrong position in an underlying map, then the position can be updated and the error flagged.

In this way, an accurate record of asset locations, identities and conditions can be built up.

Referring now to Figure 2A, an example input image after mapping to a grid will be described.

Each grid square of the grid 20 is assigned a portion of the retrieved input image. Images are mapped dynamically to the grid 20 overlay to populate the grid squares. Cartesian co-ordinates on the grid are converted to a GPS WGS84 datum to find the right image portion, which is then cropped and/or tiled accordingly to fit the relevant grid squares. As it is unlikely that a single image will cover the whole of the area to be searched, each pass of the search loop may only map an image to a sub-set of the grid. In the example shown, the image has been mapped to a 3 by 3 sub-set of a larger grid (not shown). The example image is of a road surface 22, with a drain cover 24 straddling the boundary between grid square one 26 and grid square two 28.
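A minimal Python sketch of the tiling step is given below, assuming the image has already been cropped to the area covered by the grid sub-set. The image is represented as a plain list of pixel rows and the function name `tile_image` is hypothetical; the conversion from grid co-ordinates to the WGS84 datum used to find the right image portion is not shown.

```python
def tile_image(image, grid_rows, grid_cols):
    """Split a 2-D image (list of pixel rows) into grid_rows x grid_cols tiles.

    Pixels left over when the image size is not an exact multiple of the
    grid size are discarded in this sketch.
    """
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // grid_rows, width // grid_cols
    tiles = {}
    for r in range(grid_rows):
        for c in range(grid_cols):
            rows = image[r * tile_h:(r + 1) * tile_h]
            tiles[(r, c)] = [row[c * tile_w:(c + 1) * tile_w] for row in rows]
    return tiles


# A 6 x 9 test image whose pixel values encode their own position.
image = [[r * 9 + c for c in range(9)] for r in range(6)]
tiles = tile_image(image, grid_rows=3, grid_cols=3)
```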

Referring now to Figure 2B, an illustration of example asset location probabilities assigned to the grid of Figure 2A will now be described.

The gridded image of Figure 2A is used as an input to the computer vision and machine learning algorithms. These assign a probability of an asset being present to each of the grid squares. In this example, grid square one 26 has been assigned a high probability of an asset being present, due to the image of the drain cover 24 lying mostly within that square. There is a lower calculated probability of an asset being present in the second square 28 due to the image of the drain cover only partially overlapping the second square. The other squares have been assigned much lower probabilities as features present there are unlikely to be assets. As more images are analysed, the probabilities can be updated.

The probabilities for a particular grid square are taken from the last layer of the AI or machine learning model, prior to any softmax regression. Given a number of potential asset types, the highest outputted probability of a particular asset type being present is mapped to the associated grid square. The probabilities are mapped to the type of assets using the last layer of the AI or machine learning model. For example, in an AI or machine learning model with 10 different asset types, the probability of a potential asset in the grid being each of the asset types can be arranged in an array:

[Type1, Type2, Type3, Type4, Type5, Type6, Type7, Type8, Type9, Type10]

A probability is assigned to each type, with the highest probability being the one output to the grid. Note that the sum of all the probabilities is equal to one.

One of the types can correspond to the case where no asset is present. However, it can be risky to use a prediction obtained from the model when the prediction is "no-asset", as an asset, such as a manhole cover for example, may be partially occluded and therefore the AI or machine learning model may take it as part of the background. To overcome this limitation, the model can be forced to check the "no-asset" predictions using the following technique. Take, for example, Type10 as the output for "no-asset". Referring now to Figure 2B, for the grid square 3, the array of probabilities could be for example: a = [0.01, 0.01, 0.01, 0.008, 0.002, 0.01, 0.01, 0.01, 0.93]

Type10 having a value of 0.93 is taken to mean that the model is 93% confident that a street asset, such as for example a manhole cover, is not present. By performing the subtraction 100% - 93% = 7%, a value of 7% probability that a street asset is present can be obtained.

For example, in Figure 2B, for the grid square 1 the output of the last layer for that particular image would be, for example: [0.6, 0.1, 0.05, 0.05, 0.05, 0.05, 0.01, 0.01, 0.01, 0.07]. Therefore, the model will select a type 1 asset, with a confidence of 60%.
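The selection of the most probable asset type and the "no-asset" complement described above can be illustrated with the following Python sketch. The function name `interpret` and the constant `NO_ASSET_INDEX` are hypothetical; the array is the ten-value example given for grid square 1, with Type10 taken as "no-asset".

```python
NO_ASSET_INDEX = 9  # zero-based position of Type10, taken here as "no-asset"


def interpret(probabilities):
    """Return (asset type number, its confidence, probability an asset is present)."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    presence = 1.0 - probabilities[NO_ASSET_INDEX]  # complement of "no-asset"
    return best + 1, probabilities[best], presence


grid_square_1 = [0.6, 0.1, 0.05, 0.05, 0.05, 0.05, 0.01, 0.01, 0.01, 0.07]
asset_type, confidence, presence = interpret(grid_square_1)
```

For this array the sketch selects type 1 with a confidence of 0.6, and the complement of the "no-asset" entry gives a probability of about 0.93 that some asset is present.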

Figure 3A illustrates an example of a grid overlay for an existing map. Figure 3B illustrates an example of identified viable search areas in the map of Figure 3A.

In some embodiments, when existing map data 30 is available, the grid overlay 32 is overlaid onto the map 30 to create a gridded map 34, as shown in Figure 3A. The gridded map can be used to identify relevant searchable areas of the grid, such as the roads or pavements. These areas are identified and classified by techniques including, for example, colour shift segmentation and clustering with a K-means clustering technique. This allows the areas on which the search should focus to be identified. In the example shown, the area of the map that has been identified as housing 36 has been classified as a non-searchable area, while the roadway 38 and gardens 40 have been classified as searchable areas.
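As a non-authoritative sketch of the K-means stage only, the following Python code clusters map pixel colours into groups. The colour shift segmentation step is omitted, the seeding strategy and the example colours are assumptions, and the later step of labelling each cluster as a road, garden or housing area is not shown.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm on tuples of numbers.

    Assumption: centroids are seeded from evenly spaced input points so
    that the result is deterministic.
    """
    centroids = [tuple(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centroids


# Hypothetical RGB samples: grey road pixels followed by green garden pixels.
pixels = [(100, 100, 100), (105, 102, 101), (98, 99, 100),
          (30, 160, 40), (35, 155, 45), (28, 158, 42)]
labels, centroids = kmeans(pixels, k=2)
```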

Figure 4 illustrates an example of the use of a hybrid model to identify assets in a set of images.

The hybrid model 42 comprises a Convolutional Neural Network (CNN) 44 and, optionally, a Recurrent Neural Network (RNN) 46, which have been trained on known images 48 to identify features 50 within images. The model is configurable to allow for the type of street asset of interest to be varied, for example by using different combinations of pre-trained CNNs 44 and RNNs 46 that have been trained under different conditions. The CNN 44 part of the hybrid model is used to extract visual features from the images, and the RNN 46 is used as a spatial classifier. Both parts of the hybrid model 42 can be trained jointly. The RNN 46 can form the last layer of the model, with a time step of one.

In some embodiments, the hybrid model only comprises a CNN 44.

Figure 5 illustrates an example of a training process for a hybrid model.

The first step is Data Importing 52. Data is read from a database comprising images and/or maps. If necessary, street-view images are obtained via an API.

Following the importation of data, the data is processed into a usable format in the Data Processing 54 step. All the data is gathered, images are resized as necessary and inserted into the grids for subsequent processing by the model. At this stage, third party 56 data can also be imported.

The generated gridded images then undergo Feature Extraction 58. Specific computer vision algorithms are used here to extract features for the AI or machine learning model. If needed, regions of interest are computed per image in order to determine if any clear asset is in the image, therefore potentially reducing subsequent computational operations.

Once the features have been extracted, the images are used for Model Training 60. The hybrid model, comprising a CNN 44 and, optionally, an RNN 46, is trained to discriminate assets within the images using deep learning techniques. Examples of such techniques include: stochastic gradient descent (SGD) with momentum; batch normalization; and/or early stopping criteria.
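To illustrate two of the named techniques, SGD with momentum and early stopping, the following Python sketch fits a toy one-variable linear model rather than a neural network. The model, data, hyperparameter values and function name are all assumptions chosen to keep the example self-contained; batch normalization is not shown.

```python
import random


def train(train_set, val_set, lr=0.05, momentum=0.9, patience=20, max_epochs=500, seed=0):
    """Fit y = w*x + b by per-sample SGD with momentum.

    Training stops early once the validation loss has failed to improve
    for `patience` consecutive epochs; the best parameters seen are returned.
    """
    rng = random.Random(seed)
    w = b = 0.0
    vel_w = vel_b = 0.0  # momentum (velocity) terms
    best_loss, best_w, best_b = float("inf"), w, b
    stale_epochs = 0
    for _ in range(max_epochs):
        batch = list(train_set)
        rng.shuffle(batch)
        for x, y in batch:
            error = (w * x + b) - y  # gradient of 0.5 * error**2 w.r.t. the prediction
            vel_w = momentum * vel_w - lr * error * x
            vel_b = momentum * vel_b - lr * error
            w += vel_w
            b += vel_b
        val_loss = sum(((w * x + b) - y) ** 2 for x, y in val_set) / len(val_set)
        if val_loss < best_loss - 1e-12:
            best_loss, best_w, best_b = val_loss, w, b
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break  # early stopping criterion met
    return best_w, best_b, best_loss


# Noise-free toy data drawn from y = 2x + 1.
train_set = [(i / 10, 2 * (i / 10) + 1) for i in range(10)]
val_set = [(i / 10 + 0.05, 2 * (i / 10 + 0.05) + 1) for i in range(10)]
w, b, loss = train(train_set, val_set)
```

In a deep learning framework the same ideas would normally be supplied by the framework's optimiser and training loop rather than written by hand.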

The trained models then undergo Evaluation 62. A test set is used to evaluate the performance of the trained hybrid model.
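One way such an evaluation might be carried out, shown here only as a sketch, is to count true and false positives on the test set and derive precision and recall. The description does not name the metrics used, so the choice of precision and recall, the function name and the example labels are assumptions.

```python
def evaluate(predicted, actual):
    """Return confusion counts plus precision and recall for boolean asset labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn, "precision": precision, "recall": recall}


# Hypothetical test set: True means "asset present" in that grid square.
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
report = evaluate(predicted, actual)
```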

When the performance of the trained models satisfies the requirements, the hybrid models are deployed and undergo ongoing evaluation 64. Models are stored and deployed from a database and GIS systems. Further processing/evaluation can be performed afterwards if needed, using different frameworks, for example to update the model using new data or incorporate asset types into the model.

The neural network parameters are highly tailored to ensure sufficient quality levels (true positive, false positive etc.) and throughput in performance terms (e.g. images per second). This high level of tailoring is required as certain assets, for example manhole covers, are not easy to classify, since AI or machine learning models can sometimes be fooled by the background patterns. Therefore, special approaches need to be performed in some cases by applying special computer vision algorithms.

An example of such an approach is to re-apply a quick-segmentation algorithm to erase the background pattern. Following this, a Gaussian filter and a thresholding operation with Canny edge detection are applied, followed by region of interest proposal to help localize the asset of interest. By doing this, the background patterns can be erased and the model assisted in learning only the type of assets required. This makes the problem approachable for this industry.
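A much simplified Python sketch of part of this pipeline is given below: a 3 x 3 Gaussian smoothing pass, a fixed threshold, and a bounding box standing in for region of interest proposal. The quick-segmentation and Canny edge detection steps are deliberately omitted, and the kernel, threshold level and function names are assumptions for illustration.

```python
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # 3 x 3 Gaussian approximation, weights sum to 16


def gaussian_blur(img):
    """Smooth a 2-D greyscale image (list of rows), replicating edge pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc / 16.0
    return out


def threshold(img, level):
    """Return a boolean mask of the pixels at or above the given level."""
    return [[value >= level for value in row] for row in img]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the True pixels, or None if there are none."""
    coords = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return min(ys), min(xs), max(ys), max(xs)


# 9 x 9 test image: a bright 3 x 3 "asset" plus one isolated noise pixel.
image = [[0] * 9 for _ in range(9)]
for y in range(3, 6):
    for x in range(3, 6):
        image[y][x] = 255
image[1][7] = 255  # background speckle that the blur and threshold should suppress

roi = bounding_box(threshold(gaussian_blur(image), level=128))
```

The smoothing spreads the isolated speckle below the threshold while the larger bright block survives, so the bounding box encloses only the block.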

Any system feature as described herein may also be provided as a method feature, and vice versa. As used herein, means plus function features may be expressed alternatively in terms of their corresponding structure.

Any feature in one aspect may be applied to other aspects, in any appropriate combination. In particular, method aspects may be applied to system aspects, and vice versa. Furthermore, any, some and/or all features in one aspect can be applied to any, some and/or all features in any other aspect, in any appropriate combination.

It should also be appreciated that particular combinations of the various features described and defined in any aspects can be implemented and/or supplied and/or used independently.

Claims

CLAIMS:

1. A method of classifying physical elements in a plurality of images for producing a map of physical objects, comprising the steps of:

receiving a plurality of input data in respect of a predetermined range of physical locations; and

searching the plurality of input data using a learned algorithm to identify physical elements common to the plurality of input data and a likely respective location of said identified physical elements.

2. The method of claim 1, further comprising the step of dividing the predetermined range of physical locations into a grid of regular sized areas, optionally having grid points in the grid assigned Cartesian co-ordinates and/or GPS co-ordinates.

3. The method of any previous claim, further comprising receiving existing map data for at least the predetermined range of physical locations and overlaying the existing map data with the predetermined range of physical locations.

4. The method of any previous claim, further comprising the step of determining one or more areas within the predetermined range of physical locations that are not able to be searched for physical elements, optionally wherein such areas not able to be searched include any of: private property; housing; and buildings.

5. The method of claim 4, wherein the step of searching the plurality of input data is performed using a learned algorithm to identify physical elements common to the plurality of input data and a respective location of said identified physical elements is not performed for areas not able to be searched.

6. The method of any previous claim, wherein the plurality of input data includes any of: street view images; photographs; utility maps; utility databases; third party data; or academic or research data.

7. The method of any previous claim, further comprising the step of performing image processing on at least one of the plurality of input data, optionally wherein the image processing includes any of: edge enhancement; mean shift segmentation; morphological operations; chain approximation and Canny edge detection.

8. The method of any previous claim, wherein the learned algorithm includes any of: hybrid neural networks; convolutional neural networks; or recurrent neural networks.

9. The method of any previous claim, wherein the step of searching the plurality of input data using a learned algorithm to identify physical elements common to the plurality of input data and a respective location of said identified physical elements is performed as a search loop.

10. The method of any previous claim, further comprising the step of performing location corrections for the input data.

11. The method of any previous claim, further comprising the step of generating associated probability and/or confidence data for each identified physical element; optionally including any of a likely identity and/or condition of each identified physical element.

12. The method of any preceding claim, wherein the physical element is an infrastructure element.

13. A method substantially as hereinbefore described in relation to the Figures.

14. An apparatus substantially as hereinbefore described in relation to the Figures.