CN113658275A - Visibility value detection method, device, equipment and storage medium - Google Patents

Visibility value detection method, device, equipment and storage medium

Info

Publication number
CN113658275A
CN113658275A (application CN202110970613.8A)
Authority
CN
China
Prior art keywords
sample image
visibility
image
neural network
target
Prior art date
Legal status
Pending
Application number
CN202110970613.8A
Other languages
Chinese (zh)
Inventor
朱铖恺
陈康
谭发兵
武伟
Current Assignee
Shenzhen Sensetime Technology Co Ltd
Original Assignee
Shenzhen Sensetime Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Sensetime Technology Co Ltd filed Critical Shenzhen Sensetime Technology Co Ltd
Priority to CN202110970613.8A
Publication of CN113658275A

Classifications

    • G06T 7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06T 2207/10016: Video; Image sequence
    • G06T 2207/20081: Training; Learning
    • G06T 2207/30232: Surveillance
    • G06T 2207/30236: Traffic on road, railway or crossing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of this specification provide a visibility value detection method, apparatus, device and storage medium. A large number of sample images can be obtained, and the visibility value of each sample image is determined from its depth information and transmittance. A neural network is then trained with the sample images carrying the calibrated visibility value labels, and the trained neural network is used to predict the visibility value of the target area in an image. Because the visibility value of the target area is predicted directly by the trained neural network, no special detection equipment needs to be deployed, and visibility values can be predicted accurately in a wide range of scenes.

Description

Visibility value detection method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a storage medium for detecting visibility values.
Background
Weather such as fog and haze lowers visibility in the environment and affects traffic. Patchy fog, for example, is driven by the microclimate of a local area: fog with much lower visibility appears within a limited stretch of a larger fog, visibility drops suddenly and sharply when it occurs, and it is difficult to forecast in advance. Patchy fog is therefore a serious hazard to road traffic safety and readily causes severe accidents, especially on expressways. Accordingly, it is necessary to provide a scheme for conveniently and accurately determining the visibility value of a target area.
Disclosure of Invention
The present disclosure provides a visibility value detection method, apparatus, device and storage medium.
According to a first aspect of the embodiments of the present disclosure, there is provided a method for detecting visibility values, the method including:
acquiring an image of a target area;
determining the visibility value of the target area through a pre-trained neural network and the image; wherein the neural network is trained based on:
acquiring a sample image;
determining a calibrated visibility value of the sample image based on depth information of the sample image and a transmittance of the sample image, the calibrated visibility value serving as a label of the sample image, wherein the transmittance is related to the depth information of the sample image and to the content of a medium in the air in the sample image;
and training a preset initial neural network with the sample image carrying the calibrated visibility value label to obtain the neural network.
In some embodiments, the sample image includes a target object including a plurality of target points having known relative positions in three-dimensional space, which are coplanar and non-collinear;
the depth information of the sample image is determined based on:
calibrating internal parameters of an image acquisition device for acquiring the sample image and external parameters of the image acquisition device based on the target point;
and determining the depth information of the pixel points in the sample image according to the internal parameters, the external parameters and the height of the image acquisition device.
In some embodiments, the target object comprises an identification line in a road, the target point comprising an end point of the identification line; or
The target object includes a sign in a roadway, and the target point includes a vertex of the sign.
In some embodiments, the determining the calibrated visibility value of the sample image based on the depth information of the sample image and the transmittance of the sample image includes:
determining a target pixel point from the sample image, wherein a three-dimensional point corresponding to the target pixel point is coplanar with the target point, and the depth information of the target pixel point is in a preset depth range;
determining a visibility value corresponding to the target pixel point based on the depth information of the target pixel point and the transmittance of the target pixel point;
and taking the average value of the visibility values corresponding to the target pixel points as the calibrated visibility value of the sample image.
In some embodiments, the transmittance is determined by a dark channel prior method.
In some embodiments, the training of the preset initial neural network through the sample image carrying the calibrated visibility value tag includes:
the initial neural network is trained through a main sample image and an auxiliary sample image, wherein the main sample image is a sample image carrying a label of a calibrated visibility value, the auxiliary sample image is a sample image carrying a label of a calibrated visibility class, and different calibrated visibility classes correspond to different visibility value ranges.
In some embodiments, the training of the initial neural network by the main and auxiliary sample images comprises:
determining a first loss according to the difference between a predicted visibility value corresponding to the main sample image output by the initial neural network and a calibrated visibility value corresponding to the main sample image;
determining a second loss according to a prediction result of the visibility category to which the auxiliary sample image belongs and the calibrated visibility category of the auxiliary sample image, wherein the prediction result is output by the initial neural network;
training the initial neural network based on the first loss and the second loss.
In some embodiments, said training said initial neural network based on said first loss and said second loss comprises:
training the initial neural network based on the target loss determined by the first loss and the second loss; or,
and training the initial neural network based on the first loss after the initial neural network is trained based on the second loss.
In some embodiments, the first loss is determined based on:
determining the first loss according to the mean absolute percentage error between the predicted visibility value corresponding to the main sample image and the calibrated visibility value corresponding to the main sample image; and/or,
the second loss is determined based on:
determining, according to the initial neural network, the prediction probability that the auxiliary sample image belongs to each predefined visibility category;
determining the true probability that the auxiliary sample image belongs to each predefined visibility category according to the calibrated visibility category of the auxiliary sample image;
determining the second loss based on a cross-entropy loss of the prediction probability and the true probability.
In some embodiments, the target area comprises a road area, the image comprises an image captured by an image capturing device disposed on the road, the method further comprises:
determining the hazard level of the foggy weather based on the visibility value of the target area;
and managing and controlling the traffic in the road area according to a management and control strategy corresponding to the hazard level.
In some embodiments, the target area includes a road area, the image includes multiple frames of images acquired by image acquisition devices arranged on the road at different times, and the method includes:
and predicting the change trend of the foggy weather in the road area based on the change trend of the visibility value of the target area in the multi-frame image.
In some embodiments, the target area includes a road area, the image includes an image captured by an image capturing device disposed on a road at a preset time interval, and the method includes:
in response to the visibility value corresponding to the image exceeding a preset threshold, determining that patchy fog occurs in the road area;
determining the frequency at which patchy fog occurs in the road area based on the total number of times patchy fog occurs in the road area within a target time period.
According to a second aspect of embodiments of the present disclosure, there is provided a device for detecting visibility values, the device comprising:
the acquisition module is used for acquiring an image of a target area;
the prediction module is used for determining the visibility value of the target area through a pre-trained neural network and the image; wherein the neural network is trained based on:
acquiring a sample image;
determining a calibrated visibility value of the sample image based on depth information of the sample image and a transmittance of the sample image, the calibrated visibility value serving as a label of the sample image, wherein the transmittance is related to the depth information of the sample image and to the content of a medium in the air in the sample image;
and training a preset initial neural network with the sample image carrying the calibrated visibility value label to obtain the neural network.
According to a third aspect of the embodiments of the present disclosure, an electronic device is provided, where the electronic device includes a processor, a memory, and computer instructions stored in the memory and executable by the processor, and when the processor executes the computer instructions, the method of the first aspect may be implemented.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer instructions which, when executed, implement the method mentioned in the first aspect above.
In the embodiments of the present disclosure, a large number of sample images can be obtained, and each sample image is calibrated in advance with a visibility value determined from its depth information and transmittance. A preset initial neural network is then trained with the sample images carrying the calibrated visibility value labels, and the trained neural network is used to predict the visibility value of the target area in an image. Because the visibility value of the target area is predicted directly by the trained neural network, no special detection equipment needs to be deployed, and visibility values can be predicted accurately in a wide range of scenes.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flowchart of a visibility value detection method according to an embodiment of the present disclosure.
Fig. 2 is a road lane schematic of an embodiment of the present disclosure.
Fig. 3 is a schematic view of an application scenario for determining road visibility values according to an embodiment of the disclosure.
Fig. 4 is a schematic diagram of a neural network structure according to an embodiment of the present disclosure.
Fig. 5 is a schematic logical structure diagram of an apparatus for determining road visibility values according to an embodiment of the disclosure.
Fig. 6 is a schematic diagram of a logical structure of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.
In order to make the technical solutions in the embodiments of the present disclosure better understood and make the above objects, features and advantages of the embodiments of the present disclosure more comprehensible, the technical solutions in the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.
Weather such as fog, haze, sand and dust, and heavy rain lowers visibility in the environment and easily causes traffic accidents. Generally, the concentration of fog (haze or dust) in a target area can be assessed by identifying the visibility value of the target area so as to guide travel. At present, when determining the visibility of a target area, some methods directly measure the visibility value with a visibility meter; this requires deploying special detection equipment in the target area, which is costly and difficult to implement. Other methods determine the visibility value of a target area from an image of the target area. For example, the image of the target area may be input into a pre-trained neural network to predict the visibility value. However, when the sample images used to train the neural network are calibrated, the visibility range of the region in each sample image is usually estimated from human experience, for example less than 50 m, less than 100 m, or greater than 1000 m. As a result, the neural network can only predict a rough visibility range for an image of the target area and cannot output an accurate visibility value. Moreover, a visibility range estimated from human experience is affected by subjective factors, so the prediction result is not accurate.
On this basis, the embodiments of the present disclosure provide a method for determining visibility values. A large number of sample images of low-visibility scenes (scenes with fog, sand and dust, heavy rain, and the like) and high-visibility scenes (scenes without fog, sand and dust, heavy rain, and the like) can be obtained, and the visibility values of the regions included in the sample images are calibrated in advance; during calibration, the visibility value of a sample image can be determined from the depth information of pixels in the sample image and the transmittance of those pixels. A preset initial neural network is then trained with the sample images carrying the calibrated visibility value labels, and the trained neural network is used to predict the visibility value of the target area in an image. Because the visibility value of the target area is predicted directly by the trained neural network, no special detection equipment needs to be deployed, and visibility values can be predicted accurately in a wide range of scenes.
The method provided by the embodiment of the disclosure can be executed by various devices deployed with a pre-trained neural network, for example, a cloud server, a user terminal and the like, and the method can be used for predicting visibility values in climates such as fog, haze, sand and dust, heavy rain and the like.
Specifically, as shown in fig. 1, the method for determining the visibility value in the embodiment of the present disclosure may include the following steps:
s102, acquiring an image of a target area;
First, an image of a target area can be obtained. The target area can be any area whose visibility value needs to be determined. For example, weather such as haze, fog, dust or heavy rain may make the environmental visibility too low, affecting the travel of vehicles, ships, aircraft and the like, or affecting the holding of various activities; the corresponding travel area or venue is then the target area. Accordingly, for roads, activity venues, the sky and other areas whose visibility values need to be measured, images of these areas can be collected and their visibility values predicted from the images.
S104, determining the visibility value of the target area through a pre-trained neural network and the image; wherein the neural network is trained based on:
acquiring a sample image;
determining the calibrated visibility value of the sample image based on the depth information of the sample image and the transmittance of the sample image, and using this value as a label of the sample image, wherein the transmittance is related to the depth information of the sample image and to the content of a medium in the air in the sample image;
and training a preset initial neural network with the sample image carrying the calibrated visibility value label to obtain the neural network.
After the image of the target area is acquired, it can be input into the pre-trained neural network, which outputs the visibility value of the target area. To train the neural network, sample images can be collected of scenes without fog (haze, dust, and the like) and of scenes with fog (haze, dust, and the like) of different concentrations; the visibility value of the region included in each sample image is calibrated in advance and used as the label of that sample image, and a preset initial neural network is then trained with the sample images carrying the calibrated visibility value labels to obtain the neural network. The calibrated visibility value of a sample image may be determined automatically by preset application programs or functional components: a sample image may be input into such a program or component, which automatically determines its visibility value, and the determined value may then be fed into the neural network, together with the sample image, as its label for training. In some embodiments, after the visibility value label of a sample image is determined, the sample image and the label may be stored in correspondence for subsequent training of the neural network.
When the real visibility value of the area included in the sample image is calibrated, in order to make the calibrated visibility value more accurate, the visibility value of the sample image can be determined from the depth information of pixel points in the sample image and the transmittance corresponding to those pixel points. The specific principle is as follows:
according to the Koschmieder equation, the visibility value V (in m) of the environment can be calculated according to equation (1):
Figure BDA0003225785170000091
where ε is the visual luminance contrast threshold, which can typically be taken as an empirical value of 0.02; beta is the extinction coefficient of the atmosphere, reflecting the light decay of the atmospheric medium during observation. From this, the atmospheric extinction coefficient β is inversely proportional to the visibility value.
Assuming that the atmosphere is a homogeneous medium, i.e., that the atmospheric extinction coefficient is the same everywhere, the atmospheric extinction coefficient β can be determined from formula (2):
t(x) = e^(-β·d(x))        formula (2)
where d(x) denotes the distance between the three-dimensional point corresponding to each pixel in the image and the image acquisition device (i.e., the depth information of each pixel), and t(x) denotes the transmittance corresponding to each pixel in the image. The transmittance represents the ratio of the light that is emitted by an object in space, passes through the various media in the air and finally reaches the photosensitive unit of the image acquisition device, to the light originally emitted by the object. The transmittance corresponding to each pixel is related to the distance between its corresponding three-dimensional point and the image acquisition device and to the content of the medium in the air: the greater the distance and the higher the medium content, the lower the transmittance. The medium can be any substance floating in the air that affects light transmission, such as fog, haze, or sand and dust.
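As an illustration of how formulas (1) and (2) combine, the following Python sketch computes a per-pixel visibility value from the pixel's depth and transmittance and averages it over a region; it is a minimal sketch, and the array names are illustrative rather than taken from the embodiments.

```python
import numpy as np

def pixel_visibility(depth, transmittance, eps=0.02):
    """Per-pixel visibility value from formulas (1) and (2).

    depth         -- d(x): distance from each pixel's 3-D point to the camera, in metres
    transmittance -- t(x): per-pixel transmittance, e.g. from a dark channel prior
    eps           -- visual luminance contrast threshold (empirical value 0.02)
    """
    beta = -np.log(transmittance) / depth   # formula (2): t(x) = exp(-beta * d(x))
    return np.log(1.0 / eps) / beta         # formula (1): V = ln(1/eps) / beta

# Example: calibrated visibility value of a sample image as the mean over a
# reference region (d_region and t_region are illustrative arrays).
# calibrated_visibility = pixel_visibility(d_region, t_region).mean()
```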
According to the above formulas, after a frame of sample image is obtained, to calculate the visibility value of the region included in the sample image, the depth information of pixel points in the image and the transmittance of those pixel points can be determined first. The pixel points can be some or all of the pixel points in the sample image, and can be chosen according to requirements.
For example, in some embodiments, the image acquisition device that captures the sample image may be equipped with a depth camera, so that after the sample image is captured, the depth information of pixel points in the sample image can be determined from the depth image captured by the depth camera. Alternatively, in some embodiments, a binocular camera may be used to collect the sample image: the relative pose between the two cameras is calibrated in advance, and the depth of the pixel corresponding to a three-dimensional point can then be determined from the disparity of that point between the two views. In some embodiments, the depth information of pixel points in the sample image may also be predicted by another pre-trained neural network.
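For the binocular case mentioned above, depth follows from the standard disparity relation d = f·B/disparity for a rectified stereo pair; the sketch below assumes the focal length and baseline are known from the prior calibration (names are illustrative).

```python
def stereo_depth(disparity_px, focal_length_px, baseline_m):
    """Depth (m) of a pixel from a rectified, calibrated binocular pair:
    disparity_px is the horizontal pixel offset of the same 3-D point
    between the left and right views."""
    return focal_length_px * baseline_m / disparity_px
```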
In scenes where haze, patchy fog and similar weather are detected so that traffic can be controlled according to the detected visibility value of a road, the road images are usually captured by image acquisition devices already installed along the road. To make direct use of the existing image acquisition devices in such cases, without requiring much additional hardware and thereby reducing cost, the internal and external parameters of the image acquisition device can be calibrated directly, and the depth information of the sample images captured by the device can be determined from these internal and external parameters. To calibrate the internal and external parameters, a target object can be included in the field of view of the image acquisition device, the target object including a plurality of coplanar, non-collinear target points with known relative positions. The internal parameters and the external parameters of the image acquisition device can then be calibrated based on these coplanar, non-collinear target points with known relative positions, and the depth information of the pixel points in the sample image can be determined from the internal parameters, the external parameters and the height of the image acquisition device.
The plurality of target points may be 4 or more target points, and "coplanar" means that the corresponding points lie exactly or approximately on one plane. The external parameters of the image acquisition device may be its external parameters relative to one of the target points. Based on these external parameters and the height of the image acquisition device, the horizontal distance between that target point and the device can be determined; the horizontal distance between any other pixel point whose depth is to be determined and that target point can then be determined, giving the horizontal distance between that pixel point and the device; finally, from this horizontal distance and the height of the image acquisition device, the distance between the pixel point's three-dimensional point and the device, i.e. its depth information, can be determined by the Pythagorean theorem.
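A minimal sketch of this back-projection is given below. It assumes the world frame is chosen so that the road is the plane z = 0 and that the intrinsics K and extrinsics (R, t) come from the calibration described above; the camera height is implicit in t. The function name is illustrative, not part of the embodiments.

```python
import numpy as np

def ground_plane_depth(u, v, K, R, t):
    """Back-project pixel (u, v) onto the road plane (world z = 0) and return
    the distance from its 3-D point to the camera centre, i.e. d(x)."""
    # Homography from road-plane coordinates (X, Y, 1) to the image: H = K [r1 r2 t]
    H = K @ np.column_stack((R[:, 0], R[:, 1], t))
    Xw = np.linalg.inv(H) @ np.array([u, v, 1.0])
    Xw = Xw / Xw[2]                        # (X, Y, 1) on the road plane
    point_w = np.array([Xw[0], Xw[1], 0.0])
    centre_w = -R.T @ t                    # camera centre in world coordinates
    return np.linalg.norm(point_w - centre_w)
```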
In some embodiments, the target object may be one placed or marked by a user in the field of view of the image acquisition device in advance, for example a marked box or marked points. Of course, in some embodiments, to avoid the inconvenience of manual marking, a target object whose size, dimensions, spacing and other parameters are already known may be selected directly within the field of view of the image acquisition device, and a plurality of target points with known relative positions determined from it.
Taking a scene of predicting visibility values on a road as an example: a road usually includes identification lines such as lane lines, zebra crossings and stop lines, and the lengths, widths and spacings of these identification lines are usually set according to fixed standards, so the relative positions of their end points are usually known. In some embodiments, therefore, the target object may be an identification line on the road and the target point may be an end point of the identification line. For example, as shown in fig. 2, on an expressway the dashed lane-line segments are usually 6 meters long with 9-meter gaps, and the lane width is 3.75 meters, so a rectangle formed by the dashed lane lines can be selected as the target object, its four vertices can be selected as the target points, and the relative position coordinates of the target points can be determined from parameters such as the length, width and spacing of the dashed lane lines.
In addition, roads typically include rectangular signs, and the signs are typically fixed in size, so that in some embodiments, the target object may be a sign and the target point may be a vertex of the sign. For another example, the size of the license plate of a vehicle is also fixed, so the target object may also be the license plate, and the target points may be four vertices of the license plate. For another example, the model size of a car is also fixed, so the target object may also be a car, and the target points may be the key points of the four wheels of the car.
It is understood that the internal parameters and the external parameters of the image acquisition device can be calibrated as long as the target object is located within the field of view of the image acquisition device and four or more target points with known relative positions can be determined from it. The internal and external parameters can be determined, for example, by Zhang's calibration method, which is not limited by the embodiments of the present disclosure.
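For illustration, the external parameters can be recovered from the four corners of one dashed lane-line rectangle with OpenCV's solvePnP, as sketched below. This assumes the intrinsics K are already known (for example from a prior calibration of the installed camera); a single view of four coplanar points cannot recover the intrinsics themselves, which is why the text refers to methods such as Zhang's calibration for the full case. The pixel coordinates and K values below are purely illustrative.

```python
import numpy as np
import cv2

# World coordinates (m) of the four corners of one dashed lane-line rectangle
# on the road plane z = 0: 6 m long, 3.75 m wide (values from the text).
object_pts = np.array([[0.0, 0.0, 0.0],
                       [6.0, 0.0, 0.0],
                       [6.0, 3.75, 0.0],
                       [0.0, 3.75, 0.0]], dtype=np.float32)

# Pixel coordinates of the same corners in the sample image (illustrative).
image_pts = np.array([[812, 640], [980, 590], [1105, 598], [930, 652]],
                     dtype=np.float32)

K = np.array([[1400.0, 0.0, 960.0],    # illustrative intrinsics
              [0.0, 1400.0, 540.0],
              [0.0, 0.0, 1.0]])
dist = np.zeros(5)                      # assume negligible lens distortion

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)
R, _ = cv2.Rodrigues(rvec)              # external parameters w.r.t. the road plane
```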
In some embodiments, when the visibility value of the sample image is determined from the depth information and transmittance of each pixel point, the visibility values corresponding to all pixel points in the sample image may be determined and then averaged to obtain the calibrated visibility value of the sample image. In some embodiments, to improve computational efficiency, visibility values may be computed only for some of the pixel points and then averaged to obtain the calibrated visibility value of the sample image.
If the internal and external parameters of the image acquisition device are calibrated from a plurality of coplanar, non-collinear target points with known relative positions, and the depth information of pixel points in the sample image is then solved from the calibrated internal and external parameters, only the depth of pixel points in the image region corresponding to the plane of the target points can be solved. For example, if the target points are end points of identification lines on the road surface, only the depth information of pixel points corresponding to the road surface region in the sample image can be solved. Therefore, in some embodiments, when the calibrated visibility value of the sample image is determined based on the depth information of pixel points and their corresponding transmittance, target pixel points may be determined from the sample image such that the three-dimensional points corresponding to the target pixel points are coplanar with the target points; the visibility value corresponding to each target pixel point may then be determined from its depth information and transmittance, and the average of the visibility values corresponding to the target pixel points is taken as the calibrated visibility value of the sample image.
In some embodiments, when the average of the visibility values corresponding to the target pixel points in the sample image is taken as the calibrated visibility value of the sample image, the target pixel points may be the pixel points corresponding to a reference region, which is a region within a certain distance range from the image acquisition device and which may be marked by the user in advance. Taking a road area as an example, extensive tests and verification show that using the visibility of the region 100-500 m away from the image acquisition device gives a more accurate prediction of the visibility value of the road area, so the user can mark out in advance the part of the road that is 100-500 m from the image acquisition device. After a sample image including the road area is obtained, the pixel region corresponding to that part can be identified in the sample image, the visibility values corresponding to its pixel points are determined from their transmittance and depth information, and these visibility values are averaged as the calibrated visibility value of the sample image.
In some embodiments, the pose of the image acquisition device may be changeable, for example when the device is mounted on a pan-tilt head and rotates with it. The sample image may include a plurality of target points with known relative positions in three-dimensional space. To calibrate the internal and external parameters of the image acquisition device and determine the depth information of pixel points in the sample image from them, the pose of the device may be changed to acquire multiple frames of images, each of which includes the target points with known relative positions in three-dimensional space; the sample image may be one of these frames. The relative position coordinates of the target points in three-dimensional space can be determined from their known relative positions, and the relative pose (external parameters) between any two of the frames and the internal parameters of the image acquisition device can be solved from these coordinates and the pixel coordinates of the target points in the frames, for example by a PnP (Perspective-n-Point) algorithm. After the relative pose and internal parameters are determined, the depth information of pixel points in the sample image can be determined based on the multiple frames and the solved relative pose and internal parameters.
Of course, because the visibility value of a sample image is difficult to calibrate, automatically computing it places certain requirements on the sample image, for example it needs to contain a target object with a plurality of target points whose relative positions are known. The number of sample images carrying visibility value labels and the scenes they contain are therefore limited to a certain extent, and it is difficult to obtain a large number of such sample images, whereas a neural network with high prediction accuracy needs to be trained on a large number of sample images of different scenes. Therefore, in some embodiments, to make the trained neural network more accurate, the sample images used for training may be of two types, main sample images and auxiliary sample images: a main sample image carries a calibrated visibility value label, and an auxiliary sample image carries a calibrated visibility category label, where each visibility category corresponds to a range of visibility values. For example, the visibility values corresponding to the auxiliary sample images may be divided into 6 categories, ≤ 50 m, ≤ 100 m, ≤ 200 m, ≤ 500 m, ≤ 1000 m and > 1000 m, and the auxiliary sample images can then be assigned manually to these categories based on experience. Because only a rough visibility range needs to be estimated for an auxiliary sample image, and its depth information and transmittance do not need to be determined, there is no restriction on its scene; the auxiliary sample images can include a wide variety of scenes and their number can be increased greatly. When the neural network is trained, a multi-task training mode can be adopted, and the main sample images and the auxiliary sample images can both be input into the initial neural network to train it and obtain the trained neural network.
In some embodiments, when the initial neural network is trained with the main sample images and the auxiliary sample images, for a main sample image the initial neural network predicts its visibility value to obtain a predicted visibility value, and a first loss is obtained from the difference between the predicted visibility value and the calibrated visibility value of the main sample image. For an auxiliary sample image, the initial neural network predicts the visibility category to which it belongs to obtain a category prediction result, and a second loss is determined based on this prediction result and the corresponding calibrated visibility category. The initial neural network can then be trained based on the first loss and the second loss.
In some embodiments, the initial neural network may be trained with the main and auxiliary sample images simultaneously. For example, after the first loss and the second loss are determined, a target loss may be determined from them, for example their sum or a weighted sum; since the main sample images are more important, the weight of the first loss may be set larger. The target loss is then used as the optimization objective, and the network parameters of the initial neural network are adjusted continuously to train the neural network.
In some embodiments, the initial neural network may instead first be trained with the auxiliary sample images, and the resulting network further trained with the main sample images. For example, the initial network parameters may be adjusted based on the second loss to obtain a network with more accurate parameters, and the parameters of that network may then be further optimized based on the first loss.
Since the prediction on the main sample images is a regression task, in some embodiments the first loss may be determined based on the mean absolute percentage error between the predicted visibility values and the calibrated visibility values of the main sample images.
Since the prediction on the auxiliary sample images is a classification task, in some embodiments the second loss may be determined as follows: the neural network outputs the predicted probability that an auxiliary sample image belongs to each predefined visibility category; the true probability that the auxiliary sample image belongs to each predefined visibility category is determined from its calibrated visibility category; and the second loss is determined from the cross-entropy loss between the predicted probabilities and the true probabilities. The predefined visibility categories are the categories represented by the labels of the auxiliary sample images; for example, if the labels cover 6 visibility categories, the predicted probabilities are the probabilities that the auxiliary sample image belongs to each of the 6 categories, the true probabilities are the corresponding ground-truth probabilities, and the second loss is the cross-entropy between the predicted probability distribution and the true probability distribution.
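The two losses and their weighted combination can be sketched as follows in PyTorch; this is an illustrative formulation, and the weight w is an assumed value rather than one specified in the embodiments.

```python
import torch
import torch.nn.functional as F

def first_loss(pred_value, calib_value):
    # mean absolute percentage error between the predicted and calibrated
    # visibility values of the main sample images (the regression task)
    return torch.mean(torch.abs(pred_value - calib_value) / calib_value)

def second_loss(pred_logits, calib_class):
    # cross-entropy between the predicted category probabilities and the
    # calibrated visibility category of the auxiliary sample images
    return F.cross_entropy(pred_logits, calib_class)

def target_loss(pred_value, calib_value, pred_logits, calib_class, w=0.7):
    # joint multi-task objective; w > 0.5 reflects that the main sample
    # images are weighted more heavily (illustrative value)
    return w * first_loss(pred_value, calib_value) + \
           (1.0 - w) * second_loss(pred_logits, calib_class)
```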
In some embodiments, the target area may be a road area and the image of the target area may be an image acquired by an image acquisition device arranged on the road. After the visibility value of the target area is determined with the pre-trained neural network and the image, the hazard level of the current foggy weather can be determined based on that visibility value, and a control strategy corresponding to the hazard level is selected to manage traffic on the road. For example, several levels for evaluating the degree of foggy-weather hazard may be preset, with different levels corresponding to different visibility value ranges, and each hazard level may have a preset control strategy for managing road traffic. The control strategy may be to issue dense-fog warnings to vehicles on the road, to prompt vehicles to keep their speed below a certain limit, or to close road intersections and prohibit vehicles from passing, among others.
In some embodiments, the target area may be a road area and the image of the target area may be multiple frames of images acquired at different times by an image acquisition device arranged on the road; after the visibility values corresponding to the frames are determined by the neural network, the trend of the foggy weather can be predicted from these values. For example, one frame of the road area may be acquired every 30 min and its visibility value predicted by the neural network, giving the trend of the road's visibility value over a period of time, from which it can be predicted whether the fog concentration on the road is gradually increasing or decreasing, i.e. the trend of the foggy weather.
Of course, patchy fog is driven by the microclimate of a local area: fog with much lower visibility appears within a limited stretch of a larger fog, visibility drops suddenly and sharply when it occurs, and it is difficult to forecast, so it is a serious hazard to road traffic safety and readily causes severe accidents, especially on expressways. Therefore, in some embodiments, the target area may be a road area and the image of the target area may be images acquired by an image acquisition device arranged on the road at preset time intervals, for example one or more frames every hour; the visibility values corresponding to these frames, determined by the neural network, are then used to decide whether patchy fog is currently occurring. For example, for each frame it may be determined whether its visibility value exceeds a preset threshold, and if the visibility value of every frame, or of a certain number of frames, exceeds the preset threshold, it is determined that patchy fog is currently occurring in the road area. Alternatively, the average of the visibility values of the frames may be computed and patchy fog declared when the average exceeds the preset threshold. By counting the total number of times patchy fog occurs within a target time period, the frequency of patchy fog in a given area can be estimated and used to study the climatic pattern of that area.
For example, one or more frames of images of the road area may be collected every 2 hours and their visibility values predicted with the neural network; whether patchy fog occurs in the road area is then determined from whether those visibility values exceed the preset threshold, and by counting the total number of occurrences in one day, the daily frequency of patchy fog is obtained.
To further explain the method for determining regional visibility values provided by the embodiments of the present disclosure, the following is explained with reference to a specific embodiment.
In general, weather such as fog and haze lowers visibility on roads and easily causes traffic accidents. As shown in fig. 3, to monitor visibility values on a road and warn and manage vehicles based on those values, an image acquisition device 31 arranged along the road can acquire images of the road area and send them to a server 32 communicatively connected to it; the server 32 is deployed with a pre-trained neural network, so the visibility value of the road area can be predicted in real time, and traffic on the road is then managed based on the visibility value.
The process for predicting the visibility value of the road area specifically comprises the following steps:
1. a neural network training stage:
a large number of images that do not include foggy day scenes and that include foggy day scenes of different concentrations may be collected as sample images for training the neural network. When collecting the sample images, two types of sample images may be collected, one being a main sample image and one being an auxiliary sample image. And calibrating the corresponding visibility value of the main sample image. For the auxiliary sample images, it is only necessary to determine the visibility categories to which the auxiliary sample images belong without calibrating the specific visibility values of the auxiliary sample images, where each visibility category corresponds to one visibility value range, for example, the auxiliary sample images may be divided into 6 categories, which are: less than or equal to 50m, less than or equal to 100m, less than or equal to 200m, less than or equal to 500m, less than or equal to 1000m and more than 1000 m.
The main sample images may include road images acquired by an image acquisition device provided on the road. To calibrate the visibility value of a main sample image, the depth information of its pixel points and their transmittance are determined first. The determination of depth information and transmittance is described below:
(1) determination of depth information of pixel points in observation region in main sample image
The road surface usually includes lane lines whose length, width and spacing are fixed; for example, the rectangle formed by dashed lane lines on an expressway is generally 6 m long and 3.75 m wide, with 9 m gaps. The rectangle formed by the dashed lane lines can therefore be used to calibrate the internal and external parameters of the image acquisition device, for example by Zhang's calibration method. Because the calibrated external parameters are the relative position parameters between the image acquisition device and a vertex of the dashed lane-line rectangle, when depth is determined from the internal and external parameters, only the depth of the road surface (i.e., the region coplanar with the vertices of the rectangle) can be solved. The user can mark in advance the part of the road surface that is 100-500 m away from the image acquisition device as the observation region, and the depth information of each pixel point corresponding to the observation region can then be determined from the internal and external parameters of the image acquisition device.
(2) Determination of the transmittance of pixel points in the observation region of the main sample image
The transmittance of pixel points in the sample image can be determined by a dark channel prior method: for example, the dark channel pixel value of the neighbourhood of each pixel point can be computed with a dark channel algorithm, and the transmittance corresponding to the pixel points is then determined from the dark channel pixel values.
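A common formulation of the dark channel prior (following He et al.) is sketched below; the patch size, the omega parameter and the atmospheric-light heuristic are typical defaults rather than values specified in the embodiments.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    # img: H x W x 3 array scaled to [0, 1]; take the minimum over the colour
    # channels and then a local minimum over a patch x patch neighbourhood
    return minimum_filter(img.min(axis=2), size=patch)

def estimate_transmittance(img, patch=15, omega=0.95):
    dark = dark_channel(img, patch)
    # atmospheric light A: mean colour of the brightest 0.1% dark-channel pixels
    n = max(1, int(dark.size * 0.001))
    idx = np.unravel_index(np.argsort(dark, axis=None)[-n:], dark.shape)
    A = img[idx].mean(axis=0)
    # t(x) = 1 - omega * dark_channel(I(x) / A)
    return 1.0 - omega * dark_channel(img / A, patch)
```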
After the depth information and transmittance of the pixel points in the observation region of a main sample image are determined, the visibility value at the position of the three-dimensional point corresponding to each pixel point can be determined from its depth and transmittance, and these visibility values are then averaged to obtain the visibility value of the main sample image.
The auxiliary sample images may include various foggy scenes, and the visibility category of each may be determined directly from human experience.
After the labels of the main sample images and the auxiliary sample images are determined, they may be used to train the neural network. The structure of the neural network is shown in fig. 4: a ResNet18 network may be chosen as the base convolutional network, comprising 5 convolutional stages (conv1, conv2, ..., conv5), and 3 fully connected layers (fc1, fc2, fc3) may be attached after the conv5 stage. The output dimension of fully connected layer fc1 is 6 and its activation function may be tanh; the output dimension of fc2 is 64 and its activation function is prelu; the output dimension of fc3 is 1 and its activation function is prelu.
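The structure described above can be sketched in PyTorch as follows. The text does not state which layer produces the 6-way category output for the auxiliary task, so treating fc1's 6-dimensional output as those logits is an assumption of this sketch.

```python
import torch
import torch.nn as nn
import torchvision

class VisibilityNet(nn.Module):
    """Sketch of fig. 4: ResNet18 backbone (conv1..conv5 plus global pooling)
    followed by fc1 (6, tanh), fc2 (64, prelu) and fc3 (1, prelu)."""
    def __init__(self):
        super().__init__()
        backbone = torchvision.models.resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop the original fc
        self.fc1 = nn.Linear(512, 6)
        self.fc2 = nn.Linear(6, 64)
        self.fc3 = nn.Linear(64, 1)
        self.prelu2 = nn.PReLU()
        self.prelu3 = nn.PReLU()

    def forward(self, x):
        feat = self.features(x).flatten(1)      # B x 512
        cls_logits = self.fc1(feat)             # 6-way output (auxiliary task, assumed)
        h = self.prelu2(self.fc2(torch.tanh(cls_logits)))
        value = self.prelu3(self.fc3(h))        # predicted visibility value (main task)
        return value.squeeze(1), cls_logits
```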
For the main sample images, a first loss function can be obtained from the mean absolute percentage error between the predicted visibility value and the calibrated visibility value. Assuming that the predicted visibility value of the i-th main sample image is ŷᵢ and its calibrated visibility value is yᵢ, the first loss function L_mape over N main sample images can be determined by the following formula (3):
L_mape = (1/N) · Σᵢ |ŷᵢ − yᵢ| / yᵢ        formula (3)
For the auxiliary sample images, a second loss function can be obtained using the cross-entropy loss between the predicted result and the true result.
A target loss function is then determined from the first and second loss functions and used as the optimization objective for training to obtain the neural network. Alternatively, the network parameters may first be adjusted based on the second loss function, and the parameters of the resulting network further optimized with the first loss function to obtain the final trained neural network.
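The two training schemes in the preceding paragraph can be sketched as below, reusing VisibilityNet, first_loss and second_loss from the sketches above; the data loaders, learning rate and the choice of pre-training on the auxiliary images first are illustrative assumptions.

```python
import torch

model = VisibilityNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Stage 1: adjust the initial parameters with the auxiliary sample images
# (classification, second loss).
for images, calib_class in aux_loader:            # hypothetical DataLoader
    _, cls_logits = model(images)
    loss = second_loss(cls_logits, calib_class)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: further optimise with the main sample images (regression, first loss).
for images, calib_value in main_loader:           # hypothetical DataLoader
    pred_value, _ = model(images)
    loss = first_loss(pred_value, calib_value)
    opt.zero_grad(); loss.backward(); opt.step()
```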
2. A neural network prediction stage:
After the neural network is trained, an image of the target area whose visibility is to be detected can be obtained and input into the pre-trained neural network, which outputs the visibility value of the target area. After the visibility value of the target area is determined, traffic on the road can be managed based on it, for example by issuing dense-fog warnings to vehicles, closing road intersections, prohibiting vehicles from passing, and so on, to avoid traffic accidents caused by low visibility in severe fog. Of course, the trend of the foggy weather, the frequency of patchy fog and the like can also be determined from the trend of the target area's visibility value over a period of time.
Corresponding to the above method, an embodiment of the present disclosure further provides a device for detecting a visibility value, as shown in fig. 5, where the device 50 includes:
an acquisition module 51, configured to acquire an image of a target area;
a prediction module 52, configured to determine a visibility value of the target area through a pre-trained neural network and the image; wherein the neural network is trained based on:
acquiring a sample image;
determining a calibrated visibility value of the sample image based on the depth information of the sample image and the conduction rate of the sample image, wherein the calibrated visibility value is used as a label of the sample image; the conduction rate is related to the depth information of the sample image and the content of a medium in the air in the sample image;
and training a preset initial neural network through the sample image carrying the calibration visibility value label to obtain the neural network.
In some embodiments, the sample image includes a target object including a plurality of target points having known relative positions in three-dimensional space, which are coplanar and non-collinear;
the depth information of the sample image is determined based on:
calibrating internal parameters of an image acquisition device for acquiring the sample image and external parameters of the image acquisition device based on the target point;
and determining the depth information of the pixel points in the sample image according to the internal parameters, the external parameters and the height of the image acquisition device.
In some embodiments, the target object comprises an identification line in a road, the target point comprising an end point of the identification line; or
The target object includes a sign in a roadway, and the target point includes a vertex of the sign.
In some embodiments, the calibrated visibility value of the sample image is determined based on the depth information of the sample image and the conduction rate of the sample image by:
determining a target pixel point from the sample image, wherein a three-dimensional point corresponding to the target pixel point is coplanar with the target point, and the depth information of the target pixel point is in a preset depth range;
determining a visibility value corresponding to the target pixel point based on the depth information of the target pixel point and the conduction rate of the target pixel point;
and taking the average value of the visibility values corresponding to the target pixel points as the calibrated visibility value of the sample image.
In some embodiments, the conduction rate is determined by a dark channel prior method.
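A sketch of a standard dark-channel-prior transmission estimate (in the style of He et al.) is given below; the patch size, omega and atmospheric-light selection are conventional defaults used only for illustration and are not necessarily the exact procedure of this disclosure.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(image, patch=15):
    """Dark channel: per-pixel minimum over RGB, then a local minimum filter."""
    return minimum_filter(image.min(axis=2), size=patch)

def estimate_transmission(image, patch=15, omega=0.95, top_fraction=0.001):
    """Dark-channel-prior transmission estimate for an HxWx3 float image in [0, 1].

    The atmospheric light A is taken from the brightest pixels of the dark
    channel; the transmission is t = 1 - omega * dark_channel(I / A).
    """
    dc = dark_channel(image, patch)
    n = max(1, int(dc.size * top_fraction))
    idx = np.unravel_index(np.argsort(dc, axis=None)[-n:], dc.shape)
    A = image[idx].max(axis=0)                              # atmospheric light per channel
    t = 1.0 - omega * dark_channel(image / np.maximum(A, 1e-6), patch)
    return np.clip(t, 0.05, 1.0)
```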
In some embodiments, the training of the preset initial neural network through the sample image carrying the calibrated visibility value tag includes:
the initial neural network is trained through a main sample image and an auxiliary sample image, wherein the main sample image is a sample image carrying a label of a calibrated visibility value, the auxiliary sample image is a sample image carrying a label of a calibrated visibility class, and different calibrated visibility classes correspond to different visibility value ranges.
In some embodiments, the training of the initial neural network by the main and auxiliary sample images comprises:
determining a first loss according to the difference between a predicted visibility value corresponding to the main sample image output by the initial neural network and a calibrated visibility value corresponding to the main sample image;
determining a second loss according to a prediction result of the visibility category to which the auxiliary sample image belongs and the calibrated visibility category of the auxiliary sample image, wherein the prediction result is output by the initial neural network;
training the initial neural network based on the first loss and the second loss.
In some embodiments, said training said initial neural network based on said first loss and said second loss comprises:
training the initial neural network based on the target loss determined by the first loss and the second loss; or,
and training the initial neural network based on the first loss after the initial neural network is trained based on the second loss.
In some embodiments, the first loss is determined based on:
determining the first loss according to the mean absolute percentage error between the predicted visibility value corresponding to the main sample image and the calibrated visibility value corresponding to the main sample image; and/or,
the second loss is determined based on:
determining the prediction probability of the auxiliary sample image belonging to each visibility category defined in advance according to the neural network;
determining the real probability of the auxiliary sample image belonging to each predefined visibility category according to the calibrated visibility category of the auxiliary sample image;
determining the second loss based on the cross-entropy loss of the predicted probability and the true probability.
In some embodiments, the target area comprises a road area, the image comprises an image captured by an image capturing device disposed on the road, the device is further configured to:
determining the hazard level of the foggy weather based on the visibility value of the target area;
and managing and controlling the traffic in the road area according to a management and control strategy corresponding to the hazard level.
In some embodiments, the target area includes a road area, the image includes multiple frames of images acquired at different times by an image acquisition device disposed on the road, and the device is further configured to:
and predicting the change trend of the foggy weather in the road area based on the change trend of the visibility value of the target area in the multi-frame image.
In some embodiments, the target area includes a road area, the image includes an image captured by an image capturing device disposed on the road at a preset time interval, and the device is further configured to:
in response to the visibility value corresponding to the image exceeding a preset threshold, determining that foggy weather occurs in the road area;
determining the frequency of occurrence of foggy weather in the road area based on the total number of times foggy weather occurs in the road area within a target time period.
In addition, an embodiment of the present disclosure further provides an electronic device. As shown in fig. 6, the electronic device includes a processor 61, a memory 62, and computer instructions stored in the memory 62 and executable by the processor 61; when the processor 61 executes the computer instructions, the method according to any of the foregoing embodiments is implemented.
The embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the method of any of the foregoing embodiments.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer readable media do not include transitory computer readable media such as modulated data signals and carrier waves.
From the above description of the embodiments, it is clear to those skilled in the art that the embodiments of the present disclosure can be implemented by means of software plus a necessary general hardware platform. Based on such understanding, the technical solutions of the embodiments of the present specification may be essentially or partially embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments of the present specification.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
The embodiments in the present specification are described in a progressive manner, the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, since the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly, and reference may be made to the descriptions of the method embodiment for relevant points. The above-described apparatus embodiments are merely illustrative, and the modules described as separate components may or may not be physically separate; the functions of the modules may be implemented in one or more pieces of software and/or hardware when implementing the embodiments of the present disclosure. Part or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement this without inventive effort.
The foregoing is only a specific embodiment of the embodiments of the present disclosure. It should be noted that, for those skilled in the art, various modifications and improvements can be made without departing from the principle of the embodiments of the present disclosure, and these modifications and improvements should also be regarded as falling within the protection scope of the embodiments of the present disclosure.

Claims (15)

1. A method for detecting visibility values, the method comprising:
acquiring an image of a target area;
determining the visibility value of the target area through a pre-trained neural network and the image; wherein the neural network is trained based on:
acquiring a sample image;
determining a calibrated visibility value of the sample image based on the depth information of the sample image and the conduction rate of the sample image, wherein the calibrated visibility value is used as a label of the sample image; the conduction rate is related to the depth information of the sample image and the content of a medium in the air in the sample image;
and training a preset initial neural network through the sample image carrying the calibration visibility value label to obtain the neural network.
2. The method according to claim 1, wherein the sample image includes a target object, and the target object includes a plurality of target points with known relative positions, which are coplanar and non-collinear in a three-dimensional space;
the depth information of the sample image is determined based on:
calibrating internal parameters of an image acquisition device for acquiring the sample image and external parameters of the image acquisition device based on the target point;
and determining the depth information of the pixel points in the sample image according to the internal parameters, the external parameters and the height of the image acquisition device.
3. The method of claim 2, wherein the target object comprises an identification line in a road, and the target point comprises an end point of the identification line; or
The target object includes a sign in a roadway, and the target point includes a vertex of the sign.
4. The method of claim 2 or 3, wherein the calibrated visibility value of the sample image is determined based on the depth information of the sample image and the conduction rate of the sample image by:
determining a target pixel point from the sample image, wherein a three-dimensional point corresponding to the target pixel point is coplanar with the target point, and the depth information of the target pixel point is in a preset depth range;
determining a visibility value corresponding to the target pixel point based on the depth information of the target pixel point and the conduction rate of the target pixel point;
and taking the average value of the visibility values corresponding to the target pixel points as the calibrated visibility value of the sample image.
5. The method of any of claims 1-4, wherein the conduction rate is determined by a dark channel prior method.
6. The method according to any one of claims 1 to 5, wherein the training of the preset initial neural network by the sample image carrying the calibrated visibility value tag comprises:
the initial neural network is trained through a main sample image and an auxiliary sample image, wherein the main sample image is a sample image carrying a label of a calibrated visibility value, the auxiliary sample image is a sample image carrying a label of a calibrated visibility class, and different calibrated visibility classes correspond to different visibility value ranges.
7. The method of claim 6, wherein training the initial neural network with the main and auxiliary sample images comprises:
determining a first loss according to the difference between a predicted visibility value corresponding to the main sample image output by the initial neural network and a calibrated visibility value corresponding to the main sample image;
determining a second loss according to a prediction result of the visibility category to which the auxiliary sample image belongs and the calibrated visibility category of the auxiliary sample image, wherein the prediction result is output by the initial neural network;
training the initial neural network based on the first loss and the second loss.
8. The method of claim 7, wherein training the initial neural network based on the first and second losses comprises:
training the initial neural network based on the target loss determined by the first loss and the second loss; or,
and training the initial neural network based on the first loss after the initial neural network is trained based on the second loss.
9. The method of claim 7, wherein the first loss is determined based on:
determining the first loss according to the mean absolute percentage error between the predicted visibility value corresponding to the main sample image and the calibrated visibility value corresponding to the main sample image; and/or,
the second loss is determined based on:
determining the prediction probability of the auxiliary sample image belonging to each visibility category defined in advance according to the neural network;
determining the real probability of the auxiliary sample image belonging to each predefined visibility category according to the calibrated visibility category of the auxiliary sample image;
determining the second loss based on the cross-entropy loss of the predicted probability and the true probability.
10. The method according to any one of claims 1-9, wherein the target area comprises a road area, the image comprises an image captured by an image capturing device disposed on the road, the method further comprising:
determining the hazard level of the foggy weather based on the visibility value of the target area;
and managing and controlling the traffic in the road area according to a management and control strategy corresponding to the hazard level.
11. The method according to any one of claims 1 to 10, wherein the target area comprises a road area, the image comprises multiple frames of images acquired at different times by an image acquisition device arranged on the road, and the method further comprises:
and predicting the change trend of the foggy weather in the road area based on the change trend of the visibility value of the target area in the multi-frame image.
12. The method according to any one of claims 1 to 11, wherein the target area comprises a road area, the image comprises an image captured by an image capturing device provided on a road at preset time intervals, and the method comprises:
in response to the visibility value corresponding to the image exceeding a preset threshold, determining that foggy weather occurs in the road area;
determining the frequency of occurrence of foggy weather in the road area based on the total number of times foggy weather occurs in the road area within a target time period.
13. A device for detecting visibility values, characterized in that it comprises:
the acquisition module is used for acquiring an image of a target area;
the prediction module is used for determining the visibility value of the target area through a pre-trained neural network and the image; wherein the neural network is trained based on:
acquiring a sample image;
determining a calibrated visibility value of the sample image based on the depth information of the sample image and the conduction rate of the sample image, wherein the calibrated visibility value is used as a label of the sample image; the conduction rate is related to the depth information of the sample image and the content of a medium in the air in the sample image;
and training a preset initial neural network through the sample image carrying the calibration visibility value label to obtain the neural network.
14. An electronic device, comprising a processor, a memory, and computer instructions stored in the memory for execution by the processor, the computer instructions when executed by the processor implementing the method of any of claims 1-12.
15. A computer-readable storage medium having computer instructions stored thereon that, when executed, implement the method of any one of claims 1-12.
CN202110970613.8A 2021-08-23 2021-08-23 Visibility value detection method, device, equipment and storage medium Pending CN113658275A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110970613.8A CN113658275A (en) 2021-08-23 2021-08-23 Visibility value detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110970613.8A CN113658275A (en) 2021-08-23 2021-08-23 Visibility value detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113658275A true CN113658275A (en) 2021-11-16

Family

ID=78481603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110970613.8A Pending CN113658275A (en) 2021-08-23 2021-08-23 Visibility value detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113658275A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10063763A1 (en) * 2000-12-21 2002-07-25 Daimler Chrysler Ag Motor vehicle navigation system having means for predicting traffic conditions in an approaching road section when the driver will be there, rather than merely informing him of current conditions
CN103630496A (en) * 2013-12-12 2014-03-12 南京大学 Traffic video visibility detecting method based on road surface brightness and least square approach
WO2020232710A1 (en) * 2019-05-23 2020-11-26 深圳大学 Haze image quality evaluation method and system, storage medium, and electronic device
CN111145120A (en) * 2019-12-26 2020-05-12 上海眼控科技股份有限公司 Visibility detection method and device, computer equipment and storage medium
CN112017243A (en) * 2020-08-26 2020-12-01 大连信维科技有限公司 Medium visibility identification method
CN112365476A (en) * 2020-11-13 2021-02-12 南京信息工程大学 Fog visibility detection method based on dual-channel deep network
CN112419272A (en) * 2020-11-24 2021-02-26 湖北工业大学 Method and system for quickly estimating visibility of expressway in foggy weather
CN112801195A (en) * 2021-02-04 2021-05-14 四川望村网络科技有限公司 Deep learning-based fog visibility prediction method, storage device and server
CN112802108A (en) * 2021-02-07 2021-05-14 上海商汤科技开发有限公司 Target object positioning method and device, electronic equipment and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI, Jia; AN, Suxia; HAN, Zhenyu; SU, Zhen: "Design of a visibility meter based on the forward scattering principle" (基于前向散射原理的能见度仪设计), 科技与创新 (Science and Technology & Innovation), no. 10, 25 May 2019 (2019-05-25) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023123869A1 (en) * 2021-12-30 2023-07-06 上海商汤智能科技有限公司 Visibility value measurement method and apparatus, device, and storage medium

Similar Documents

Publication Publication Date Title
CN109686088B (en) Traffic video alarm method, equipment and system
Negru et al. Image based fog detection and visibility estimation for driving assistance systems
US11481991B2 (en) System and method for detecting and transmitting incidents of interest of a roadway to a remote server
CN101281142B (en) Method for measuring atmosphere visibility
US9436997B2 (en) Estimating rainfall precipitation amounts by applying computer vision in cameras
CN113850123A (en) Video-based road monitoring method and device, storage medium and monitoring system
CN113593250A (en) Illegal parking detection system based on visual identification
CN115311354B (en) Foreign matter risk area identification method, device, equipment and storage medium
US11748664B1 (en) Systems for creating training data for determining vehicle following distance
CN111651712A (en) Method and system for evaluating complexity of test scene of intelligent automobile
CN112084892B (en) Road abnormal event detection management device and method thereof
CN113051980A (en) Video processing method, device, system and computer readable storage medium
JP2020160840A (en) Road surface defect detecting apparatus, road surface defect detecting method, road surface defect detecting program
CN115834838A (en) Method, device and medium for monitoring in tunnel
CN113658275A (en) Visibility value detection method, device, equipment and storage medium
Matsuda et al. A system for real-time on-street parking detection and visualization on an edge device
CN114842380B (en) Fire monitoring method, device, system, memory and processor
CN113449541A (en) Data processing method, equipment and system
JP7367849B2 (en) Deterioration diagnosis device, deterioration diagnosis system, deterioration diagnosis method, and program
Duthon et al. Benchmark for the robustness of image features in rainy conditions
CN202887450U (en) Taxi anti-fake system
CN113220805B (en) Map generation device, recording medium, and map generation method
Meng et al. Highway visibility detection method based on surveillance video
MUNAJAT et al. A NEW METHOD FOR ANALYZING CONGESTION LEVELS BASED ON ROAD DENSITY AND VEHICLE SPEED.
Negru et al. Fog assistance on smart mobile devices

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination