CN114005104A - Intelligent driving method and device based on artificial intelligence and related products - Google Patents

Intelligent driving method and device based on artificial intelligence and related products

Info

Publication number
CN114005104A
Authority
CN
China
Prior art keywords
automobile
determining
action
time
dangerous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202111283225.9A
Other languages
Chinese (zh)
Inventor
艾的梦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Chuang Le Hui Technology Co ltd
Original Assignee
Shenzhen Chuang Le Hui Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Chuang Le Hui Technology Co ltd filed Critical Shenzhen Chuang Le Hui Technology Co ltd
Priority to CN202111283225.9A priority Critical patent/CN114005104A/en
Publication of CN114005104A publication Critical patent/CN114005104A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Traffic Control Systems (AREA)

Abstract

The embodiments of the application provide an intelligent driving method and device based on artificial intelligence and a related product, wherein the method comprises the following steps: acquiring image frames collected by a camera installed in an automobile when the automobile is detected to be running; performing action recognition on a target human body in the image frames collected by the camera to obtain an action recognition result of the target human body; acquiring road condition data of the road surface on which the automobile is currently running, and determining candidate dangerous actions according to the road condition data; when it is determined according to the action recognition result that the target human body performs a dangerous action among the candidate dangerous actions and the time for which the dangerous action has been performed exceeds a preset duration, determining the danger level of the dangerous action and the duration of the dangerous action; determining an alarm level according to the danger level and the duration; and determining alarm information according to the alarm level, and outputting the alarm information through the alarm device of the automobile.

Description

Intelligent driving method and device based on artificial intelligence and related products
Technical Field
The application relates to the technical field of image processing, in particular to an intelligent driving method and device based on artificial intelligence and a related product.
Background
With the development of traffic systems, driving automobiles has become increasingly common and the number of automobiles keeps growing. Driving on the road must follow certain traffic rules, but because an automobile is operated manually, improper manual operation may cause dangerous traffic accidents.
Disclosure of Invention
The embodiments of the application provide an intelligent driving method and device based on artificial intelligence and a related product, so as to improve driving safety.
An artificial intelligence based intelligent driving method, the method comprising:
the method comprises the steps of acquiring image frames collected by a camera installed in an automobile when the automobile is detected to run;
performing action recognition on a target human body in an image frame acquired by the camera to obtain an action recognition result of the target human body;
acquiring road condition data of a road surface on which the automobile currently runs, and determining candidate dangerous actions according to the road condition data;
when it is determined according to the action recognition result that the target human body performs a dangerous action among the candidate dangerous actions, and the time for which the dangerous action has been performed exceeds a preset duration, determining the danger level of the dangerous action and the duration of the dangerous action;
determining an alarm level according to the danger level and the duration;
and determining alarm information according to the alarm grade, and outputting the alarm information through the alarm equipment of the automobile.
Further, the acquiring an image frame collected by a camera installed in the automobile when the automobile is detected to be running includes:
continuously acquiring running speed data of the automobile after the automobile is started;
when the duration that the driving speed data is continuously greater than the first speed threshold exceeds a first duration threshold, starting to acquire image frames acquired by a camera installed in an automobile;
and when the duration of the running speed data which is continuously less than the second speed threshold exceeds the second duration threshold, stopping acquiring the image frames acquired by the camera arranged in the automobile.
Further, the obtaining road condition data of the road surface on which the automobile currently runs and determining candidate dangerous actions according to the road condition data includes:
acquiring road condition data of a road surface on which the automobile currently runs and driving data of the automobile;
determining a first candidate dangerous action according to the road condition data, and determining a second candidate dangerous action according to the driving data;
and obtaining the candidate dangerous action according to the first candidate dangerous action and the second candidate dangerous action.
Further, the determining a first candidate dangerous action according to the traffic data includes:
acquiring the road type, the congestion condition and pedestrian data of the road surface on which the automobile currently runs, and determining a road condition danger coefficient according to the road type, the congestion condition and the pedestrian data;
determining a first candidate dangerous action according to the road condition danger coefficient;
the determining a second candidate dangerous action according to the driving data comprises:
acquiring the continuous running time length and the average running speed of the automobile, and determining a driving risk coefficient according to the continuous running time length and the average running speed;
and determining a second candidate dangerous action according to the driving danger coefficient.
Further, the determining an alarm level according to the risk level and the duration includes:
when the danger level exceeds a preset level and the duration exceeds a preset duration, determining an automobile control parameter according to the danger level and the duration, and controlling the driving performance of the automobile according to the automobile control parameter;
and when the danger level does not exceed a preset level or the duration does not exceed a preset duration, determining an alarm level according to the danger level and the duration.
Further, the motion recognition of the target human body in the image frame acquired by the camera to obtain a motion recognition result of the target human body includes:
extracting spatial interactive characteristics through a spatial flow convolution neural network aiming at the image frames collected by the camera, and extracting global spatial discriminative characteristics by utilizing a bidirectional LSTM;
extracting time interactive features through a time flow convolutional neural network, extracting global time features from the time interactive features through a three-dimensional convolutional neural network, and constructing a time attention model guided by optical flow to calculate global time discriminative features according to the global time features;
performing classification processing according to the global time discriminative feature to obtain a first classification result, and performing classification processing according to the global space discriminative feature to obtain a second classification result;
and fusing the first classification result and the second classification result to obtain a fusion classification result, and obtaining an action recognition result of the target human body according to the fusion classification result.
Further, the extracting the spatial interactivity features through a spatial stream convolutional neural network comprises:
inputting the image frame into a behavior significance detection network model to obtain a detection result, and obtaining a spatial interactivity characteristic according to the detection result;
constructing a mask-guided spatial attention model according to the image frame and the spatial interactive characteristics to obtain spatial discriminative characteristics;
determining a spatial interactivity characteristic according to the temporal attention weight and the spatial discriminative characteristic;
the method comprises the steps of extracting time interactive features through a time flow convolution neural network, extracting global time features from the time interactive features through a three-dimensional convolution neural network, and constructing a time attention model guided by an optical flow to calculate global time discriminative features according to the global time features, and comprises the following steps:
performing optical flow calculation on the shot image through a TVNet network to obtain an optical flow frame;
weighting the obtained optical flow frame according to the spatial attention weight to obtain the time interactive feature;
extracting global time characteristics from the time interactive characteristics through a three-dimensional convolutional neural network;
inputting the global time characteristic into a time attention model guided by optical flow to obtain a time attention weight, and weighting the global time characteristic through the time attention weight to obtain a global time discriminative characteristic;
the method for fusing the first classification result and the second classification result comprises the following steps:
S_r = ((1 + C_1^2) / (1 + C_2^2)) * S_1 + (1 - (1 + C_1^2) / (1 + C_2^2)) * S_2
where S_1 represents the first classification result, S_2 represents the second classification result, S_r represents the fusion classification result, and C_1 and C_2 represent variables defined during the fusion, with C_1 less than or equal to C_2.
An intelligent driving apparatus based on artificial intelligence, the apparatus comprising:
the system comprises an image acquisition module, a data acquisition module and a data processing module, wherein the image acquisition module is used for acquiring image frames acquired by a camera arranged in an automobile in the process of detecting that the automobile runs;
the action recognition module is used for carrying out action recognition on a target human body in the image frame acquired by the camera to obtain an action recognition result of the target human body;
the danger identification module is used for acquiring road condition data of a road surface on which the automobile runs currently and determining candidate dangerous actions according to the road condition data;
the duration obtaining module is used for determining the danger level of the dangerous action and the duration of the dangerous action when it is determined according to the action recognition result that the target human body performs a dangerous action among the candidate dangerous actions and the time for which the dangerous action has been performed exceeds a preset duration;
the grade acquisition module is used for determining an alarm grade according to the danger grade and the duration;
and the information output module is used for determining alarm information according to the alarm grade and outputting the alarm information through the alarm equipment of the automobile.
An electronic device comprising a memory having computer-executable instructions stored thereon and a processor that implements the method when executing the computer-executable instructions on the memory.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the above-mentioned method.
According to the intelligent driving method and device based on artificial intelligence and the related products, the action of the driver can be identified according to the image collected by the camera, the dangerous action of the driver is determined according to the current driving road condition data and the identified action, and corresponding alarm information is output according to the execution duration and the danger level of the dangerous action so as to prompt the dangerous action of the user and reduce the driving risk.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below.
FIG. 1 is a flow diagram illustrating an artificial intelligence based intelligent driving method in one embodiment.
Fig. 2 is a schematic structural diagram of an intelligent driving device based on artificial intelligence in one embodiment.
Fig. 3 is a schematic diagram of a network structure for performing motion recognition on a target human body in one embodiment.
FIG. 4 is a diagram illustrating the hardware components of an artificial intelligence based intelligent driving system in one embodiment.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
FIG. 1 is a flow diagram illustrating an artificial intelligence based intelligent driving method in one embodiment. The intelligent driving method based on artificial intelligence comprises the following steps:
and 102, acquiring image frames collected by a camera installed in the automobile when the automobile is detected to run.
In the embodiments provided by the application, a camera can be installed inside the automobile; the camera remains on while the automobile is running, and image frames of the driver are collected through the camera. Monitoring of the driver is realized through the image frames captured of the driver.
And 104, performing motion recognition on the target human body in the image frame acquired by the camera to obtain a motion recognition result of the target human body.
By recognizing the images collected by the camera, the target human body in the image, namely the human body contour of the driver, is identified. Action recognition is then performed on the target human body to obtain an action recognition result.
In this embodiment, the specific algorithm for action recognition is not limited, and action recognition of the target human body in the image may be implemented by any method. For example, it may be recognized that the driver is performing actions such as "hands off the steering wheel", "smoking", "making a call", and the like.
And step 106, acquiring road condition data of the road on which the automobile runs currently, and determining candidate dangerous actions according to the road condition data.
Specifically, the road condition data represents the road condition of the road on which the vehicle is currently driving. For example, the current road is an expressway or an urban road, the congestion degree of the traffic of the current road section, the road surface flatness of the current road, and the like.
Candidate dangerous actions are determined according to the road condition data. For example, when the vehicle runs at high speed and the danger level is higher, more categories of candidate dangerous actions are set; for urban roads or roads with a low driving speed, fewer categories of candidate dangerous actions are set.
And step 108, when it is determined according to the action recognition result that the target human body performs a dangerous action among the candidate dangerous actions and the time for which the dangerous action has been performed exceeds the preset duration, determining the danger level of the dangerous action and the duration of the dangerous action.
After the candidate dangerous action is determined, the identified action of the target human body is compared with the candidate dangerous action, whether the identified action of the target human body is matched with the candidate dangerous action or not is determined, and if the action of the target human body is matched with the candidate dangerous action, the driver is indicated to execute the dangerous action.
When the identified action matches a dangerous action in the candidate dangerous actions, the danger level of the dangerous action may be determined first, and a higher danger level indicates a higher degree of danger for performing the action. The length of time for performing the hazardous action is then determined, and may specifically be determined by multiplying the number of image frames in which the hazardous action occurs by the time interval between two image frames. If the duration of continuously executing the dangerous action exceeds the preset duration, the alarm operation can be executed.
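As an illustrative sketch (not part of the original disclosure), the duration calculation and the preset-duration check described above can be expressed as follows; the function name, frame interval, and threshold are assumptions.

```python
def dangerous_action_duration(num_frames_with_action: int, frame_interval_s: float) -> float:
    """Duration of a dangerous action, estimated as the number of image frames in
    which the action is detected multiplied by the interval between adjacent frames."""
    return num_frames_with_action * frame_interval_s

# Example: the action is detected in 45 consecutive frames sampled every 0.2 s,
# and the preset duration threshold is assumed to be 5 s.
duration_s = dangerous_action_duration(45, 0.2)   # 9.0 s
should_alarm = duration_s > 5.0                   # True -> continue to the alarm logic
print(duration_s, should_alarm)
```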
And step 110, determining an alarm level according to the danger level and the duration.
Specifically, the alarm level may be determined based on the hazard level and the duration. Generally, the higher the danger level and the longer the duration, the higher the corresponding alarm level; the lower the risk level and the shorter the duration, the lower the corresponding alarm level. In the present embodiment, the specific correspondence relationship is not limited.
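One possible correspondence, sketched below under assumed thresholds (the embodiment deliberately leaves the specific mapping open), makes the alarm level grow with both the danger level and the duration.

```python
def alarm_level(danger_level: int, duration_s: float) -> int:
    """Map the danger level and the duration to an alarm level in {1, 2, 3}.
    Higher danger level and longer duration give a higher alarm level; the
    weighting and thresholds below are illustrative assumptions."""
    score = danger_level + int(duration_s // 5)   # one extra point per 5 s of duration
    if score >= 6:
        return 3
    if score >= 4:
        return 2
    return 1

print(alarm_level(danger_level=2, duration_s=12.0))   # -> 2
```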
And 112, determining alarm information according to the alarm grade, and outputting the alarm information through the alarm equipment of the automobile.
And determining alarm information according to the determined alarm grade, and outputting the alarm information through the alarm equipment of the automobile to prompt a driver to stop dangerous actions and reduce driving risks.
The artificial intelligence intelligent driving method provided by the embodiment can identify the action of the driver according to the image acquired by the camera, determine the dangerous action of the current driver according to the current driving road condition data and the identified action, and output corresponding alarm information according to the execution duration and the danger level of the dangerous action so as to prompt the dangerous behavior of the user, thereby reducing the driving risk.
In one embodiment, acquiring image frames collected by a camera installed in an automobile during the process of detecting the driving of the automobile comprises: continuously acquiring running speed data of the automobile after the automobile is started; when the duration that the driving speed data is continuously greater than the first speed threshold exceeds a first duration threshold, starting to acquire image frames acquired by a camera installed in an automobile; and when the duration of the running speed data which is continuously less than the second speed threshold exceeds the second duration threshold, stopping acquiring the image frames acquired by the camera arranged in the automobile.
Specifically, after the automobile is started, the running speed data of the automobile can be continuously acquired, for example, 80 km/h (kilometers per hour). The state of the automobile is judged according to the running speed data: if the running speed continuously exceeds a certain value, the automobile is considered to have gradually reached a stable running state after starting, so image acquisition and recognition of the driver's actions can begin. If the running speed is continuously lower than a certain value, the automobile is considered to be gradually coming to a stop, so image acquisition can be stopped and recognition of the driver's actions ended.
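A minimal sketch of this start/stop behavior as a small state machine; the speed thresholds (v1, v2) and duration thresholds (t1, t2) are hypothetical values, not taken from the embodiment.

```python
class CaptureController:
    """Start capturing when the speed stays above v1 for at least t1 seconds;
    stop capturing when it stays below v2 for at least t2 seconds."""

    def __init__(self, v1: float = 30.0, t1: float = 10.0,
                 v2: float = 5.0, t2: float = 30.0):
        self.v1, self.t1, self.v2, self.t2 = v1, t1, v2, t2
        self.capturing = False
        self.above_since = None   # moment the speed last rose above v1
        self.below_since = None   # moment the speed last dropped below v2

    def update(self, t: float, speed_kmh: float) -> bool:
        if not self.capturing:
            if speed_kmh > self.v1:
                self.above_since = self.above_since if self.above_since is not None else t
                if t - self.above_since >= self.t1:
                    self.capturing, self.below_since = True, None
            else:
                self.above_since = None
        else:
            if speed_kmh < self.v2:
                self.below_since = self.below_since if self.below_since is not None else t
                if t - self.below_since >= self.t2:
                    self.capturing, self.above_since = False, None
            else:
                self.below_since = None
        return self.capturing

ctrl = CaptureController()
print(ctrl.update(0.0, 40.0), ctrl.update(12.0, 40.0))   # False, then True
```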
In one embodiment, obtaining road condition data of a road on which an automobile is currently driving, and determining candidate dangerous actions according to the road condition data comprises: acquiring road condition data of a road surface on which an automobile currently runs and driving data of the automobile; determining a first candidate dangerous action according to the road condition data, and determining a second candidate dangerous action according to the driving data; and obtaining candidate dangerous actions according to the first candidate dangerous actions and the second candidate dangerous actions.
Specifically, road condition data of the road surface and driving data of the vehicle may also be obtained, and the driving data of the vehicle may specifically represent data such as driving speed, driving duration, driving distance, and the like of the vehicle, which is not limited herein. And then obtaining a first candidate dangerous action according to the road condition data, obtaining a second candidate dangerous action according to the driving data, and then obtaining a final result.
In one embodiment, determining a first candidate risky action based on the traffic data comprises: acquiring the road type, the congestion condition and pedestrian data of a road surface on which the automobile currently runs, and determining a road condition danger coefficient according to the road type, the congestion condition and the pedestrian data; determining a first candidate dangerous action according to the road condition danger coefficient; determining a second candidate hazardous action from the driving data, comprising: acquiring the continuous running time length and the average running speed of the automobile, and determining a driving risk coefficient according to the continuous running time length and the average running speed; and determining a second candidate dangerous action according to the driving danger coefficient.
Further, the road condition data may include a road type, a congestion condition and pedestrian data, and then the road condition risk factor of the currently driving road surface is determined according to the road type, the congestion condition and the pedestrian data. The driving data includes a continuous driving time period and an average driving speed of the vehicle, and the driving risk coefficient is determined according to the continuous driving time period and the average driving speed of the vehicle. And finally, obtaining the final candidate dangerous action according to the road condition danger coefficient and the driving danger coefficient. The higher the risk factor, the more dangerous actions are ultimately determined.
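A hedged sketch of how the two coefficients might be combined to select candidate dangerous actions; the coefficient formulas, weights, action names, and thresholds are assumptions, since the embodiment does not fix them.

```python
def road_risk_coefficient(road_type: str, congestion: float, pedestrians: int) -> float:
    """Illustrative road-condition risk coefficient from road type, congestion (0-1),
    and pedestrian count (assumed weights)."""
    base = {"expressway": 0.8, "urban": 0.5, "rural": 0.4}.get(road_type, 0.5)
    return base + 0.3 * congestion + 0.02 * min(pedestrians, 20)

def driving_risk_coefficient(continuous_hours: float, avg_speed_kmh: float) -> float:
    """Illustrative driving risk coefficient from continuous driving time and average speed."""
    return 0.2 * continuous_hours + 0.01 * avg_speed_kmh

def candidate_dangerous_actions(road_coeff: float, drive_coeff: float) -> set:
    """The higher the coefficients, the more action categories are monitored."""
    actions = {"hands off steering wheel"}                      # always monitored
    if road_coeff > 0.8 or drive_coeff > 0.8:
        actions |= {"making a call", "smoking"}
    if road_coeff > 1.2 and drive_coeff > 1.0:
        actions |= {"looking away from the road", "eating"}
    return actions

road = road_risk_coefficient("expressway", congestion=0.6, pedestrians=2)
drive = driving_risk_coefficient(continuous_hours=3.0, avg_speed_kmh=100.0)
print(candidate_dangerous_actions(road, drive))
```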
In one embodiment, determining the alert level based on the hazard level and the duration includes: when the danger level exceeds a preset level and the duration exceeds a preset duration, determining automobile control parameters according to the danger level and the duration, and controlling the driving performance of the automobile according to the automobile control parameters; and when the danger level does not exceed the preset level or the duration does not exceed the preset duration, determining the alarm level according to the danger level and the duration.
It is understood that when the danger level and the duration exceed certain values, the vehicle parameters can be appropriately controlled according to the danger level and the duration, thereby controlling the drivability of the vehicle, and thus forcibly reducing the danger level of driving.
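The branch between forcibly adjusting the vehicle and merely alarming could be sketched as below; the control parameter (a speed cap), the preset level, and the preset duration are hypothetical.

```python
def handle_dangerous_action(danger_level: int, duration_s: float,
                            preset_level: int = 3, preset_duration_s: float = 10.0) -> dict:
    """If both the danger level and the duration exceed their presets, derive a
    car-control parameter; otherwise fall back to the alarm path."""
    if danger_level > preset_level and duration_s > preset_duration_s:
        # Illustrative control parameter: cap the speed more aggressively the higher
        # the danger level and the longer the duration (values assumed).
        speed_cap_kmh = max(30.0, 120.0 - 10.0 * danger_level - duration_s)
        return {"action": "control", "speed_cap_kmh": speed_cap_kmh}
    # Otherwise determine an alarm level (a trivial placeholder mapping here).
    return {"action": "alarm", "alarm_level": min(3, 1 + danger_level // 2)}

print(handle_dangerous_action(4, 20.0))   # {'action': 'control', 'speed_cap_kmh': 60.0}
print(handle_dangerous_action(2, 20.0))   # {'action': 'alarm', 'alarm_level': 2}
```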
In one embodiment, the motion recognition of the target human body in the image frames acquired by the camera to obtain the motion recognition result of the target human body includes: extracting spatial interactive characteristics through a spatial flow convolution neural network aiming at an image collected by a camera, and extracting global spatial discriminative characteristics by utilizing a bidirectional LSTM; extracting time interactive characteristics through a time flow convolutional neural network, extracting global time characteristics from the time interactive characteristics through a three-dimensional convolutional neural network, and constructing a time attention model guided by an optical flow to calculate global time discriminative characteristics according to the global time characteristics; performing classification processing according to the global time discriminative features to obtain a first classification result, and performing classification processing according to the global space discriminative features to obtain a second classification result; and fusing the first classification result and the second classification result to obtain a fusion classification result, and obtaining an action recognition result of the target human body according to the fusion classification result.
Specifically, the action recognition process mainly obtains motion characteristics of the human body's movement from the temporal features and spatial features of consecutive images. The final action recognition result is then obtained by combining the recognition result based on the temporal features with the recognition result based on the spatial features. The action recognition result obtained in this way integrates the temporal and spatial characteristics of the human body's movement.
Specifically, the method for extracting the spatial interactivity features through the spatial stream convolutional neural network comprises the following steps:
inputting image frames into a behavior significance detection network model to obtain a detection result, and obtaining a spatial interactivity characteristic according to the detection result;
constructing a mask-guided spatial attention model according to the image frame and the spatial interactive characteristics to obtain spatial discriminative characteristics;
determining a spatial interactivity characteristic according to the temporal attention weight and the spatial discriminative characteristic;
extracting time interactive features through a time flow convolutional neural network, extracting global time features from the time interactive features through a three-dimensional convolutional neural network, and constructing a time attention model guided by optical flow to calculate global time discriminative features according to the global time features, wherein the method comprises the following steps:
performing optical flow calculation on the shot image through a TVNet network to obtain an optical flow frame;
weighting the obtained optical flow frame according to the spatial attention weight to obtain a time interactive characteristic;
extracting global time characteristics from the time interactive characteristics through a three-dimensional convolution neural network;
inputting the global time characteristic into a time attention model guided by optical flow to obtain a time attention weight, and weighting the global time characteristic through the time attention weight to obtain a global time discriminant characteristic;
the method for fusing the first classification result and the second classification result is as follows:
S_r = ((1 + C_1^2) / (1 + C_2^2)) * S_1 + (1 - (1 + C_1^2) / (1 + C_2^2)) * S_2
where S_1 denotes the first classification result, S_2 denotes the second classification result, S_r denotes the fusion classification result, and C_1 and C_2 denote variables defined during the fusion, with C_1 less than or equal to C_2.
In the embodiment provided by the present application, a network structure for performing motion recognition on a target human body is shown in fig. 3, and the motion recognition method specifically may include the following steps:
1) Acquiring RGB captured images from a continuous image stream: obtain the original RGB captured images F_RGB = {f_1, f_2, …, f_N}, where N is the number of sampled frames and f_i represents the i-th frame.
2) Calculating the optical flow graph: apply the TVNet network to the RGB captured images F_RGB, computing pairwise between adjacent frames, to obtain the optical flow frames F_OPT = {o_1, o_2, …, o_(N-1)}, where o_i represents the i-th optical flow frame.
3) Training a behavior saliency detection network model based on the Mask R-CNN segmentation technique: with each original captured image in F_RGB as input, a detection image is generated; the output form is then modified to obtain the spatial interactivity features M_RGB = {m_1, m_2, …, m_N}.
4) With the original RGB captured images F_RGB and the spatial interactivity features M_RGB as input, construct a mask-guided spatial attention model, calculate the spatial attention weight W_S, and generate the spatially discriminative features K_RGB by attention weighting.
5) Weight the optical flow frames F_OPT with the spatial attention weight W_S calculated in step 4) to obtain the temporal interactivity features I_OPT.
6) With the temporal interactivity features I_OPT as input, use a three-dimensional convolutional neural network to extract the global temporal features G_OPT.
7) With the global temporal features G_OPT as input, construct an optical-flow-guided temporal attention model, calculate the temporal attention weight W_t, and generate the global temporally discriminative features GK_OPT by attention weighting.
8) Weight the spatially discriminative features K_RGB with the temporal attention weight W_t calculated in step 7) to obtain the spatial interactivity features I_RGB.
9) With the spatial interactivity features I_RGB as input, further extract the global spatially discriminative features GK_RGB based on a bidirectional long short-term memory network, and then calculate the first classification result, namely the spatial probability score S_1, through a fully connected layer and Softmax classification.
10) With the global temporally discriminative features GK_OPT as input, calculate the second classification result, namely the temporal probability score S_2, through a fully connected layer and Softmax classification.
11) Fuse the spatial probability score S_1 and the temporal probability score S_2 to generate the final predicted result score S_r.
Step 3) of the above process modifies the output form of the detection images to compute the local mask feature maps m_i; that is, only the detected discriminative region is retained, and the pixel values of the remaining image area are set to 0. The calculation process is represented as (formula 1):
m_i(p, q) = the pixel value of the detection image at (p, q), if (p, q) lies within the detected discriminative region; otherwise m_i(p, q) = 0   (formula 1)
where (p, q) represents the position of a pixel point. For example, the data sets each contain different objects and human bodies; the foreground and background of each detection image are separated by computing the local mask feature map.
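A small numpy sketch of the masking in (formula 1), assuming the detection network yields a binary mask of the discriminative region; the array names and sizes are illustrative.

```python
import numpy as np

def local_mask_feature_map(image: np.ndarray, region_mask: np.ndarray) -> np.ndarray:
    """Keep the pixel values inside the detected discriminative region and set the
    remaining image area to 0, as described by (formula 1)."""
    assert image.shape[:2] == region_mask.shape
    return image * region_mask[..., None]   # broadcast the H x W mask over the channels

frame = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
mask = np.zeros((224, 224), dtype=np.uint8)
mask[80:160, 60:180] = 1                     # hypothetical detected region
m_i = local_mask_feature_map(frame, mask)
print(m_i[0, 0].tolist(), m_i[100, 100].tolist())   # background zeroed, region kept
```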
In the above process, with the RGB image frames F_RGB and the spatial interactivity features M_RGB as input, a mask-guided spatial attention model is constructed. Each spatial interactivity feature m_i is passed through an L-Net network, and each RGB image frame f_i is passed through a G-Net network. L-Net and G-Net have the same network structure, but their parameters are not shared. The two networks each generate a feature map, denoted F_L and F_G respectively. The processing of L-Net and G-Net can be expressed as (formula 2) to (formula 5):
I_i = Inc(m_i)   (formula 2)
F_L = GAP(I_i)   (formula 3)
G_i = Inc(f_i)   (formula 4)
F_G = GAP(G_i)   (formula 5)
where F_L and F_G represent the local feature and the global feature respectively; Inc denotes the Inception v3 network; GAP denotes global average pooling, which for a feature of dimension W × H × C produces an output of dimension 1 × 1 × C, i.e., the global information of each feature channel. The two features are then concatenated along the channel dimension to form F, as shown in (formula 6), which yields a richer feature representation:
F = [F_L, F_G]   (formula 6)
where [·, ·] denotes channel-wise concatenation.
Taking F as an input, constructing a spatial attention model to re-weight F to obtain a weighted feature map, wherein the weighting process can be described by the following formula:
W_S1 = γ(FC_S1(GAP(F)))   (formula 7)
W_S = σ(FC_S2(W_S1))   (formula 8)
K_RGB = F ⊙ W_S   (formula 9)
where γ denotes the ReLU activation function, σ denotes the Sigmoid activation function, FC_S1 and FC_S2 denote two fully connected layers, GAP denotes global average pooling, and ⊙ denotes channel-level multiplication. After the GAP and the first fully connected layer, W_S1 is a channel-descriptor vector, and the final weight W_S has a size matching the channel dimension of F, so that the channel-level multiplication in (formula 9) is well defined. The spatial attention weight W_S is multiplied with the original feature F to selectively highlight valid features and weaken invalid ones.
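A compact PyTorch sketch of the mask-guided spatial attention of (formula 2) to (formula 9). The small convolutional backbones stand in for the Inception v3 L-Net/G-Net, and the channel counts and reduction ratio are assumptions.

```python
import torch
import torch.nn as nn

class MaskGuidedSpatialAttention(nn.Module):
    """L-Net/G-Net feature extraction, GAP, channel concatenation, and a
    two-FC-layer attention that re-weights the concatenated feature (formulas 2-9)."""

    def __init__(self, channels: int = 64, reduction: int = 4):
        super().__init__()
        def backbone():   # stand-in for Inception v3 (assumption)
            return nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
        self.l_net = backbone()               # processes the masked frames m_i
        self.g_net = backbone()               # processes the RGB frames f_i (no parameter sharing)
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.fc1 = nn.Linear(2 * channels, 2 * channels // reduction)
        self.fc2 = nn.Linear(2 * channels // reduction, 2 * channels)

    def forward(self, f_rgb: torch.Tensor, m_rgb: torch.Tensor) -> torch.Tensor:
        f_l = self.gap(self.l_net(m_rgb)).flatten(1)               # F_L (formulas 2-3)
        f_g = self.gap(self.g_net(f_rgb)).flatten(1)               # F_G (formulas 4-5)
        feat = torch.cat([f_l, f_g], dim=1)                        # F (formula 6)
        w_s = torch.sigmoid(self.fc2(torch.relu(self.fc1(feat))))  # W_S (formulas 7-8)
        return feat * w_s                                          # K_RGB (formula 9)

frames = torch.randn(2, 3, 112, 112)     # RGB frames f_i
masked = torch.randn(2, 3, 112, 112)     # masked frames / spatial interactivity features m_i
print(MaskGuidedSpatialAttention()(frames, masked).shape)   # torch.Size([2, 128])
```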
Step 7) of the above process takes the global temporal feature G_OPT as input and constructs an optical-flow-guided temporal attention model. The calculation of the temporal attention weight is converted into a calculation of channel attention. The dimensions of the feature map are changed and global average pooling is performed, compressing all information into channel descriptors whose statistics can represent the entire video. This global average pooling can be expressed as (formula 10):
F_g'(c) = (1 / (W × H)) * Σ_{i=1}^{W} Σ_{j=1}^{H} G_OPT(i, j, c),  c = 1, …, o   (formula 10)
where W and H represent the width and height respectively, and o represents the number of channels. The compressed feature map is then fed into a network consisting of two fully connected layers so as to obtain the interdependencies along time. The size of the second fully connected layer is consistent with the channel number o of the input feature map, and the newly learned weight performs a channel-level multiplication with the original feature G_OPT:
W_t1 = γ(FC_t1(F_g'))   (formula 11)
W_t = σ(FC_t2(W_t1))   (formula 12)
GK_OPT = G_OPT ⊙ W_t   (formula 13)
where W_t represents the temporal attention weight, γ denotes the ReLU activation function, σ denotes the Sigmoid activation function, and FC_t1 and FC_t2 denote two fully connected layers.
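Likewise, the optical-flow-guided temporal attention of (formula 10) to (formula 13) can be sketched in PyTorch as channel attention over the global temporal feature; the channel count and reduction ratio are assumptions.

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Per-channel pooling of G_OPT into channel descriptors, two fully connected
    layers, and channel-level re-weighting (formulas 10-13)."""

    def __init__(self, channels: int = 64, reduction: int = 4):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)
        self.fc2 = nn.Linear(channels // reduction, channels)   # size matches the channel number

    def forward(self, g_opt: torch.Tensor) -> torch.Tensor:
        # g_opt: (batch, channels, T, H, W), the global temporal feature from the 3D CNN
        f_g = g_opt.mean(dim=(2, 3, 4))                           # channel descriptors (formula 10)
        w_t = torch.sigmoid(self.fc2(torch.relu(self.fc1(f_g))))  # W_t (formulas 11-12)
        return g_opt * w_t[:, :, None, None, None]                # GK_OPT (formula 13)

g_opt = torch.randn(2, 64, 8, 14, 14)    # hypothetical G_OPT
print(TemporalAttention()(g_opt).shape)  # torch.Size([2, 64, 8, 14, 14])
```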
In step 11) of the above flow, the method for fusing the first classification result and the second classification result is as follows:
S_r = ((1 + C_1^2) / (1 + C_2^2)) * S_1 + (1 - (1 + C_1^2) / (1 + C_2^2)) * S_2   (formula 14)
where S_1 denotes the first classification result, S_2 denotes the second classification result, S_r denotes the fusion classification result, and C_1 and C_2 denote variables defined during the fusion, with C_1 less than or equal to C_2. C_1 and C_2 may be set empirically or preset in advance, which is not limited herein.
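For completeness, (formula 14) amounts to a convex combination of the two scores with weight (1 + C_1^2)/(1 + C_2^2); a minimal sketch, with hypothetical per-class scores and empirically chosen C_1, C_2:

```python
def fuse_scores(s1, s2, c1: float, c2: float):
    """Fuse the two classification results according to (formula 14); requires c1 <= c2."""
    assert c1 <= c2
    w = (1 + c1 ** 2) / (1 + c2 ** 2)
    return [w * a + (1 - w) * b for a, b in zip(s1, s2)]

s_1 = [0.10, 0.70, 0.20]                       # e.g. the spatial probability score
s_2 = [0.05, 0.60, 0.35]                       # e.g. the temporal probability score
print(fuse_scores(s_1, s_2, c1=1.0, c2=2.0))   # approximately [0.07, 0.64, 0.29]
```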
Fig. 2 is a schematic structural diagram of an intelligent driving device based on artificial intelligence in one embodiment. As shown in fig. 2, the intelligent driving apparatus based on artificial intelligence includes:
the image acquisition module 302 is configured to acquire an image frame acquired by a camera mounted in an automobile when the automobile is detected to be running;
the action recognition module 304 is configured to perform action recognition on a target human body in the image frame acquired by the camera to obtain an action recognition result of the target human body;
the danger identification module 306 is configured to obtain road condition data of a road on which the automobile currently runs, and determine candidate dangerous actions according to the road condition data;
a duration obtaining module 308, configured to determine the danger level of the dangerous action and the duration of the dangerous action when it is determined according to the action recognition result that the target human body performs a dangerous action among the candidate dangerous actions and the time for which the dangerous action has been performed exceeds a preset duration;
a grade obtaining module 310, configured to determine an alarm grade according to the risk grade and the duration;
and an information output module 312, configured to determine alarm information according to the alarm level, and output the alarm information through an alarm device of the automobile.
The artificial intelligence intelligent driving device provided by the embodiment can identify the action of the driver according to the image collected by the camera, determine the dangerous action of the current driver according to the current driving road condition data and the identified action, and output corresponding alarm information according to the execution duration and the danger level of the dangerous action so as to prompt the dangerous action of the user, thereby reducing the driving risk.
FIG. 4 is a diagram illustrating the hardware components of an artificial intelligence based intelligent driving system in one embodiment. It will be appreciated that fig. 4 only shows a simplified design of the electronic device. In practical applications, the electronic device may further include other necessary components, including but not limited to any number of input/output systems, processors, controllers, memories, etc., and all electronic devices that can implement the intelligent driving method based on artificial intelligence according to the embodiments of the present application are within the scope of the present application.
The memory includes, but is not limited to, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a portable read-only memory (CD-ROM), which is used for storing instructions and data.
The input system is for inputting data and/or signals and the output system is for outputting data and/or signals. The output system and the input system may be separate devices or may be an integral device.
The processor may include one or more processors, for example, one or more Central Processing Units (CPUs), and in the case of one CPU, the CPU may be a single-core CPU or a multi-core CPU. The processor may also include one or more special purpose processors, which may include GPUs, FPGAs, etc., for accelerated processing.
The memory is used to store program codes and data of the network device.
The processor is used for calling the program codes and data in the memory and executing the steps in the method embodiment. Specifically, reference may be made to the description of the method embodiment, which is not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed system and method may be implemented in other ways. For example, the division of the unit is only one logical function division, and other division may be implemented in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. The shown or discussed mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, systems or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the present application are wholly or partially generated when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable system. The computer instructions may be stored on or transmitted over a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)), or wirelessly (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that includes one or more of the available media. The usable medium may be a read-only memory (ROM), or a Random Access Memory (RAM), or a magnetic medium, such as a floppy disk, a hard disk, a magnetic tape, a magnetic disk, or an optical medium, such as a Digital Versatile Disk (DVD), or a semiconductor medium, such as a Solid State Disk (SSD).
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. An artificial intelligence based intelligent driving method, characterized in that the method comprises:
the method comprises the steps of acquiring image frames collected by a camera installed in an automobile when the automobile is detected to run;
performing action recognition on a target human body in an image frame acquired by the camera to obtain an action recognition result of the target human body;
acquiring road condition data of a road surface on which the automobile currently runs, and determining candidate dangerous actions according to the road condition data;
when it is determined according to the action recognition result that the target human body performs a dangerous action among the candidate dangerous actions, and the time for which the dangerous action has been performed exceeds a preset duration, determining the danger level of the dangerous action and the duration of the dangerous action;
determining an alarm level according to the danger level and the duration;
determining alarm information according to the alarm grade, and outputting the alarm information through alarm equipment of the automobile;
the motion recognition of the target human body in the image frame acquired by the camera to obtain the motion recognition result of the target human body includes:
extracting spatial interactive characteristics through a spatial flow convolution neural network aiming at the image frames collected by the camera, and extracting global spatial discriminative characteristics by utilizing a bidirectional LSTM;
extracting time interactive features through a time flow convolutional neural network, extracting global time features from the time interactive features through a three-dimensional convolutional neural network, and constructing a time attention model guided by optical flow to calculate global time discriminative features according to the global time features;
performing classification processing according to the global time discriminative feature to obtain a first classification result, and performing classification processing according to the global space discriminative feature to obtain a second classification result;
and fusing the first classification result and the second classification result to obtain a fusion classification result, and obtaining an action recognition result of the target human body according to the fusion classification result.
2. The method according to claim 1, wherein the acquiring image frames collected by a camera installed in the automobile during the detection of the driving of the automobile comprises:
continuously acquiring running speed data of the automobile after the automobile is started;
when the duration that the driving speed data is continuously greater than the first speed threshold exceeds a first duration threshold, starting to acquire image frames acquired by a camera installed in an automobile;
and when the duration of the running speed data which is continuously less than the second speed threshold exceeds the second duration threshold, stopping acquiring the image frames acquired by the camera arranged in the automobile.
3. The method of claim 1, wherein the obtaining road condition data of a road surface on which the vehicle is currently traveling and determining candidate dangerous actions according to the road condition data comprises:
acquiring road condition data of a road surface on which the automobile currently runs and driving data of the automobile;
determining a first candidate dangerous action according to the road condition data, and determining a second candidate dangerous action according to the driving data;
and obtaining the candidate dangerous action according to the first candidate dangerous action and the second candidate dangerous action.
4. The method of claim 3, wherein determining a first candidate risky action based on the traffic data comprises:
acquiring the road type, the congestion condition and pedestrian data of the road surface on which the automobile currently runs, and determining a road condition danger coefficient according to the road type, the congestion condition and the pedestrian data;
determining a first candidate dangerous action according to the road condition danger coefficient;
the determining a second candidate dangerous action according to the driving data comprises:
acquiring the continuous running time length and the average running speed of the automobile, and determining a driving risk coefficient according to the continuous running time length and the average running speed;
and determining a second candidate dangerous action according to the driving danger coefficient.
5. The method of claim 1, wherein said determining an alarm level based on said hazard level and said duration comprises:
when the danger level exceeds a preset level and the duration exceeds a preset duration, determining an automobile control parameter according to the danger level and the duration, and controlling the driving performance of the automobile according to the automobile control parameter;
and when the danger level does not exceed a preset level or the duration does not exceed a preset duration, determining an alarm level according to the danger level and the duration.
6. The method of claim 1, wherein the extracting spatial interactivity features through a spatial stream convolutional neural network comprises:
inputting the image frame into a behavior significance detection network model to obtain a detection result, and obtaining a spatial interactivity characteristic according to the detection result;
constructing a mask-guided spatial attention model according to the image frame and the spatial interactive characteristics to obtain spatial discriminative characteristics;
determining a spatial interactivity characteristic according to the temporal attention weight and the spatial discriminative characteristic;
the method comprises the steps of extracting time interactive features through a time flow convolution neural network, extracting global time features from the time interactive features through a three-dimensional convolution neural network, and constructing a time attention model guided by an optical flow to calculate global time discriminative features according to the global time features, and comprises the following steps:
performing optical flow calculation on the shot image through a TVNet network to obtain an optical flow frame;
weighting the obtained optical flow frame according to the spatial attention weight to obtain the time interactive feature;
extracting global time characteristics from the time interactive characteristics through a three-dimensional convolutional neural network;
inputting the global time characteristic into a time attention model guided by optical flow to obtain a time attention weight, and weighting the global time characteristic through the time attention weight to obtain a global time discriminative characteristic;
the method for fusing the first classification result and the second classification result comprises the following steps:
S_r = ((1 + C_1^2) / (1 + C_2^2)) * S_1 + (1 - (1 + C_1^2) / (1 + C_2^2)) * S_2
where S_1 represents the first classification result, S_2 represents the second classification result, S_r represents the fusion classification result, and C_1 and C_2 represent variables defined during the fusion, with C_1 less than or equal to C_2.
7. An intelligent driving device based on artificial intelligence, the device comprising:
the system comprises an image acquisition module, a data acquisition module and a data processing module, wherein the image acquisition module is used for acquiring image frames acquired by a camera arranged in an automobile in the process of detecting that the automobile runs;
the action recognition module is used for carrying out action recognition on a target human body in the image frame acquired by the camera to obtain an action recognition result of the target human body;
the danger identification module is used for acquiring road condition data of a road surface on which the automobile runs currently and determining candidate dangerous actions according to the road condition data;
the duration obtaining module is used for determining the danger level of the dangerous action and the duration of the dangerous action when it is determined according to the action recognition result that the target human body performs a dangerous action among the candidate dangerous actions and the time for which the dangerous action has been performed exceeds a preset duration;
the grade acquisition module is used for determining an alarm grade according to the danger grade and the duration;
the information output module is used for determining alarm information according to the alarm grade and outputting the alarm information through the alarm equipment of the automobile;
the action recognition module is used for carrying out action recognition on a target human body in an image frame acquired by the camera to obtain an action recognition result of the target human body, and comprises: extracting spatial interactive characteristics through a spatial flow convolution neural network aiming at the image frames collected by the camera, and extracting global spatial discriminative characteristics by utilizing a bidirectional LSTM; extracting time interactive features through a time flow convolutional neural network, extracting global time features from the time interactive features through a three-dimensional convolutional neural network, and constructing a time attention model guided by optical flow to calculate global time discriminative features according to the global time features; performing classification processing according to the global time discriminative feature to obtain a first classification result, and performing classification processing according to the global space discriminative feature to obtain a second classification result; and fusing the first classification result and the second classification result to obtain a fusion classification result, and obtaining an action recognition result of the target human body according to the fusion classification result.
8. An electronic device comprising a memory having computer-executable instructions stored thereon and a processor that, when executing the computer-executable instructions on the memory, implements the method of any of claims 1-6.
9. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the method of any one of claims 1 to 6.
CN202111283225.9A 2021-03-23 2021-03-23 Intelligent driving method and device based on artificial intelligence and related products Withdrawn CN114005104A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111283225.9A CN114005104A (en) 2021-03-23 2021-03-23 Intelligent driving method and device based on artificial intelligence and related products

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111283225.9A CN114005104A (en) 2021-03-23 2021-03-23 Intelligent driving method and device based on artificial intelligence and related products
CN202110310081.5A CN113011347B (en) 2021-03-23 2021-03-23 Intelligent driving method and device based on artificial intelligence and related products

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202110310081.5A Division CN113011347B (en) 2021-03-23 2021-03-23 Intelligent driving method and device based on artificial intelligence and related products

Publications (1)

Publication Number Publication Date
CN114005104A true CN114005104A (en) 2022-02-01

Family

ID=76405646

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202111283225.9A Withdrawn CN114005104A (en) 2021-03-23 2021-03-23 Intelligent driving method and device based on artificial intelligence and related products
CN202110310081.5A Active CN113011347B (en) 2021-03-23 2021-03-23 Intelligent driving method and device based on artificial intelligence and related products

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202110310081.5A Active CN113011347B (en) 2021-03-23 2021-03-23 Intelligent driving method and device based on artificial intelligence and related products

Country Status (1)

Country Link
CN (2) CN114005104A (en)

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201330827A (en) * 2012-01-19 2013-08-01 Utechzone Co Ltd Attention detection device based on driver's reflex action and method thereof
CN204810370U (en) * 2015-05-25 2015-11-25 浙江商业职业技术学院 A device is answered to phone for car
CN105590466A (en) * 2016-03-14 2016-05-18 重庆邮电大学 Monitoring system and monitoring method for dangerous operation behaviors of driver on cloud platform
WO2019028798A1 (en) * 2017-08-10 2019-02-14 北京市商汤科技开发有限公司 Method and device for monitoring driving condition, and electronic device
CN108189783B (en) * 2017-12-29 2021-07-09 徐州重型机械有限公司 Vehicle running state monitoring method and device and vehicle
CN110143202A (en) * 2019-04-09 2019-08-20 南京交通职业技术学院 A kind of dangerous driving identification and method for early warning and system
CN111062240B (en) * 2019-10-16 2024-04-30 中国平安财产保险股份有限公司 Monitoring method and device for automobile driving safety, computer equipment and storage medium
CN110696834B (en) * 2019-11-20 2022-01-14 东风小康汽车有限公司重庆分公司 Driver state monitoring method, device and system and controller
CN111814637A (en) * 2020-06-29 2020-10-23 北京百度网讯科技有限公司 Dangerous driving behavior recognition method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113011347B (en) 2022-01-07
CN113011347A (en) 2021-06-22

Similar Documents

Publication Publication Date Title
CN111274881B (en) Driving safety monitoring method and device, computer equipment and storage medium
CN108725440B (en) Forward collision control method and apparatus, electronic device, program, and medium
CN109584507B (en) Driving behavior monitoring method, device, system, vehicle and storage medium
Omerustaoglu et al. Distracted driver detection by combining in-vehicle and image data using deep learning
CN111310562B (en) Vehicle driving risk management and control method based on artificial intelligence and related equipment thereof
CN110765807A (en) Driving behavior analysis method, driving behavior processing method, driving behavior analysis device, driving behavior processing device and storage medium
JP2011014037A (en) Risk prediction system
CN114494158A (en) Image processing method, lane line detection method and related equipment
US11250279B2 (en) Generative adversarial network models for small roadway object detection
JP2009096365A (en) Risk recognition system
JP5185554B2 (en) Online risk learning system
CN112330964B (en) Road condition information monitoring method and device
JP2011014038A (en) Online risk recognition system
CN114373189A (en) Behavior detection method and apparatus, terminal device and storage medium
CN113269111B (en) Video monitoring-based elevator abnormal behavior detection method and system
JP2011003076A (en) Risk recognition system
CN113011347B (en) Intelligent driving method and device based on artificial intelligence and related products
CN116461546A (en) Vehicle early warning method, device, storage medium and processor
CN116486334A (en) High-altitude parabolic monitoring method, system and device based on vehicle and storage medium
CN113283286A (en) Driver abnormal behavior detection method and device
CN114708498A (en) Image processing method, image processing apparatus, electronic device, and storage medium
JP2010267134A (en) Risk recognition system
CN117774992A (en) Driving intention recognition method, device and system
EP4194300A1 (en) Providing a prediction of a radius of a motorcycle turn
CN115384541A (en) Method and system for driving risk detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220201