CN109109863A - Smart device and control method and apparatus therefor

Info

Publication number: CN109109863A
Granted publication: CN109109863B
Application number: CN201810850160.3A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: data, model, control, smart device, task
Inventors: 袁庭球, 黄韬, 黄永兵, 刘兵
Current assignee: Huawei Technologies Co Ltd
Original assignee: Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd
Legal status: Granted; Active

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60W: CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00: Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit, e.g. by using mathematical models
    • B60W40/02: Estimation or calculation of non-directly measurable driving parameters related to ambient conditions
    • B60W40/10: Estimation or calculation of non-directly measurable driving parameters related to vehicle motion

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Mathematical Physics (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Feedback Control In General (AREA)

Abstract

The present application provides a smart device and a control method and apparatus therefor, belonging to the field of machine learning. In this method, after an execution instruction for a target task is received, detection data can be obtained, and the detection data and the target task are input into a perception model to obtain representative detection data associated with the target task. The target task and the representative detection data can then be input into a plan model to obtain target state data. Afterwards, the target state data and some or all of the representative detection data can be input into a control model to obtain a control parameter for controlling the smart device, and the smart device is controlled based on the control parameter. This solves the problems in the prior art that the smart-device control process depends heavily on training samples and that the training effect is unsatisfactory, and enables better control of the smart device.

Description

Smart device and control method and apparatus therefor
Technical field
The present application relates to the field of machine learning, and in particular to a smart device and a control method and apparatus therefor.
Background art
A smart device, also referred to as an intelligent agent (IA), is an autonomous entity. A smart device can perceive its surrounding environment through sensors and can perform operations through actuators. Common smart devices include robots, autonomous vehicles, and the like.
In the related art, a control model obtained through training based on a machine learning algorithm is typically provided in the control device of a smart device. The control model takes data collected by the sensors as input data and, after processing the input data, generates a control parameter for controlling an actuator, where the control parameter can be used to instruct the actuator to perform a corresponding operation. For example, for an autonomous vehicle, the control model can generate, according to road images collected by a camera, a control parameter for controlling at least one of the throttle, brake, and steering-wheel actuators.
However, the control effect of the control model in the related art depends on the number of sample data used during model training; when the number of samples is small, the control effect of the control model is poor.
Summary of the invention
Embodiments of the present invention provide a smart device and a control method and apparatus therefor, which can solve the problem in the related art that the control effect of the control model is poor. The technical solution is as follows:
In one aspect, a control method of a smart device is provided, and the method can be applied to a control device of the smart device. The method may include: after an execution instruction for a target task is received, obtaining detection data, where the detection data may include environmental data of the surroundings of the smart device and status data of the smart device. The control device may then input the detection data and the target task into a perception model to obtain representative detection data associated with the target task; the target task and the representative detection data may then be input into a plan model to obtain target state data; afterwards, the target state data and some or all of the representative detection data may be input into a control model to obtain a control parameter for controlling the smart device, and the smart device is controlled based on the control parameter to execute the target task. The control model is obtained by initialization based on control theory data.
In the control method provided by the present application, the control model is obtained by initialization based on control theory data; the control model therefore depends less on training samples during training, and the training effect is better. A control method based on this control model accordingly achieves a better control effect when controlling the smart device.
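The three-stage flow described above can be illustrated by the following sketch (hypothetical Python; the class interfaces, method names, and data fields are illustrative assumptions rather than part of the disclosed method):

```python
# Minimal sketch of the perception -> planning -> control flow (illustrative only).
def execute_target_task(task, sensors, perception_model, plan_model, control_model, actuators):
    # 1. Detection data: environmental data of the surroundings plus the device's own status data.
    detection_data = {
        "environment": sensors.read_environment(),  # e.g. lane curvature, obstacles
        "status": sensors.read_status(),            # e.g. speed, steering angle
    }
    # 2. Perception model: keep only the representative detection data for this task.
    representative = perception_model.extract(detection_data, task)
    # 3. Plan model: determine the target state the device should reach.
    target_state = plan_model.plan(task, representative)
    # 4. Control model: map the target state (+ some or all representative data) to a control parameter.
    control_parameter = control_model.compute(target_state, representative)
    actuators.apply(control_parameter)
    return control_parameter
```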
Optionally, the perception model may be obtained through training in a deep learning manner, the plan model may be obtained through training in a reinforcement learning manner, and the control model may be obtained through training in a reinforcement learning manner.
Of course, the perception model may also be obtained through training in a reinforcement learning or deep reinforcement learning manner, and the plan model and the control model may also be obtained through training in a deep learning or deep reinforcement learning manner.
Optionally, before the execution instruction for the target task is received, the method may further include:
obtaining detection sample data and representative detection sample data associated with a specified task, where the detection sample data includes environmental sample data of the surroundings of the smart device when executing the specified task and status sample data of the smart device; and training on the detection sample data, the specified task, and the representative detection sample data in a deep learning manner to obtain the perception model.
During training in the deep learning manner, the detection sample data and the specified task can be input into an initial perception model, and the parameters of the initial perception model are then adjusted based on the difference between the representative detection data output by the perception model and the representative detection sample data, so as to obtain the perception model.
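A minimal sketch of this training step is given below (hypothetical PyTorch-style Python; the network interface, loss choice, and dataset layout are illustrative assumptions):

```python
import torch.nn as nn

def train_perception_model(model, optimizer, dataset, epochs=10):
    """Adjust the parameters of the initial perception model so that its output
    approaches the representative detection sample data for the specified task."""
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for detection_sample, task_encoding, representative_sample in dataset:
            predicted = model(detection_sample, task_encoding)   # input: detection data + task
            loss = loss_fn(predicted, representative_sample)     # difference to the sample labels
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                                     # parameter adjustment
    return model
```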
Optionally, before the execution instruction for the target task is received, the method further includes:
obtaining representative detection sample data and effect value sample data associated with the specified task; and training an initial plan model in a reinforcement learning manner using the representative detection sample data, the specified task, and the effect value sample data, so as to obtain the plan model.
During training in the reinforcement learning manner, the representative detection sample data and the specified task can be input into the initial plan model, and the parameters of the initial plan model are adjusted based on the effect value sample data, so as to obtain the plan model.
Optionally, before the execution instruction for the target task is received, the method may further include:
initializing an initial control model based on the control theory data; obtaining some or all of the representative detection sample data, target state sample data, and effect value sample data associated with the specified task; and training the initial control model in a reinforcement learning manner using the obtained representative detection sample data, target state sample data, and effect value sample data, so as to obtain the control model.
Since the control theory data can directly reflect and express the control laws and principles of the smart device, initializing the initial control model based on the control theory data can effectively reduce the number of samples required in subsequent model training, improve training efficiency, and reduce training cost.
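As a hedged illustration of initialization from control theory data, the sketch below seeds a small control network with classical PID gains before any reinforcement-learning fine-tuning; the use of a PID prior, the gain values, and the layer shape are assumptions made for illustration only:

```python
import torch
import torch.nn as nn

class InitialControlModel(nn.Module):
    """Maps [error, error integral, error derivative] to one control parameter,
    with weights seeded from PID gains instead of random values."""
    def __init__(self, kp=1.2, ki=0.05, kd=0.3):
        super().__init__()
        self.layer = nn.Linear(3, 1, bias=False)
        with torch.no_grad():
            # Control theory data (the PID gains) written directly into the weights.
            self.layer.weight.copy_(torch.tensor([[kp, ki, kd]]))

    def forward(self, error_features):
        return self.layer(error_features)

model = InitialControlModel()   # reinforcement learning can then fine-tune from this starting point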
Optionally, the control model may include a control submodel for calculating weights, and one or more calculation submodels for calculating the control parameter. Before the execution instruction for the target task is received, the method further includes:
obtaining some or all of the representative detection sample data, target state sample data, and effect value sample data associated with the specified task; training an initial control submodel in a reinforcement learning manner using the obtained representative detection sample data, target state sample data, and effect value sample data to obtain the control submodel; and determining each calculation submodel based on the control theory data.
Determining the calculation submodels for calculating the control parameter based on the control theory data effectively improves the training efficiency of the control model and reduces the training cost.
Optionally, the control model may include a control submodel for calculating weights, and one or more calculation submodels for calculating the control parameter;
and the process of inputting the target state data and some or all of the representative detection data into the control model to obtain the control parameter for controlling the smart device may include:
obtaining, from the target state data and some or all of the representative detection data, one group of target input data corresponding to each calculation submodel; inputting each group of target input data into the corresponding calculation submodel to obtain the value of the control parameter corresponding to that group of target input data; inputting the target state data and some or all of the representative detection data into the control submodel to obtain one group of weights; and determining the target value of the control parameter according to the group of weights and the values of the control parameter corresponding to the groups of target input data.
For example, the target value of the control parameter can be obtained by performing a weighted sum, using the group of weights, over the values of the control parameter corresponding to the groups of target input data.
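A minimal sketch of this weighted combination is given below (hypothetical Python; the submodel interfaces are illustrative assumptions):

```python
def compute_target_control_value(target_state, representative, calc_submodels, control_submodel):
    """Weighted sum of the values produced by the calculation submodels,
    with the weights produced by the control submodel."""
    # Each calculation submodel receives its own group of target input data.
    values = [sub.compute(sub.select_inputs(target_state, representative))
              for sub in calc_submodels]
    # The control submodel outputs one weight per calculation submodel.
    weights = control_submodel.compute_weights(target_state, representative)
    # The weights constrain the range of the finally determined target value.
    return sum(w * v for w, v in zip(weights, values))
```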
Since the target value of the control parameter is determined both by the weights output by the control submodel and by the control parameter values calculated by the calculation submodels, the weights constrain the target value of the control parameter, i.e., the weights constrain the range of the finally determined target value. This ensures the reasonableness of the control parameter output by the control model, and ensures safety and reliability when the smart device is controlled based on the control parameter.
Optionally, the method may further include:
after controlling the smart device based on the control parameter, obtaining new status data of the smart device; determining a control effect according to the new status data and the target task; and adjusting, according to the control effect, the parameters of one or more of the perception model, the plan model, and the control model.
During control of the smart device, the control effect is evaluated according to the new status data of the smart device, and the model parameters are adjusted based on the control effect, thereby realizing online adjustment and refinement of the models, so that the control effect on the smart device can be continuously improved.
Optionally, the process of determining the control effect according to the new status data and the target task may include:
inputting the new status data and the target task into an evaluation model to obtain the control effect of the control parameter. The evaluation model may store evaluation algorithms corresponding to different tasks. After obtaining the target task, the evaluation model can first select the evaluation algorithm corresponding to the target task and then process the new status data based on that evaluation algorithm to determine the control effect. Because the evaluation model uses different evaluation algorithms to evaluate the control effect of different tasks, the flexibility and reliability of control-effect evaluation are effectively improved.
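A minimal sketch of such a per-task evaluation model (hypothetical Python; the task names and scoring formulas are illustrative assumptions):

```python
class EvaluationModel:
    """Stores one evaluation algorithm per task and scores the new status data with it."""
    def __init__(self):
        self.algorithms = {
            # car following: penalize deviation from the desired following gap
            "car_following": lambda s: -abs(s["gap_m"] - s["desired_gap_m"]),
            # cabin temperature adjustment: penalize deviation from the set point
            "adjust_temperature": lambda s: -abs(s["cabin_temp_c"] - s["target_temp_c"]),
        }

    def control_effect(self, task, new_status):
        algorithm = self.algorithms[task]   # first choose the algorithm for the target task
        return algorithm(new_status)        # then evaluate the new status data with it
```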
Optionally, the smart device may be an autonomous vehicle or an intelligent robot.
In another aspect, a control apparatus of a smart device is provided. The apparatus may include at least one module, and the at least one module may be used to implement the control method of the smart device provided in the above aspect.
In another aspect, a control apparatus of a smart device is provided. The apparatus may include a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the control method of the smart device provided in the above aspect.
In another aspect, a computer-readable storage medium is provided, the computer-readable storage medium storing instructions which, when run on a computer, cause the computer to execute the control method of the smart device provided in the above aspect.
In another aspect, a smart device is provided. The smart device may include the control apparatus of the smart device provided in the above aspect.
In another aspect, a computer program product containing instructions is provided, which, when run on a computer, causes the computer to execute the control method of the smart device provided in the above aspect.
The beneficial effects brought by the technical solutions provided by the present application at least include the following:
The present application provides a smart device and a control method and apparatus therefor. In this solution, the obtained detection data and the target task can be input into a perception model to obtain representative detection data associated with the target task. The target task and the representative detection data can then be input into a plan model to obtain target state data. Afterwards, the target state data and the representative detection data can be input into a control model to obtain a control parameter for controlling the smart device. Finally, the smart device can be controlled based on the control parameter. Since the control model is obtained by initialization based on control theory data, which can directly express and reflect the control laws and principles of the smart device, this not only reduces the control model's dependence on training samples and improves training efficiency compared with the related art in which training samples are used directly for training, but also ensures the control effect on the smart device.
Brief description of the drawings
Fig. 1 is a schematic diagram of a smart device according to an embodiment of the present invention;
Fig. 2 is a flowchart of a control method of a smart device according to an embodiment of the present invention;
Fig. 3 is an architecture diagram of a control system according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a perception model according to an embodiment of the present invention;
Fig. 5 is a flowchart of a method by which a perception model obtains perception data associated with a target task according to an embodiment of the present invention;
Fig. 6 is an architecture diagram of a perception model according to an embodiment of the present invention;
Fig. 7 is an architecture diagram of a plan model according to an embodiment of the present invention;
Fig. 8 is an architecture diagram of a control model according to an embodiment of the present invention;
Fig. 9 is an architecture diagram of a control submodel according to an embodiment of the present invention;
Fig. 10 is a flowchart of a method for adjusting the parameters of each model in a control system according to an embodiment of the present invention;
Fig. 11 is an architecture diagram of an evaluation algorithm model according to an embodiment of the present invention;
Fig. 12 is a partial architecture diagram of a control system according to an embodiment of the present invention;
Fig. 13 is a flowchart of a training method of a perception model according to an embodiment of the present invention;
Fig. 14 is a flowchart of a training method of a plan model according to an embodiment of the present invention;
Fig. 15 is a flowchart of a training method of a control model according to an embodiment of the present invention;
Fig. 16 is a schematic structural diagram of a control apparatus of a smart device according to an embodiment of the present invention;
Fig. 17 is a schematic structural diagram of another control apparatus of a smart device according to an embodiment of the present invention;
Fig. 18 is a schematic structural diagram of another control apparatus of a smart device according to an embodiment of the present invention;
Fig. 19 is a schematic structural diagram of another control apparatus of a smart device according to an embodiment of the present invention.
Detailed description of embodiments
With the development and maturation of artificial intelligence technology, the development of industries related to smart devices such as intelligent robots and autonomous vehicles has been greatly promoted, and at the same time the requirements on the control effect of smart devices have become increasingly high. An embodiment of the present invention provides a control method of a smart device, and the method can be applied to a control device of the smart device. The control device may be arranged in the smart device, or may be arranged in control equipment that has established a communication connection with the smart device. The control equipment can communicate with each sensor and driving apparatus in the smart device, obtain the detection data collected by each sensor, and control the driving apparatus of the smart device according to the detection data. With reference to Fig. 1, the smart device 00 may be an autonomous vehicle, an intelligent robot, or the like. The control equipment may be a server, and the server may be a single server, a server cluster composed of several servers, or a cloud computing service center.
In this embodiment of the present invention, the smart device 00 may be provided with a plurality of sensors for detecting environmental data of the surroundings of the smart device, and a plurality of sensors for detecting status data of the smart device itself. Because different types of sensors have their respective advantages in terms of detection range, detection accuracy, and so on, a smart device is generally provided with multiple types of sensors whose functions are complementary. For example, the sensors for detecting environmental data may include a visual sensor, a lidar, an ultrasonic sensor, a millimeter-wave radar, and the like; the sensors for detecting status data may include a Global Positioning System (GPS) sensor, a speed sensor, a steering sensor, and the like. The control device of the smart device can obtain the detection data collected by the plurality of sensors (i.e., the environmental data and the status data), and can analyze and process the detection data according to the received target task (for example, driving straight, reversing, or automatically adjusting the temperature), so as to obtain the control parameter for controlling the smart device.
For the autonomous vehicle shown in Fig. 1, the sensors arranged on the autonomous vehicle for detecting environmental data may include lidars (arranged at the roof, the front of the vehicle, and the rear of the vehicle respectively), cameras (facing forward, backward, and sideways respectively), and millimeter-wave radars (arranged at the front and the rear of the vehicle respectively); the sensors for detecting status data may be arranged inside the autonomous vehicle and are not shown in the figure.
The sensors arranged on the autonomous vehicle can detect the environmental data of the surroundings and the status data of the autonomous vehicle itself. After the control device of the autonomous vehicle obtains the above detection data, it can generate a control parameter according to the target task issued by the user and the detection data. The control device can then send the control parameter to the transmission apparatus of the autonomous vehicle through the control bus of the autonomous vehicle, thereby controlling the speed and steering of the autonomous vehicle and ensuring that the autonomous vehicle travels safely and reliably on the road.
It should be noted that, for different types of smart devices, the types, numbers, and arrangement positions of the sensors arranged thereon can be adjusted according to the actual situation, which is not limited in this embodiment of the present invention.
Fig. 2 is a flowchart of a control method of a smart device according to an embodiment of the present invention, and the method can be applied to a control device of the smart device. With reference to Fig. 2, the method may include:
Step 101: after an execution instruction for a target task is received, obtain detection data.
In this embodiment of the present invention, when the user wishes to automatically control the smart device through the control device, the execution instruction can be triggered by a preset trigger operation. For example, a task icon may be displayed in the display interface of the smart device, and the trigger operation may be the user clicking the icon of the target task; after detecting the trigger operation, the smart device can generate the execution instruction for the target task. The target task may be a task such as autonomously driving to a specified destination, automatic car following, automatic reversing, or automatically adjusting the cabin temperature and humidity. Optionally, the trigger operation may also be a voice operation, a sliding operation, an operation of pressing a specified button, or the like, which is not limited in this embodiment of the present invention.
After receiving the execution instruction, the control device can obtain the detection data collected by the sensors of the smart device (such as cameras, lidars, millimeter-wave radars, and GPS sensors). The detection data may include environmental data of the surroundings of the smart device and status data of the smart device. The environmental data may include attribute data of different objects in the surroundings of the smart device, for example road data (such as the width and number of lanes of a road), obstacle data (such as the size, position, and movement speed of an obstacle), and indicator-light data (such as the color of an indicator light). The status data may include behavioral state data of the smart device, for example data such as movement speed and steering angle.
Optionally, the environmental data may also include temperature data, humidity data, air-pressure data, and the like, and the status data may also include data capable of reflecting the running state or running effect of the smart device, such as remaining battery level, remaining fuel, tire pressure, cabin temperature, and cabin humidity. The data types included in the environmental data and the status data can be flexibly adjusted according to the types of sensors arranged on the smart device, which is not limited in this embodiment of the present invention.
Optionally, the detection data obtained by the control device may be the raw data collected by the sensors, such as the point cloud data collected by a lidar, or may be data preliminarily processed by the sensors, such as data on the size and distance of an object that the lidar has derived from the point cloud data.
For example, assume the smart device is an autonomous vehicle and the control device of the autonomous vehicle has received an execution instruction for automatic car following; the control device can then obtain the detection data collected by the sensors of the autonomous vehicle. The environmental data in the detection data may include data such as the lane line curvature k1, the current vehicle speed v1, the type b, speed v2, and distance d2 from the ego vehicle of the obstacle ahead, the temperature t1, and the wind speed v3. The status data in the detection data may include data such as the speed v0 and the steering angle α of the autonomous vehicle.
Step 102: input the detection data and the target task into a perception model to obtain representative detection data associated with the target task.
In this embodiment of the present invention, a control system may be configured in the control device, and the control device can control the smart device through the control system. Fig. 3 is an architecture diagram of a control system according to an embodiment of the present invention. As shown in Fig. 3, the control system may include a perception model 01, a plan model 02, and a control model 03. The perception model 01 is used to obtain, from the detection data, representative detection data associated with the target task. The plan model 02 is used to determine the target state data of the smart device according to the representative detection data and the target task. The control model 03 is used to obtain the control parameter for controlling the smart device according to some or all of the representative detection data and the target state data.
After determining the target task and obtaining the detection data, the control device can first input the detection data and the target task into the perception model 01. The perception model 01 can obtain the representative detection data associated with the target task according to the input data.
Optionally, in order to improve the efficiency of data processing, the perception model 01 can first preprocess the input detection data, where the preprocessing may include at least one of extraction, classification, and fusion. For example, the perception model 01 can successively extract, classify, and fuse the input detection data. After the preprocessing, the perception model 01 can filter out useless data and obtain the attribute data of each object in the surroundings of the smart device and the attribute data of the smart device itself.
Fig. 4 is a schematic structural diagram of a perception model according to an embodiment of the present invention. As shown in Fig. 4, the perception model 01 may include a perception fusion submodel 011 and a feature extraction submodel 012. The perception fusion submodel 011 can be used to preprocess the detection data and send the preprocessed detection data to the feature extraction submodel 012. The feature extraction submodel 012 is used to obtain, from the preprocessed detection data, the representative detection data associated with the target task.
The detection data y_t output by the perception fusion submodel 011 at a moment t can be understood as being generated by fusing the large amount of detection data z_t output by the plurality of sensors at the moment t. Therefore, the detection data y_t output by the perception fusion submodel 011 at the moment t can be expressed as a probability P(y_t | z_t). In the process of preprocessing the detection data, the perception fusion submodel 011 can calculate the probabilities of outputting different y_t and select the y_t with the highest probability as the actual output.
For example, taking an autonomous vehicle as an example, the perception fusion submodel 011 can identify, according to the input environmental data, the stationary objects and dynamic objects in the surroundings of the autonomous vehicle. For a stationary object, the perception fusion submodel 011 can detect data such as its classification (i.e., type) and size; for a dynamic object, the perception fusion submodel 011 can detect data such as its speed and intention (i.e., predicted trajectory). Afterwards, the perception fusion submodel 011 can classify and fuse the raw data output by different sensors for the same object, so as to obtain the attribute data of the different objects and the attribute data of the vehicle itself. For example, the raw data output by a radar is point cloud data composed of many points, and the raw data output by a camera is image data; these raw data do not contain any semantic information about objects. The perception fusion submodel 011 can classify the raw data output by each sensor using an algorithm such as Kalman filtering, and, after fusing the raw data output by the plurality of sensors, generate attribute data (which may also be referred to as feature data) characterizing the features of the surroundings and the features of the autonomous vehicle itself. If each feature of the autonomous vehicle and its surroundings is represented by one-dimensional data, the dimension of the feature data can exceed 100.
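As a hedged illustration of such fusion, the sketch below combines two sensors' estimates of the same quantity (for example, the distance to the obstacle ahead) by inverse-variance weighting, the static special case of a Kalman-style update; the sensor values and variances are illustrative assumptions:

```python
def fuse_estimates(measurements):
    """Fuse several sensors' estimates of the same quantity.
    measurements: list of (value, variance) pairs, one per sensor."""
    weights = [1.0 / var for _, var in measurements]   # more precise sensor -> larger weight
    fused_value = sum(w * v for (v, _), w in zip(measurements, weights)) / sum(weights)
    fused_variance = 1.0 / sum(weights)                # the fused estimate is more certain
    return fused_value, fused_variance

# e.g. lidar reports 23.4 m (variance 0.04), millimeter-wave radar reports 23.9 m (variance 0.25)
distance, variance = fuse_estimates([(23.4, 0.04), (23.9, 0.25)])
```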
Further, the feature extraction submodel 012 can obtain, from the preprocessed detection data, the representative detection data associated with the target task, and the representative detection data may also be referred to as representation state data. The representative detection data serves as the key input of the plan model 02 and directly affects and determines the execution result of the plan model 02; therefore, the accuracy of the representative detection data selected by the perception model 01 directly affects the control effect of the control system.
Optionally, with reference to Fig. 3, the control system may further include a knowledge base 04, which can store data (which may also be referred to as knowledge) used to assist the running of each model. For example, the knowledge base may include perception data for assisting the perception model 01 in obtaining the representative detection data, planning data for assisting the plan model 02 in determining the target state data, and control data for assisting the control model 03 in generating the control parameter. The knowledge base 04 can store the data in forms such as tables or matrices. Alternatively, the knowledge base 04 can also store the data using other more complex forms, for example forms combining geometric structures such as simplicial complexes, which is not limited in this embodiment of the present invention.
The perception data stored in the knowledge base 04 may include perception data corresponding to different tasks. After obtaining the target task and the input detection data, the feature extraction submodel 012 can obtain the perception data associated with the target task from the knowledge base 04. The perception data may include data capable of assisting the perception model in obtaining the representative detection data. For example, the perception data may include the types of the representative detection data. When the perception model is a model obtained through training by a machine learning method such as deep learning (DL) or reinforcement learning (RL), the perception data may also include the parameters of the perception model corresponding to the target task. These parameters may include the model parameters and the input and output parameters of the perception model.
As shown in Fig. 5, the process in which the perception model obtains the perception data associated with the target task may include the following steps:
Step 1021: determine the scene in which the smart device is currently located according to the detection data.
The perception model can determine, according to a correspondence between detection data and scene identifiers, the scene identifier corresponding to the currently obtained detection data, and determine the scene indicated by that scene identifier as the scene in which the smart device is currently located.
For example, assume the perception model finds, from the correspondence, that the scene identifier corresponding to the currently obtained detection data is 1; the scene indicated by scene identifier 1 can then be determined as the scene in which the autonomous vehicle is currently located.
Step 1022: detect whether the correspondence among tasks, scenes, and perception data records the perception data corresponding to the scene in which the smart device is currently located.
In this embodiment of the present invention, a correspondence among tasks, scenes, and perception data can be stored in the control system. Because the scenes in which the smart device may be located are diverse, after determining the scene in which the smart device is currently located, the perception model can first detect whether the correspondence records the perception data corresponding to the current scene.
When the correspondence records the perception data corresponding to the scene in which the smart device is currently located, the perception model can perform step 1023. When the perception model fails to obtain the scene identifier corresponding to the detection data in step 1021, or the obtained scene identifier is not recorded in the correspondence among tasks, scenes, and perception data, the perception model can determine that the correspondence does not record the perception data corresponding to the scene in which the smart device is currently located, and can perform step 1024.
For example, assume the smart device is an autonomous vehicle; the correspondence among tasks, scenes, and perception data stored in its control system can be as shown in Table 1. With reference to Table 1, the perception data corresponding to the task with task identifier 10 and the scene with scene identifier 1 may include the types of the representative detection data, namely: the lane line curvature, the current vehicle speed, the distance between the vehicle and the lane center line, and the type, speed, and distance from the ego vehicle of the obstacle ahead. The perception data corresponding to the task with task identifier 20 and the scene with scene identifier 3 may include the model parameters of the perception model, the types of the input parameters (the lane line curvature, and the type, speed, and distance from the ego vehicle of the obstacle ahead), and the output parameters (abstract feature 1 and abstract feature 2). Because the representative detection data output by a perception model obtained through training by a machine learning method such as deep learning or reinforcement learning is not directly selected from the detection data but is obtained by processing the input detection data, the representative detection data it outputs may be referred to as abstract features or hidden features.
Table 1

Task identifier | Scene identifier | Perception data
10 | 1 | Types of representative detection data: lane line curvature; current vehicle speed; distance between the vehicle and the lane center line; type, speed, and distance from the ego vehicle of the obstacle ahead
20 | 3 | Model parameters of the perception model; input parameter types: lane line curvature, and type, speed, and distance from the ego vehicle of the obstacle ahead; output parameters: abstract feature 1 and abstract feature 2
Step 1023: obtain the perception data corresponding to the target task and the scene in which the smart device is currently located.
When the correspondence records the perception data corresponding to the scene in which the smart device is currently located, the perception model can directly obtain the corresponding perception data from the correspondence according to the task identifier of the target task and the scene identifier of the current scene. The task identifier of the target task can be carried in the execution instruction, or the perception model can determine the task identifier of the target task according to a pre-stored correspondence between tasks and identifiers.
For example, assume the task identifier of the target task is 10 and the scene identifier of the scene in which the autonomous vehicle is currently located is 1; then, according to the correspondence shown in Table 1, the perception data obtained by the perception model can be the types of the representative detection data: the lane line curvature, the current vehicle speed, the distance between the vehicle and the lane center line, and the type, speed, and distance from the ego vehicle of the obstacle ahead.
Step 1024: determine, from the correspondence, a similar scene that is similar to the current scene, and obtain the perception data corresponding to the target task and the similar scene.
When the scene in which the smart device is located is a new scene not recorded in the correspondence among tasks, scenes, and perception data, the perception model can determine, from the scenes recorded in the correspondence, a similar scene that is similar to the scene in which the smart device is currently located, and obtain the perception data corresponding to the target task and the similar scene.
Optionally, when determining the similar scene, the perception model can calculate, for each scene in the correspondence, the similarity between the detection data corresponding to that scene and the currently obtained detection data, and determine the scene corresponding to the detection data with the highest similarity as the similar scene.
For example, assume the scene identifier determined by the perception model in step 1021 for the scene in which the autonomous vehicle is currently located is 5. Since this scene identifier is not recorded in the correspondence shown in Table 1, the perception model can calculate, for each of the scene identifiers 1 to 3, the similarity between the corresponding detection data and the currently obtained detection data. If the detection data corresponding to scene identifier 2 has the highest similarity to the currently obtained detection data, the perception model can determine the scene indicated by scene identifier 2 as the similar scene that is similar to the scene indicated by scene identifier 5. Further, the perception model can obtain, from the correspondence shown in Table 1, the perception data corresponding to task identifier 10 and scene identifier 2.
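A minimal sketch of this lookup-with-fallback behavior (hypothetical Python; the cosine similarity measure and the data layout are illustrative assumptions):

```python
import math

def scene_similarity(a, b):
    """Cosine similarity between two detection-data feature vectors (illustrative choice)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def lookup_perception_data(task_id, scene_id, detection_vec, correspondence, scene_features):
    """correspondence: {(task_id, scene_id): perception data};
    scene_features: {scene_id: reference detection-data vector for that scene}."""
    if (task_id, scene_id) in correspondence:          # step 1023: the scene is recorded
        return correspondence[(task_id, scene_id)]
    # Step 1024: unrecorded scene -> use the most similar recorded scene instead.
    # (Assumes every recorded scene has an entry for this task.)
    similar = max(scene_features,
                  key=lambda s: scene_similarity(detection_vec, scene_features[s]))
    return correspondence[(task_id, similar)]
```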
When the correspondence does not record the perception data corresponding to the current scene, obtaining the perception data of a similar scene enables the control system to quickly adapt to new scenes. The adaptability of the control system is therefore stronger, its application scenarios are no longer limited to those covered by the training sample data, and the application flexibility and scalability of the control system are effectively improved.
In this embodiment of the present invention, part of the recently obtained perception data may also be stored in the perception model 01, while the above correspondence among tasks, scenes, and perception data may be stored in the knowledge base 04. Therefore, after determining the scene in which the smart device is currently located, the perception model 01 can first determine whether perception data corresponding to the target task and the scene is stored locally. If so, the perception model 01 can directly obtain the corresponding perception data. Otherwise, the perception model 01 can send the preprocessed detection data, or the scene identifier of the scene in which the smart device is currently located, to the knowledge base 04. After receiving the data sent by the perception model 01, the knowledge base 04 can obtain the perception data corresponding to the scene in which the smart device is currently located and feed the perception data back to the perception model 01.
In an optional implementation, if the perception data obtained by the perception model 01 includes the types of the representative detection data associated with the target task, the feature extraction submodel 012 in the perception model 01 can directly extract the detection data of those types from the preprocessed detection data, so as to obtain the representative detection data.
For example, assume the preprocessed detection data includes: the lane line curvature k1, the current vehicle speed v1, the distance d1 between the vehicle and the lane center line, the type b, speed v2, and distance d2 from the ego vehicle of the obstacle ahead, the temperature t1, the wind speed v3, and the speed v0 and steering angle α0 of the autonomous vehicle. The perception data associated with the target task obtained by the perception model 01 includes the following types of representative detection data: the lane line curvature, the current vehicle speed, the distance between the vehicle and the lane center line, and the type, speed, and distance from the ego vehicle of the obstacle ahead. The representative detection data extracted by the feature extraction submodel 012 from the above preprocessed detection data may then include: the lane line curvature k1, the current vehicle speed v1, the distance d1 between the vehicle and the lane center line, and the type b, speed v2, and distance d2 from the ego vehicle of the obstacle ahead.
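A minimal sketch of this type-based extraction (hypothetical Python; the key names mirror the example above and are illustrative assumptions):

```python
def extract_representative(preprocessed, required_types):
    """Keep only the detection data whose type is listed in the perception data."""
    return {t: preprocessed[t] for t in required_types if t in preprocessed}

preprocessed = {
    "lane_line_curvature": 0.002, "vehicle_speed": 18.0, "dist_to_lane_center": 0.3,
    "obstacle_type": "car", "obstacle_speed": 16.5, "obstacle_distance": 42.0,
    "temperature": 21.0, "wind_speed": 3.2,
}
required_types = ["lane_line_curvature", "vehicle_speed", "dist_to_lane_center",
                  "obstacle_type", "obstacle_speed", "obstacle_distance"]
representative = extract_representative(preprocessed, required_types)
```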
In another optional implementation, if the perception model is a model obtained through training in a machine learning manner such as deep learning or reinforcement learning, the perception data may include the parameters of the model corresponding to the target task. When the representative detection data associated with the target task is extracted through the perception model, the detection data and the target task can be directly input into the perception model configured with those parameters, and the output of the perception model is the representative detection data associated with the target task.
For example, the perception model can be a neural network model obtained through training in a deep learning manner, such as a recurrent neural network (RNN) model or a convolutional neural network (CNN) model. Fig. 6 is an architecture diagram of a perception model according to an embodiment of the present invention. As shown in Fig. 6, the perception model can be a multi-layer interconnected neural network model, where each layer of the neural network is composed of a plurality of neurons. The model parameters in the perception data obtained by the perception model may include the weight of each neuron. After obtaining the parameters of the model, the control device can configure the corresponding weight for each neuron in the perception model, then input the preprocessed detection data and the target task into the perception model, and determine the output of the perception model as the representative detection data.
Optionally, as mentioned above, the perception data obtained by the perception model may also include the types of the input parameters of the perception model. The control device can then select the detection data of the corresponding types from the preprocessed detection data according to the types of the input parameters, input it into the perception model, and obtain the representative detection data associated with the target task output by the perception model.
For example, assume the types of the input parameters of the perception model in the perception data include feature 1, feature 2, and feature 3; the control device can first select the detection data of these three types from the preprocessed detection data and input it into the perception model. Afterwards, the abstract feature 1 and abstract feature 2 output by the perception model can be provided to the plan model as the representative detection data.
In this embodiment of the present invention, the feature extraction submodel 012 needs to extract, from the detection data, representative detection data that can accurately reflect the environment in which the smart device is located and the current state of the smart device, for processing by the plan model 02. On the one hand, the representative detection data needs to be complete; on the other hand, it cannot be specified by artificial prior knowledge. The probability with which the feature extraction submodel 012 extracts the representative detection data h from the preprocessed detection data y can satisfy the following mathematical model: P(y_{1:N}, z_{1:N}, h_{1:N}) = ∏_{t=1}^{N} p(y_t | z_t) p(h_t | y_t) p(h_t | h_{t-1}), where N refers to the N moments over which the probability is calculated, p(y_t | z_t) is the probability that the perception fusion submodel 011 selects y_t as its output from the detection data z_t output by the sensors at moment t, p(h_t | y_t) is the probability that the feature extraction submodel 012 selects h_t as its output from the data y_t output by the perception fusion submodel 011 at moment t, and p(h_t | h_{t-1}) is the probability that the feature extraction submodel 012 selects h_t as its output at moment t given that it selected h_{t-1} at moment t-1. ∏ is the product symbol, indicating the product of the N values of the expression p(y_t | z_t) p(h_t | y_t) p(h_t | h_{t-1}) as t takes the values 1 to N in turn.
The working principle of the perception model can be understood as follows: based on the input detection data, the perception model calculates, through the above mathematical model, the probabilities of outputting different representative detection data h, and takes the representative detection data h with the highest probability as the actual output.
In another optional implementation, if the perception model is a model obtained through training in a machine learning manner such as deep learning or reinforcement learning, the perception model may also include a plurality of perception submodels corresponding to different tasks. After the control device inputs the preprocessed detection data and the target task into the perception model, the perception model can determine the target perception submodel corresponding to the target task and input the preprocessed detection data into the target perception submodel. The output of the target perception submodel is the representative detection data associated with the target task. The architecture of each perception submodel can be similar to the architecture of the perception model shown in Fig. 6, and details are not repeated here.
Optionally, in this embodiment of the present invention, the perception data associated with the target task obtained by the perception model may also include environmental empirical data obtained by summarizing and analyzing the historical environmental data of the smart device. For example, for an autonomous vehicle or an intelligent robot, the environmental empirical data may include at least one of weather empirical data, road empirical data, and obstacle empirical data. The weather empirical data may include weather common-sense data and weather forecast data. The road empirical data may include attribute data of different roads obtained in advance (such as the width, the number of lanes, and the lane center line curvature). The obstacle empirical data may be general attribute data of different types of obstacles (including static obstacles and dynamic obstacles), such as average size and average movement speed.
For example, assume the target task is automatic car following; the perception model can then obtain the perception data associated with the automatic car following task. The perception data may include, for example, the types of the representative detection data, the attribute data of the road on which the autonomous vehicle is currently located, and the obstacle empirical data. Alternatively, if the target task is automatically adjusting the cabin temperature, the perception data associated with the cabin temperature adjustment task obtained by the perception model may include the types of the representative detection data and the weather empirical data.
The perception model can also refine the obtained representative detection data based on the environmental empirical data, so as to ensure the completeness and reliability of the representative detection data. For example, when the detection data does not contain data of a certain type specified in the perception data (such as the temperature, or the size or speed of an obstacle), the perception model can extract data of that type from the environmental empirical data as representative detection data. Alternatively, when the value of an item of representative detection data extracted from the detection data exceeds its theoretical range, the perception model can correct that value according to data of the same type in the environmental empirical data. For example, assume the temperature extracted from the detection data by the perception model is 100 °C; since this temperature far exceeds the theoretical temperature range, the perception model can determine the current temperature according to the weather empirical data in the environmental empirical data and take the determined temperature as representative detection data.
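A minimal sketch of this refinement step (hypothetical Python; the field names, theoretical range, and fallback value are illustrative assumptions):

```python
def refine_with_empirical_data(representative, empirical, theoretical_ranges):
    """Fill in missing items and correct implausible values using environmental empirical data."""
    refined = dict(representative)
    for field, (low, high) in theoretical_ranges.items():
        value = refined.get(field)
        if value is None or not (low <= value <= high):
            # Missing or outside the theoretical range -> fall back to the empirical value.
            refined[field] = empirical[field]
    return refined

refined = refine_with_empirical_data(
    {"temperature_c": 100.0},            # 100 degC far exceeds the plausible range
    {"temperature_c": 22.0},             # value from the weather empirical data
    {"temperature_c": (-40.0, 55.0)},
)
```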
Step 103: input the target task and the representative detection data into the plan model to obtain target state data.
In this embodiment of the present invention, after the plan model obtains the representative detection data sent by the perception model and the target task, it can plan the behavior of the smart device to determine the target state data of the smart device. The target state data is used to indicate the state that the smart device needs to reach. For example, for an autonomous vehicle or an intelligent robot, the target state data may include the position of the target point that the smart device needs to reach, and data such as the speed and steering angle at the target point.
Optionally, the plan model can obtain planning data associated with the target task and determine the target state data under the guidance of the planning data, so as to ensure the reliability and accuracy of the finally determined target state data. The planning data may include control empirical data summarized from the historical control experience of the smart device, or may also include control theory data (for example, dynamics theory data and common-sense physical knowledge), and the planning data can be used to assist the plan model in determining the intention of the smart device.
For an autonomous vehicle or an intelligent robot, the control empirical data may include at least one of driving empirical data and driving rule data. The driving empirical data may include empirical data on the several roads on which the smart device frequently travels (such as the accident rate, congestion rate, road closure situation, traffic flow, and potential accident points obtained through big data analysis); the driving rule data may include the driving rules of the several roads on which the smart device frequently travels (such as the driving direction of a one-way road). When the plan model includes a model obtained through training in a machine learning manner, the planning data may also include the parameters of the model corresponding to the target task.
For example, for an autonomous vehicle, assume the target task is an automatic car following task, the planning data includes the accident rate of the current road, and the representative detection data includes the current vehicle speed, the speed of the vehicle ahead, and the distance to the vehicle ahead. The plan model can then determine the speed that the autonomous vehicle needs to maintain according to the planning data and the representative detection data: for the same representative detection data, the higher the accident rate in the planning data, the lower the speed that the plan model determines the autonomous vehicle needs to maintain.
Optionally, part of the recently obtained planning data may be stored in the plan model 02, while the above correspondence between tasks and planning data may be stored in the knowledge base 04. Therefore, after receiving the representative detection data sent by the perception model 01 and the target task, the plan model 02 can first determine whether the planning data corresponding to the target task is stored locally; if so, the corresponding planning data can be obtained directly; otherwise, the plan model 02 can obtain the planning data corresponding to the target task from the knowledge base 04.
Fig. 7 is a kind of architecture diagram of plan model provided in an embodiment of the present invention, as shown in fig. 7, the plan model 02 can To include Intention Anticipation submodel 021, be intended to decompose submodel 022 and intention execution submodel 023.The Intention Anticipation submodule Type 021 can be based on the goal task, and the environmental data in the representative detection data got, predict the smart machine Intention.Later, which decomposes submodel 022 and can decompose to the intention, obtains one or more subtasks.The meaning Figure, which executes submodel 023, can determine target-like corresponding with each subtask according to layout data and representative detection data State data.
Wherein, it is intended that the intention that prediction submodel 021 is predicted may include that global intention and part are intended to, it is intended that decompose sub 022 pair of the model each subtask for being intended to obtain after decomposing is referred to as atom intention.Overall situation intention refers to that the intelligence is set The macro-goal of standby required realization, such as under automatic Pilot scene, if the goal task is to travel from A point to B point, this is complete Office is intended to travel from the A point to the driving trace of B point (i.e. navigation information) for automatic driving vehicle.This is locally intended to can be with It is that combining environmental data are intended to the intention decomposed to the overall situation, such as under automatic Pilot scene, this is locally intended to It may include: to keep current lane traveling or lane-change traveling etc. to be intended in A point to some section of B point.Atom intention can To be that the minimum for generating control parameter decomposed to overall situation intention and part intention is intended to, such as can wrap Include the intention such as acceleration or brake.
In an optional implementation, the planning data obtained by the planning model may include intention prediction data, intention decomposition data, and intention execution data. The intention prediction data assists the intention prediction submodel 021 in predicting the intention. In an autonomous driving scenario, the intention prediction data may include driving experience data and driving rule data, for example the accident rate and congestion rate of several roads; the intention prediction submodel 021 can determine the driving trajectory of the autonomous vehicle from A to B according to the accident rate and congestion rate of each road segment between A and B.
The intention decomposition data may be rules for decomposing an intention into subtasks, recording the one or more subtasks corresponding to each intention. For example, if the intention is to turn right 100 meters ahead, the corresponding subtasks may include changing to the rightmost lane and turning right. The intention decomposition submodel 022 decomposes the intention output by the intention prediction submodel 021 based on the intention decomposition data, obtaining one or more subtasks. For example, assume the target task is overtaking, and the intention decomposition data associated with the overtaking task includes: accelerate, change to the left lane, change to the right lane, and decelerate. Suppose the environment data in the representative detection data obtained by the planning model includes lane-line curvature k1, the vehicle's current speed v1, the distance d1 between the vehicle and the lane center line, and the type b, speed v2, and distance d2 of the obstacle ahead, and that based on the lane-line curvature k1 it is determined that the current lane is straight while the speed v2 of the obstacle ahead is below a preset threshold. The intention decomposition submodel 022 can then decompose the overtaking task into four subtasks: accelerate, change to the left lane, change to the right lane, and decelerate.
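A minimal sketch of such rule-driven decomposition is given below, using the overtaking example; the rule table, key names, and threshold value are illustrative assumptions.

```python
# Intention decomposition data: intention -> ordered list of subtasks.
DECOMPOSITION_RULES = {
    "turn_right_ahead": ["change_to_rightmost_lane", "turn_right"],
    "overtake":         ["accelerate", "change_to_left_lane",
                         "change_to_right_lane", "decelerate"],
}

SPEED_THRESHOLD = 8.0  # m/s, assumed preset threshold for a slow obstacle ahead

def decompose(intention, env):
    """Intention decomposition submodel 022 (rule-based sketch)."""
    if intention == "overtake":
        # Only decompose into the overtaking subtasks when the lane is
        # (nearly) straight and the obstacle ahead is slow enough.
        straight = abs(env["lane_curvature_k1"]) < 1e-3
        slow_leader = env["front_obstacle_speed_v2"] < SPEED_THRESHOLD
        if not (straight and slow_leader):
            return ["keep_lane"]
    return DECOMPOSITION_RULES.get(intention, [intention])
```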
The intention execution data may be rules for determining target state data based on a subtask and the representative detection data; such a rule may be a correspondence table, or a physical or mathematical formula. The intention execution submodel 023 processes the representative detection data according to these rules and obtains the target state data corresponding to each subtask. For example, for the subtask of changing to the left lane, the intention execution submodel 023 may use dynamics and mathematical formulas to compute, from data such as the distance between the vehicle and the lane center line and the distance to the obstacle ahead, the position of the target point in the left lane that the vehicle needs to move to.
Optionally, the intention prediction submodel 021 and the intention execution submodel 023 may also be models trained by machine learning. In that case, the intention prediction data may be the model parameters of the intention prediction submodel 021, and the intention execution data may be the model parameters of the intention execution submodel 023.
In another optional implementation, the planning model may be a model trained by machine learning such as deep learning or reinforcement learning. The planning data may include the parameters of the model corresponding to the target task. When generating target state data, the planning model configures itself with the parameters corresponding to the target task, and can then process the input representative detection data and target task and output the target state data.
In yet another optional implementation, the planning model may be trained by machine learning such as deep learning or reinforcement learning and may include multiple planning submodels corresponding to different tasks. After the control device inputs the representative detection data and the target task to the planning model, the planning model determines the target planning submodel corresponding to the target task and inputs the representative detection data to it; the output of that target planning submodel is the target state data.
Optionally, if the smart machine is an autonomous vehicle or an intelligent robot, the perception model may also, after obtaining the representative detection data, predict the motion trajectories of obstacles in the surrounding environment based on their attribute data and send the prediction results to the planning model. The planning model can combine these predictions to make a reasonable decision on the behavior of the smart machine, that is, to determine the target state data. If the target task is a driving task, the planning model also needs to determine the target state data in combination with the route planning information and the current position of the smart machine.
Step 104: input the target state data and part or all of the representative detection data to the control model to obtain the control parameters for controlling the smart machine.
In embodiments of the present invention, the control model may be a model initialized with control theory data. The control theory data may include dynamics theory (such as the laws of mechanics) and common physical knowledge (such as the friction coefficient of a road surface).
Optionally, after receiving the target state data sent by the planning model, the control model may first obtain the control data associated with the target task. The control data assists the control model in generating the control parameters and may include control theory data. When the control model is trained by deep learning or deep reinforcement learning, the control data may include the parameters of the model corresponding to the target task. Guided by the control data, the control model generates the control parameters based on the representative detection data output by the perception model and the target state data output by the planning model.
Optionally, the control model 03 may locally store recently obtained control data, while the knowledge base 04 stores the correspondence between tasks and control data. After receiving the target state data sent by the planning model, the control model 03 first checks whether control data corresponding to the target task is stored locally; if so, it obtains that control data directly, and otherwise it obtains the control data corresponding to the target task from the knowledge base 04.
If the target state data output by the planning model includes target state data corresponding to each subtask, the control device may input the one or more subtasks, the target state data corresponding to the currently pending subtask, and the representative detection data to the control model, obtaining the control parameters corresponding to that currently pending subtask.
As an optional implementation, the control model may be a model trained by machine learning. In that case, the control data may include the model parameters of the control model corresponding to the target task and the types of representative detection data the control model requires as input (i.e., the types of its input parameters). That is, for different tasks, the model parameters of the control model differ, and the types of representative detection data it takes as input also differ. Accordingly, step 104 may include:
Step 1041a: obtain the representative detection data of the corresponding types from the representative detection data.
For example, assume that for the automatic car-following task the input parameter types of the control model are the lane-line curvature and the speed of the obstacle ahead. If the representative detection data output by the perception model includes lane-line curvature k1, the vehicle's current speed v1, the distance d1 between the vehicle and the lane center line, and the type b, speed v2, and distance d2 of the obstacle ahead, then the representative detection data of the corresponding types obtained by the control model from the input may include the lane-line curvature k1 and the speed v2 of the obstacle ahead.
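The per-task selection of input types could be implemented as a simple filter over the representative detection data; the dictionary keys below are illustrative assumptions.

```python
# Control data: for each task, the representative detection data types
# the control model requires as input.
TASK_INPUT_TYPES = {
    "car_following":   ["lane_curvature", "front_obstacle_speed"],
    "auto_reversing":  ["rear_distance", "left_distance", "right_distance"],
    "drive_to_target": ["lane_curvature", "front_obstacle_distance",
                        "front_obstacle_speed"],
}

def select_inputs(task, representative_data):
    """Pick only the representative detection data of the types the
    control model needs for this task (step 1041a)."""
    wanted = TASK_INPUT_TYPES[task]
    return {name: representative_data[name] for name in wanted}

# Example: perception output for car following.
perception_output = {
    "lane_curvature": 0.002, "ego_speed": 15.0,
    "lane_center_offset": 0.3, "front_obstacle_type": "car",
    "front_obstacle_speed": 12.5, "front_obstacle_distance": 30.0,
}
print(select_inputs("car_following", perception_output))
```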
Optionally, for an automatic reversing task, the input parameter types of the control model may include the distance to the obstacle behind, the distance to the obstacle on the left, and the distance to the obstacle on the right. For the task of driving automatically to a specified destination, the input parameter types may include the lane-line curvature, the distance to the obstacle ahead, and the speed of the obstacle ahead.
From the above analysis, for different tasks the representative detection data required by the control model may be all of the representative detection data output by the perception model, or only part of it. Optionally, the input parameter types of the control model corresponding to each task may be set in advance by developers based on experience.
Step 1042a: process the target state data and the obtained representative detection data with the control model configured with the model parameters, obtaining the control parameters for controlling the smart machine.
The control model may be a neural network model trained by machine learning, such as an RNN or CNN model. The model parameters in the control data may include the weight of each neuron in the neural network. After obtaining the model parameters, the control model configures each of its neurons with the corresponding weight, and can then process the input target state data and the obtained representative detection data to produce the control parameters for controlling the smart machine.
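A minimal numpy sketch of configuring a small feed-forward control network with per-task weights and running it is shown below; the network size and parameter layout are assumptions made for illustration.

```python
import numpy as np

class NeuralControlModel:
    """Control model sketch: a tiny two-layer network whose weights are
    loaded from the per-task control data."""

    def configure(self, model_params):
        # model_params holds the neuron weights for the current task.
        self.W1, self.b1 = model_params["W1"], model_params["b1"]
        self.W2, self.b2 = model_params["W2"], model_params["b2"]

    def __call__(self, target_state, selected_inputs):
        # Concatenate the target state data with the selected representative
        # detection data and run a forward pass.
        x = np.concatenate([np.atleast_1d(target_state),
                            np.asarray(list(selected_inputs.values()))])
        h = np.tanh(self.W1 @ x + self.b1)
        return self.W2 @ h + self.b2   # e.g. [acceleration, steering_angle]
```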
If the target state data includes target state data corresponding to each subtask, then for each subtask the control device may process that subtask's target state data and the representative detection data with the control model, thereby determining the control parameters corresponding to each subtask.
Taking the autonomous driving scenario as an example, as shown in Fig. 8, assume that the output parameters of the control model are the acceleration and the steering-wheel angle, and that the representative detection data input to the control model are: representative detection data 1, the lane-line curvature k1; and representative detection data 2, the speed v2 of the obstacle ahead. The control model then processes the target state data output by the planning model together with these two pieces of representative detection data. If the control model outputs an acceleration a1 and a steering-wheel angle α1, the control device can control the transmission and power unit of the autonomous vehicle through its control bus so that the vehicle's acceleration becomes a1 and its steering-wheel angle becomes α1.
As another optional implementation, the control model may include multiple control submodels corresponding to different tasks, each trained by machine learning such as deep learning, reinforcement learning, or deep reinforcement learning. After the control device inputs the target state data and part or all of the representative detection data to the control model, the control model determines the target control submodel corresponding to the target task and inputs the representative detection data and the target state data to it; the output of that target control submodel is the control parameters.
The types of representative detection data input to the control model may be determined by the control device based on the target task. For example, the control device may obtain the control data associated with the target task, which may record the types of representative detection data the control model requires as input; that is, for different tasks, these types may differ. Based on the types specified in the control data associated with the target task, the control device obtains the representative detection data of the corresponding types from the representative detection data output by the perception model and inputs it to the control model.
As yet another optional implementation, the control data may include a first rule, corresponding to the target task, for generating the control parameters, and the types of representative detection data used to generate them. That is, for different tasks, the rule used to generate the control parameters differs, and the types of representative detection data used also differ. Accordingly, step 104 may include:
Step 1041b: obtain the representative detection data of the corresponding types from the representative detection data.
For example, assume that for the automatic car-following task the types of representative detection data used to generate the control parameters are the lane-line curvature and the vehicle's current speed. If the representative detection data output by the perception model includes lane-line curvature k1, the vehicle's current speed v1, the distance d1 between the vehicle and the lane center line, and the type b, speed v2, and distance d2 of the obstacle ahead, then the representative detection data of the corresponding types obtained by the control model may include the lane-line curvature k1 and the vehicle's current speed v1.
Step 1042b: process the target state data and the obtained representative detection data using the first rule, obtaining the control parameters for controlling the smart machine.
In embodiments of the present invention, the first rule may be a correspondence between target state data, representative detection data, and control parameters. After the control model obtains the target state data sent by the planning model and the representative detection data of the corresponding types, it can look up the corresponding control parameters directly in that correspondence.
Alternatively, the first rule may be a formula (such as a physical equation or a mathematical formula) relating the target state data, the representative detection data, and the control parameters. The control model substitutes the target state data sent by the planning model and the obtained representative detection data of the corresponding types into the formula, thereby computing the control parameters.
For example, assume that for the automatic car-following task the first rule for generating the control parameter acceleration a is a mathematical formula f1, so that a = f1(s, k, v), where s is the target state data, k is the lane-line curvature, and v is the vehicle's current speed. If the control model receives target state data s1 and the obtained representative detection data of the corresponding types are lane-line curvature k1 and vehicle current speed v1, it substitutes these into f1 and obtains the acceleration a2 used to control the autonomous vehicle, with a2 = f1(s1, k1, v1).
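A minimal sketch of this formula-based first rule is given below; the specific form chosen for f1 is an assumption made only so the example can run, since the disclosure fixes only its inputs.

```python
def f1(s, k, v):
    """Assumed car-following rule a = f1(s, k, v): track the target state s
    (taken here as a desired speed), damped on curved road sections."""
    curvature_penalty = 1.0 / (1.0 + 50.0 * abs(k))
    return curvature_penalty * (s - v) * 0.5   # proportional speed tracking

# a2 = f1(s1, k1, v1): target state s1, lane-line curvature k1, current speed v1.
a2 = f1(s=13.0, k=0.002, v=15.0)
print(f"commanded acceleration a2 = {a2:.2f} m/s^2")
```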
As another optional implementation, the control model may include a control submodel for computing weights and one or more calculation submodels for computing the control parameters. The control submodel may be trained by machine learning such as deep learning or reinforcement learning, while each calculation submodel may be a calculation formula (such as a physical equation or mathematical formula) determined after initialization with the control theory data. The control data may include the types of the group of input data corresponding to each calculation submodel. Alternatively, the control data may also include the control theory data, i.e., the one or more calculation submodels used to compute the control parameters may themselves be obtained by the control model from the control data. Accordingly, step 104 may include:
Step 1041c: obtain, from the representative detection data and the target state data, the group of target input data corresponding to each calculation submodel.
The group of target input data corresponding to each calculation submodel may include representative detection data of at least one type and/or target state data of at least one type. The data types included in the target input data of any two calculation submodels may be entirely different or partially the same; this is not limited in the embodiments of the present invention.
For example, assume the control model includes two calculation submodels for computing the control parameter steering-wheel angle: the group of target input data corresponding to the first calculation submodel is the lane-line curvature, and the group corresponding to the second is the distance between the vehicle and the lane center line. If the representative detection data output by the perception model includes lane-line curvature k1, vehicle current speed v1, distance d1 between the vehicle and the lane center line, and the type b, speed v2, and distance d2 of the obstacle ahead, then the group of target input data obtained for the first calculation submodel may be the lane-line curvature k1, and the group for the second may be the distance d1 between the vehicle and the lane center line.
Step 1042c: input each group of target input data to its corresponding calculation submodel, obtaining the value of the control parameter corresponding to each group.
Each calculation submodel may be a formula (such as a physical equation or mathematical formula) relating its group of input data to the value of the control parameter. The control model substitutes each obtained group of target input data into the corresponding formula, thereby computing the value of the control parameter corresponding to that group.
For example, assume the control data corresponding to the automatic car-following task includes two calculation submodels, both mathematical formulas, for computing the control parameter steering-wheel angle α: the first is α = f2(k) and the second is α = f3(d), where k is the lane-line curvature and d is the distance between the vehicle and the lane center line. If the group of target input data obtained for the first calculation submodel is the lane-line curvature k1 and the group for the second is the distance d1 between the vehicle and the lane center line, the control model substitutes k1 into f2 to obtain the steering angle corresponding to the first group, α2 = f2(k1), and similarly substitutes d1 into f3 to obtain the steering angle corresponding to the second group, α3 = f3(d1).
Step 1043c: input the target state data and part or all of the representative detection data to the control submodel, obtaining a group of weights.
The control data may include the model parameters of the control submodel corresponding to the target task. The control model configures the control submodel with these parameters, and then inputs the target state data and part or all of the representative detection data to the configured control submodel, obtaining a group of weights, which may include multiple weights. The part or all of the representative detection data input to the control submodel may be determined according to the target task, for example according to the types of representative data recorded in the control data associated with the target task.
For example, Fig. 9 is an architecture diagram of a control submodel provided by an embodiment of the present invention. As shown in Fig. 9, assume the part of the representative detection data obtained by the control model is: representative detection data 3, the lane-line curvature k1; and representative detection data 4, the distance d1 between the vehicle and the lane center line. The control model can then input the target state data output by the planning model and these two pieces of representative detection data to the control submodel. The group of weights output by the control submodel may include a weight w1 for the group of target input data corresponding to the first calculation submodel and a weight w2 for the group corresponding to the second calculation submodel.
Optionally, in embodiments of the present invention, the control data associated with the target task may also include a baseline value of the control parameter. The baseline value may be a constant used to reflect other implicit influences of the related data on the control parameter. Accordingly, the output of the control submodel may also include a weight corresponding to the baseline value; for example, as shown in Fig. 9, the control submodel may output a weight w3 corresponding to the baseline value of the steering-wheel angle.
Step 1044c: determine the target value of the control parameter according to the group of weights and the value of the control parameter corresponding to each group of target input data.
In one optional implementation of the embodiments of the present invention, the control model multiplies the weight of each group of target input data by the control parameter value corresponding to that group, obtaining one product per group, and then sums the products to obtain the target value of the control parameter. In other words, the control model performs a weighted summation of the control parameter values corresponding to the groups of target input data, using the weights of those groups, to obtain the target value of the control parameter.
For example, assume the steering-wheel angle obtained from the lane-line curvature k1 is α2, the steering-wheel angle obtained from the distance d1 between the vehicle and the lane center line is α3, and the baseline value of the steering-wheel angle is α0, with corresponding weights w1 for k1, w2 for d1, and w3 for the baseline value. After the control model performs the weighted summation over these steering-angle values, the resulting target value αav of the steering-wheel angle satisfies:
αav = w1 × α2 + w2 × α3 + w3 × α0
In another optional implementation of the embodiments of the present invention, the control model multiplies the weight corresponding to each group of target input data by the control parameter value of that group to obtain one product per group, and then takes the largest (or smallest) of these products as the target value of the control parameter. Alternatively, the control model may first select the group of target input data whose product is the largest (or smallest) and then take the control parameter value corresponding to that group as the target value.
In yet another optional implementation of the embodiments of the present invention, a weighted-summation algorithm for computing the target value may be prestored in the control model; based on that algorithm, the control model uses the group of weights to perform a weighted summation of the control parameter values corresponding to the groups of target input data, obtaining the target value of the control parameter.
Optionally, step 1043c may also be executed before step 1042c, i.e., the control model may first obtain the group of weights and then compute the control parameter value corresponding to each group of target input data; some of the weights in the group may also serve as input parameters of the calculation submodels when computing those values.
For example, assume the group of weights output by the control submodel is w1, w2, and w3. For the two calculation submodels computing the steering-wheel angle α, the group of target input data corresponding to the first calculation submodel may include the vehicle wheelbase W, the turning radius R1 at the current position, the turning radius R2 at the next target point, the distance d to the lane center line, and the weight w1. The first calculation submodel may be:
α2 = asin(W / ((R1 + R2 - d) × 0.5 × w1)), where asin is the arcsine function, and the turning radius refers to the distance between the vehicle's longitudinal (lengthwise) plane of symmetry and the instantaneous turning center O.
The group of target input data corresponding to the second calculation submodel may include the distance d to the lane center line and the farthest recognition distance A_d of the vehicle; the second calculation submodel may be α3 = asin(d / A_d).
The weighted-summation algorithm stored in the control model for computing the target value αav of the steering angle may be:
αav = w2 × α2 + (1 - w2) × (w3 × α3 + (1 - w3) × α0), where α0 is the baseline value of the steering-wheel angle.
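Under the formulas above, the whole calculation-submodel-plus-weighting pipeline could be sketched as follows; the numeric inputs are made-up values used only to exercise the formulas.

```python
import math

def steering_target(W, R1, R2, d, A_d, weights, alpha_0):
    """Compute the target steering-wheel angle from the two calculation
    submodels and the control-submodel weights (w1, w2, w3)."""
    w1, w2, w3 = weights
    # First calculation submodel: alpha_2 = asin(W / ((R1 + R2 - d) * 0.5 * w1)).
    alpha_2 = math.asin(W / ((R1 + R2 - d) * 0.5 * w1))
    # Second calculation submodel: alpha_3 = asin(d / A_d).
    alpha_3 = math.asin(d / A_d)
    # Stored weighted-summation algorithm for the target value alpha_av.
    return w2 * alpha_2 + (1 - w2) * (w3 * alpha_3 + (1 - w3) * alpha_0)

# Illustrative (assumed) values: wheelbase 2.8 m, turning radii 40 m and 42 m,
# 0.4 m offset from the lane center line, 60 m farthest recognition distance.
alpha_av = steering_target(W=2.8, R1=40.0, R2=42.0, d=0.4, A_d=60.0,
                           weights=(1.0, 0.6, 0.5), alpha_0=0.0)
print(f"target steering-wheel angle: {math.degrees(alpha_av):.2f} degrees")
```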
Step 105: control the smart machine to execute the target task based on the control parameters.
In embodiments of the present invention, the control device may be connected to the bottom-layer drive module of the smart machine (such as the transmission and power unit of an autonomous vehicle) through the smart machine's control bus. After obtaining the control parameters, the control device can generate an operation instruction based on them and send it to the bottom-layer drive module; the instruction instructs the bottom-layer drive module to drive the smart machine to execute the corresponding operation, i.e., to execute the target task. For an autonomous vehicle, the operation is typically adjusting the steering-wheel angle, adjusting the acceleration, or applying the throttle or brake.
For example, if the control parameters include a steering-wheel angle of αav, the control device can control the autonomous vehicle to adjust its steering-wheel angle to αav.
With the control method for a smart machine provided by the embodiments of the present invention, the obtained detection data and the target task can be input to the perception model to obtain the representative detection data associated with the target task; the target task and the representative detection data can then be input to the planning model to obtain the target state data; the target state data and the representative detection data can then be input to the control model to obtain the control parameters for controlling the smart machine; and the smart machine can finally be controlled based on those control parameters. Because the control model is initialized with control theory data, which directly expresses and reflects the control laws and principles of the smart machine, this approach, compared with training directly on training samples as in the related art, not only reduces the control model's dependence on training samples and improves training efficiency, but also ensures the control effect on the smart machine.
In embodiments of the present invention, to further improve the control effect on the smart machine, the control device may also evaluate the control effect of the control parameters and, based on the evaluation result, adjust the parameters of one or more models in the control system. Fig. 10 is a flowchart of a method for adjusting the parameters of the models in the control system provided by an embodiment of the present invention. With reference to Fig. 10, the method may include:
Step 106: after controlling the smart machine based on the control parameters, obtain the new status data of the smart machine.
After controlling the smart machine to execute the corresponding operation based on the control parameters, the control device can obtain the smart machine's new status data. Like the detection data, the new status data may be collected by sensors arranged on the smart machine. In embodiments of the present invention, the types of the new status data may be the same as the types of the detection data, or the same as the types of the representative detection data extracted by the perception model; alternatively, the control device may store a correspondence between tasks and new-status-data types, determine the new-status-data types corresponding to the target task based on that correspondence, and obtain the new status data of those types.
Step 107: determine the control effect according to the new status data and the target task.
Taking the autonomous driving scenario as an example, assume the target task is to drive along the lane center line (i.e., ideally the vehicle's distance to the lane center line is 0). The control device can then determine the control effect of the control parameters according to the difference between the distance d from the vehicle to the lane center line in the new status data and 0, i.e., according to the magnitude of d; the smaller the distance d, the better the control effect.
Since the control parameters generated by the control device differ for different tasks, and the complexity of controlling the smart machine also differs, the control device may use different evaluation algorithms to determine the control effect for different tasks.
With reference to Fig. 3, the control system in the control device may also include an evaluation model 05; a correspondence between tasks and evaluation algorithms may be stored in the evaluation model 05 or in the knowledge base 04. After obtaining the new status data of the smart machine and the target task, the evaluation model 05 can obtain the evaluation algorithm corresponding to the target task from that correspondence and use it to determine the control effect.
For relatively simple tasks (such as automatic car-following or driving along the lane center line), the evaluation algorithm may be a calculation formula relating the new status data to the evaluation result. After obtaining the new status data, the control system can substitute it directly into the formula to compute an evaluation result reflecting how good the control effect is.
For example, consider the automatic car-following task, whose goal is to keep a certain distance from the vehicle ahead while keeping the ego vehicle within the lane lines. After the control system, based on the received car-following instruction, controls the smart machine to execute the corresponding operation, it can determine the new-status-data types corresponding to the car-following task from the task/new-status-data-type correspondence and obtain new status data of those types. Assume the new status data obtained by the evaluation model 05 are the distance D1 between the vehicle and the lane center line and the distance D2 to the vehicle ahead, and that the evaluation algorithm corresponding to the car-following task is a formula f0. The evaluation result computed with f0 may satisfy:
S = f0(b × D1, (1 - b) × D2), where b is a predetermined coefficient greater than or equal to 0 and less than or equal to 1.
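A minimal sketch of such a formula-based evaluation is given below; the particular form chosen for f0 (mapping small distances to scores near 1) and the target gap are assumptions, since the disclosure fixes only the inputs of the formula.

```python
def evaluate_car_following(D1, D2, target_gap=20.0, b=0.5):
    """Assumed evaluation formula S = f0(b*D1, (1-b)*D2): penalize lateral
    offset from the lane center line and deviation from the desired gap."""
    lane_term = b * abs(D1)                      # weighted lane-center offset
    gap_term = (1 - b) * abs(D2 - target_gap)    # weighted gap error
    return 1.0 / (1.0 + lane_term + gap_term)    # score in (0, 1], higher is better

print(evaluate_car_following(D1=0.2, D2=21.0))   # good tracking, score near 1
print(evaluate_car_following(D1=1.5, D2=35.0))   # poor tracking, lower score
```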
For more complex tasks, the evaluation algorithm may be an evaluation algorithm model trained by machine learning. For example, it may be trained by reinforcement learning (e.g., following the implementation of a value network in reinforcement learning), or by a supervised deep learning method. The training of the evaluation algorithm model may be offline or online, which is not limited in the embodiments of the present invention.
With reference to Fig. 11, besides the new status data, the input parameters of the evaluation algorithm model may also include the representative detection data output by the perception model. The evaluation model can input the new status data and the representative detection data to the evaluation algorithm model to obtain an evaluation result. The evaluation result may be a value greater than or equal to 0 and less than or equal to 1, where a larger value indicates a better evaluation result and therefore a better control effect. If the value is below a certain threshold, the evaluation result is poor, i.e., the control effect of the control device does not meet expectations or requirements. If a statistic of the evaluation results over a period of time (such as the average) remains below a certain threshold, the control device can determine that some model in the control system, or the entire control system, is operating poorly and that the parameters of the models need to be adjusted.
Step 108: adjust the parameters of the control system according to the control effect.
The parameters may include at least one of the model parameters, input parameters, and output parameters of the control system. Optionally, as shown in Fig. 3, since the control system may include the perception model, the planning model, and the control model, after the evaluation model determines the control effect it may, as one optional implementation, send the control effect to each model; one or more of the perception model, planning model, and control model can then adjust their own parameters (at least one of model parameters, input parameters, and output parameters) according to the control effect. As another optional implementation, the evaluation model may adjust the parameters of each model separately according to the evaluation result, or randomly adjust the parameters of several of the models, or adjust the parameters of models of corresponding types according to preset model types.
For example, the evaluation model 05 may send the evaluation result reflecting the control effect to the perception model 01, the planning model 02, and the control model 03 respectively, and each model may adjust its own parameters when the value of the evaluation result is below a preset threshold. Taking the perception model 01 as an example, if it detects that the value of the control effect is below the preset threshold, it can conclude that the currently selected types of representative detection data are not suitable for the target task and the current scene, and can therefore adjust the types of representative detection data corresponding to that task and scene. Alternatively, if the perception model 01 is trained by deep learning, it may adjust the weights of its neurons, or the types of the representative detection data it outputs.
Optionally, in embodiments of the present invention, the data associated with the target task obtained by each model may include constraint parameters that limit the adjustment range of that model's parameters. Accordingly, when a model adjusts its parameters according to the control effect, it adjusts them within the range allowed by the constraint parameters. This ensures that the output of the control system matches actual requirements, and thereby ensures safety and reliability when controlling the smart machine.
As can be seen from Fig. 3, the perception model 01, the planning model 02, and the control model 03 are tightly coupled: an adjustment to one model's input or output parameters may affect its neighboring models, so when a model adjusts its input or output parameters, the parameters of the adjacent models also need to be adjusted accordingly. For example, if the perception model adjusts its output parameters, i.e., the types of representative detection data, the input parameters of the planning model are adjusted correspondingly.
In embodiments of the present invention, the control device can continuously tune its parameters online according to the control effect generated by the evaluation model each time, thereby continuously improving its own models and its control effect. Because the control effect fed back by the evaluation model is direct, the adjustment direction when the control device adjusts its parameters is more accurate.
Taking the perception model 01 as an example, for complex tasks or complex scenes the selection of representative detection data is complicated, and it is difficult to extract suitable representative detection data by experience or simple algorithms. In embodiments of the present invention, through the online evaluation and feedback of the evaluation model 05, the types of representative detection data selected by the perception model can be adjusted continuously, so that its performance keeps improving and more suitable representative detection data can subsequently be extracted. For example, through continuous adjustment, the dimensionality of the representative detection data extracted by the perception model 01 can be reduced from more than 100 dimensions to 5 dimensions.
Optionally, in the initial phase of controlling the smart machine, the control device may evaluate the control effect of the entire control system and adjust the parameters of each model in the control system based on that control effect. Afterwards, the control device may perform effect evaluation and parameter adjustment only for a specific model in the control system (such as the control model). When evaluating the effect of a particular model, the parameters of the other models can be kept unchanged and only the parameters of that particular model adjusted; the control effect of the control parameters output by the control system is then evaluated, and the resulting evaluation serves as the evaluation of that model after its parameter adjustment. The method for evaluating the control effect of a particular model and adjusting its parameters can refer to steps 106 to 108 above and is not repeated here.
In embodiments of the present invention, the control device may evaluate its control effect and adjust the parameters of the control system by the method shown in steps 106 to 108 after each time it controls the smart machine to execute an operation. Alternatively, it may execute steps 106 to 108 only after controlling the smart machine to execute several operations, or only after receiving an adjustment instruction, which may be triggered by a user.
Step 109: update the data stored in the knowledge base according to the adjusted parameters of each model.
In embodiments of the present invention, after each model in the control system has completed the adjustment of its own parameters, the data stored in the knowledge base can be updated according to the adjusted parameters. For example, the perception model, planning model, and control model may each send their adjusted parameters to the knowledge base 04; the learning submodel 041 in the knowledge base 04 can then update the perception data stored in the knowledge base submodel 042 according to the adjusted parameters of the perception model 01, update the planning data stored in the knowledge base submodel 042 according to the adjusted parameters of the planning model 02, and update the control data stored in the knowledge base submodel 042 according to the adjusted parameters of the control model 03.
The implementation of the learning submodel 041 is directly related to the algorithms of the perception model 01, the planning model 02, and the control model 03. The learning submodel may be part of the knowledge base 04, or part of the perception model 01, the planning model 02, and the control model 03; that is, a learning submodel 041 may be provided in each of those models.
Optionally, the learning submodel may also include a neural network model for learning and extracting data. The knowledge base 04 can input the adjusted parameters sent by each model to this neural network model and update the data corresponding to each model stored in the knowledge base submodel 042 based on the network's output.
For example, taking the feature extraction submodel 012 in the perception model 01, with reference to Fig. 12: after the feature extraction submodel 012 adjusts the types of representative detection data it extracts according to the control effect, it can send the adjusted types to the learning submodel 041 in the knowledge base 04. Based on the received types, the learning submodel 041 updates the types of representative detection data, corresponding to the target task and the current scene, that are stored in the knowledge base submodel 042. This enables online updating and adjustment of the perception data and ensures the reliability of the representative detection data extracted based on it.
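The feedback path from the feature extraction submodel into the knowledge base could look like the following sketch; the storage layout keyed by (task, scene) is an assumption made for illustration.

```python
class LearningSubmodel:
    """Learning submodel 041: writes adjusted model parameters back into
    the knowledge base submodel 042."""

    def __init__(self, kb_store):
        self.kb_store = kb_store   # knowledge base submodel 042

    def update_perception_data(self, task, scene, feature_types):
        # Overwrite the representative-detection-data types stored for
        # this (task, scene) pair with the adjusted types.
        self.kb_store.setdefault("perception_data", {})[(task, scene)] = feature_types

kb_store = {}
learner = LearningSubmodel(kb_store)
# Feature extraction submodel 012 reports its adjusted output types.
learner.update_perception_data("car_following", "highway",
                               ["lane_curvature", "front_obstacle_speed"])
print(kb_store["perception_data"])
```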
It should be noted that the order of the steps of the control method for a smart machine provided by the embodiments of the present invention can be adjusted appropriately, and steps can be added or removed as appropriate; for example, steps 106 to 109 may be omitted depending on the situation. Any variation readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application and is not described further here.
In summary, embodiments of the present invention provide a control method for a smart machine. The obtained detection data and the target task can be input to the perception model to obtain the representative detection data associated with the target task; the target task and the representative detection data can then be input to the planning model to obtain the target state data; the target state data and the representative detection data can then be input to the control model to obtain the control parameters for controlling the smart machine; and the smart machine can finally be controlled based on those control parameters. Because the control model is initialized with control theory data, which directly expresses and reflects the control laws and principles of the smart machine, this approach, compared with training directly on training samples as in the related art, not only reduces the control model's dependence on training samples and improves training efficiency, but also ensures the control effect on the smart machine.
Further, the method provided by the embodiments of the present invention can also evaluate the control effect of the control device and adjust the parameters of each model in the control device according to that control effect, thereby continuously improving the device's performance and the control effect on the smart machine during its use.
Embodiments of the present invention also provide a training method for the control system of a smart machine, which can be used to train the perception model, planning model, and control model included in the control system of the above method embodiments. The training method may be applied to a training device. The training device and the control device of the smart machine may be the same device, or both may be configured in the same equipment, for example both configured in the smart machine; alternatively, they may be configured in different equipment, for example the training device in a training server and the control device in the smart machine. After the training device finishes training the models in the control system, it can send the trained models to the control device.
The perception model may be trained by deep learning, for example by a supervised deep learning method; the planning model and the control model may be trained by reinforcement learning. Of course, the perception model may also be trained by reinforcement learning or deep reinforcement learning, and the planning model and the control model may also be trained by deep learning or deep reinforcement learning; the embodiments of the present invention do not limit the type of machine learning method on which each model's training is based.
Optionally, the reinforcement learning methods may include Q-learning or State-Action-Reward-State-Action (SARSA), and the deep reinforcement learning methods may include Deep Q Network (DQN) or Deep Deterministic Policy Gradient (DDPG).
As an optional implementation, with reference to Fig. 13, the training process of the perception model may include:
Step 201a: obtain detection sample data and representative detection sample data associated with a specified task.
The detection sample data may include environment sample data of the smart machine's surroundings while it executes the specified task and state sample data of the smart machine. The representative detection sample data associated with the specified task may be obtained from a sample database.
Step 202a: train on the detection sample data, the specified task, and the representative detection sample data by deep learning to obtain the perception model.
During training by deep learning, the training device can input the detection sample data and the specified task to an initial perception model to obtain the representative detection data associated with the specified task that the initial perception model outputs. The training device then continuously adjusts the parameters of the initial perception model (at least one of its model parameters, input parameters, and output parameters) according to the difference between the representative detection data output by the initial perception model and the representative detection sample data, thereby obtaining the perception model.
Optionally, in embodiments of the present invention, the training device may train the initial perception model with detection sample data and representative detection sample data of different tasks, obtaining the perception model parameters corresponding to the different tasks and storing them as perception data in the knowledge base 04. Alternatively, the training device may use the detection sample data and representative detection sample data of the different tasks to train the initial perception model into perception submodels corresponding to the different tasks.
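A minimal supervised-training sketch for this step is shown below, using a plain gradient step on a linear feature extractor; the model form, loss, and toy data are assumptions standing in for whatever deep network the embodiment actually uses.

```python
import numpy as np

def train_perception(detection_samples, representative_samples,
                     lr=0.1, epochs=200):
    """Fit W so that W @ detection_sample approximates the representative
    detection sample data for the specified task (supervised sketch)."""
    X = np.asarray(detection_samples)        # (n, d_in)  detection sample data
    Y = np.asarray(representative_samples)   # (n, d_out) representative samples
    W = np.zeros((Y.shape[1], X.shape[1]))
    for _ in range(epochs):
        pred = X @ W.T                       # model output for each sample
        grad = (pred - Y).T @ X / len(X)     # gradient of the mean squared error
        W -= lr * grad                       # adjust the model parameters
    return W

# Toy data: 4-dimensional detections mapped to 2-dimensional representative data.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))
Y = X[:, :2] * 2.0                           # assumed ground-truth mapping
print(np.round(train_perception(X, Y), 2))
```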
As an optional implementation, with reference to Fig. 14, the training process of the planning model may include:
Step 201b: obtain representative detection sample data and effect value sample data associated with a specified task.
The representative detection sample data and the effect value sample data associated with the specified task may both be obtained from a sample database. Taking an autonomous vehicle as an example, the effect value sample data may be determined according to the difference, under the same scene, between the target state data of the vehicle during manual driving and the target state data output by the initial planning model; the same scene means that the task being executed and the obtained representative detection data are the same.
Step 202b: train the initial planning model by reinforcement learning, using the representative detection sample data, the specified task, and the effect value sample data, to obtain the planning model.
Further, the training device can train the initial planning model by reinforcement learning. During training, the training device inputs the representative detection sample data and the specified task to the initial planning model and adjusts the parameters of the initial planning model based on the effect value (Q-value) sample data, obtaining the planning model. The reinforcement learning method may be Q-learning, SARSA, or the like.
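A minimal tabular Q-learning update of the kind such training could rely on is sketched below; the state/action encoding and hyperparameters are assumptions, since the disclosure only names the method.

```python
from collections import defaultdict

Q = defaultdict(float)          # Q(state, action) table of the initial planning model
alpha, gamma = 0.1, 0.9         # assumed learning rate and discount factor

def q_learning_update(state, action, reward, next_state, actions):
    """One Q-learning step: the reward plays the role of the effect value
    sample data used to adjust the planning model's parameters."""
    best_next = max(Q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])

# Example: a planning state built from representative detection data.
actions = ["keep_lane", "change_left", "change_right"]
q_learning_update(state=("straight_lane", "slow_leader"),
                  action="change_left", reward=0.8,
                  next_state=("left_lane", "clear"), actions=actions)
print(Q[(("straight_lane", "slow_leader"), "change_left")])
```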
Optionally, in embodiments of the present invention, the training device may train the initial planning model with the representative detection sample data and effect value sample data of different tasks, obtaining the planning model parameters corresponding to the different tasks and storing them as planning data in the knowledge base 04. Alternatively, the training device may use the representative detection sample data and effect value sample data of the different tasks to train the initial planning model into planning submodels corresponding to the different tasks.
As an optional implementation, with reference to Fig. 15, the training process of the control model may include:
Step 201c: initialize an initial control model with the control theory data.
The training device can configure the initial control model with initial values according to the control theory data, thereby initializing it. For example, if the control model is trained by Q-learning, the training device may initialize the Q-table of the initial control model according to the control theory data. The control theory data may include dynamics theory (such as the laws of mechanics) and common physical knowledge (such as the friction coefficient of a road surface). Because the control theory data directly expresses the control laws and principles of the smart machine, there is no need to collect the large amounts of training sample data machine learning would otherwise require; this can effectively reduce the training burden of machine learning, improve training speed (for example, by a factor of about 100), lower training cost, and yield better training results.
Step 202c: obtain the representative detection sample data (part or all), the target state sample data, and the effect value sample data associated with a specified task.
The representative detection sample data, target state sample data, and effect value sample data associated with the specified task may be obtained from a sample database. Taking an autonomous vehicle as an example, the effect value sample data may be determined according to the difference, under the same scene, between the control parameters of the vehicle during manual driving and the control parameters output by the initial control model; the same scene may mean that the obtained target state data and representative detection data are all the same.
Step 203c: train the initial control model by reinforcement learning, using the obtained representative detection sample data, the target state sample data, and the effect value sample data, to obtain the control model.
The reinforcement learning method may be Q-learning, SARSA, or the like. In the reinforcement learning training process, the obtained representative detection sample data and target state sample data can be input to the initial control model, and the parameters of the initial control model can be continuously adjusted according to the effect value sample data, so that its performance keeps improving until the control model is finally obtained.
Initializing the control model with control theory data makes the training method provided by the embodiments of the present invention more efficient, less costly, and less dependent on training samples than training directly on training sample data. For example, if the control theory data includes the formula α = f2(k) for computing the steering-wheel angle α from the lane-line curvature k, and the formula α = f3(d) for computing it from the distance d between the vehicle and the lane center line, then after the initial control model is initialized with this control theory data the training device no longer needs to learn, from large amounts of training sample data, the relationship between the lane-line curvature k, the distance d, and the steering-wheel angle α, which effectively improves training efficiency and reduces the number of samples required.
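One way such prior-knowledge initialization could look in a tabular Q-learning setting is sketched below; the discretization, the concrete forms of f2 and f3, and the preference bonus are assumptions used only to illustrate seeding a Q-table from control theory data.

```python
import math
import numpy as np

# Assumed control theory formulas: steering angle from curvature and from offset.
f2 = lambda k: math.atan(50.0 * k)                    # alpha = f2(k), illustrative form
f3 = lambda d: math.asin(max(-1.0, min(1.0, d / 60.0)))  # alpha = f3(d)

curvatures = np.linspace(-0.01, 0.01, 5)              # discretized states
offsets = np.linspace(-1.0, 1.0, 5)
steering_actions = np.linspace(-0.2, 0.2, 9)          # discretized steering angles (rad)

# Seed Q(s, a): actions close to the formula-suggested angle get a higher prior.
Q = np.zeros((len(curvatures), len(offsets), len(steering_actions)))
for i, k in enumerate(curvatures):
    for j, d in enumerate(offsets):
        alpha_prior = 0.5 * (f2(k) + f3(d))
        Q[i, j] = -np.abs(steering_actions - alpha_prior)   # prior preference

# Reinforcement learning then refines Q from the effect value sample data,
# instead of learning the curvature/offset-to-steering relation from scratch.
print(steering_actions[Q[2, 3].argmax()])
```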
Optionally, in embodiments of the present invention, training device can detect sample data using the representative of different task, Dbjective state sample data and Effect value sample data are trained initial Controlling model, to obtain and different task pair The parameter for the Controlling model answered, and can store the parameter of the corresponding Controlling model of the different task as control data to knowing Know in library 04.Alternatively, training device can using the representative detection sample data of different task, dbjective state sample data with And Effect value sample data is trained initial Controlling model, to obtain control submodel corresponding with different task.
As can be seen from steps 1041c to 1044c above, the control model may include a control submodel and one or more computation submodels. Therefore, when the control model is trained, as another optional implementation, in step 201c the training device may initialize the one or more computation submodels based on the control theory data, that is, determine the calculation formula of each computation submodel. Correspondingly, in step 203c, the initial control submodel may be trained in a reinforcement-learning manner using the obtained representative detection sample data, the target state sample data, and the effect value sample data, to obtain the control submodel for calculating weights.
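A minimal sketch of this structure is given below, reusing the lane-keeping example from the earlier paragraph: the computation submodels f2(k) and f3(d) are fixed by the control theory data (their concrete forms here are assumed), while the control submodel, replaced here by a placeholder softmax over hand-picked confidence features, supplies the weights that combine them into the final control parameter.

```python
import numpy as np

def f2(k):
    """Computation submodel fixed by control theory data (assumed form):
    steering angle from lane-line curvature k."""
    return np.arctan(2.7 * k)

def f3(d):
    """Computation submodel fixed by control theory data (assumed form):
    steering angle from the distance d to the lane center line."""
    return 0.4 * d

def control_submodel(target_state, detection):
    """Learned part: outputs the weights for combining the submodels. A placeholder
    softmax over two hand-picked features stands in for the trained model."""
    logits = np.array([detection["curvature_confidence"],
                       detection["center_distance_confidence"]])
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def steering_angle(target_state, detection):
    w1, w2 = control_submodel(target_state, detection)
    return w1 * f2(detection["curvature"]) + w2 * f3(detection["center_distance"])

print(steering_angle({}, {"curvature": 0.01, "center_distance": 0.2,
                          "curvature_confidence": 1.0,
                          "center_distance_confidence": 0.5}))
```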
Optionally, in the embodiments of the present invention, after the training device completes the training of each model, the data corresponding to each model stored in the knowledge base may also be updated according to the parameters of each model.
In the embodiments of the present invention, when the models in the control system are trained, the training device may acquire training sample data for a large number of different tasks (for example, detection sample data, representative detection sample data, and target state sample data), train each model using the above methods for each piece of training sample data, and update the data stored in the knowledge base, so as to continuously improve the data stored in the knowledge base and the operating effect of each model in the control system.
Optionally, in the embodiments of the present invention, the data stored in the knowledge base may further include constraint parameters that limit the adjustment range of the parameters of each model in the control system. Correspondingly, when each model is trained, the parameters of the model need to be adjusted within the range defined by the constraint parameters.
In this way, the data stored in the knowledge base can constrain the training of each model in the control system. This constraint-based training method can reduce the training burden of machine learning, and can ensure that the output of the model meets real-world requirements, guaranteeing safety and reliability when the smart machine is controlled. For example, it can ensure that the steering-wheel angle output by the control system remains within a certain range, guaranteeing the safety and smoothness of autonomous driving.
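The sketch below illustrates one way such constraint parameters could be enforced during training: after every parameter update, each value is clipped back into the range read from the knowledge base. The parameter names, ranges, and the gradient-step update are assumptions used only for illustration.

```python
import numpy as np

# Hypothetical constraint parameters read from the knowledge base: each model
# parameter must stay within [low, high] during training.
constraints = {"steering_gain": (0.0, 1.0), "steering_bias": (-0.1, 0.1)}

def constrained_update(params, gradients, lr=0.01):
    """Apply a gradient step, then clip every parameter back into the range defined
    by its constraint parameters so the model's output stays physically plausible."""
    updated = {}
    for name, value in params.items():
        new_value = value - lr * gradients.get(name, 0.0)
        low, high = constraints[name]
        updated[name] = float(np.clip(new_value, low, high))
    return updated

params = {"steering_gain": 0.98, "steering_bias": 0.05}
grads = {"steering_gain": -5.0, "steering_bias": 2.0}
print(constrained_update(params, grads))  # steering_gain is clipped back to 1.0
```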
In conclusion, the embodiments of the present invention provide training methods for the models in a control system of a smart machine. These methods can initialize the control model using control theory data; the initialized control model requires fewer samples during training, has higher training efficiency, and has lower training cost.
Figure 16 is a schematic structural diagram of a control device of a smart machine provided by an embodiment of the present invention. The control device may be configured in the smart machine, or may be configured in a control apparatus that has established a communication connection with the smart machine. The control device may be used to implement the control method of the smart machine provided by the foregoing method embodiments. As shown in Figure 16, the device may include:
A first obtaining module 301, which may be used to implement the method shown in step 101 of the foregoing method embodiments.
A first processing module 302, which may be used to implement the method shown in step 102 of the foregoing method embodiments.
A second processing module 303, which may be used to implement the method shown in step 103 of the foregoing method embodiments.
A third processing module 304, which may be used to implement the method shown in step 104 of the foregoing method embodiments.
A control module 305, which may be used to implement the method shown in step 105 of the foregoing method embodiments.
Wherein, the control model is obtained through initialization based on control theory data.
Figure 17 is a schematic structural diagram of another control device of a smart machine provided by an embodiment of the present invention. As shown in Figure 17, the device may further include:
A second obtaining module 306, which may be used to implement the method shown in step 201a of the foregoing method embodiments.
A first training module 307, which may be used to implement the method shown in step 202a of the foregoing method embodiments.
Optionally, as shown in Figure 17, the device may further include:
A third obtaining module 308, which may be used to implement the method shown in step 201b of the foregoing method embodiments.
A second training module 309, which may be used to implement the method shown in step 202b of the foregoing method embodiments.
Optionally, as shown in Figure 17, the device may further include:
An initialization module 310, which may be used to implement the method shown in step 201c of the foregoing method embodiments.
A fourth obtaining module 311, which may be used to implement the method shown in step 202c of the foregoing method embodiments.
A third training module 312, which may be used to implement the method shown in step 203c of the foregoing method embodiments.
Figure 18 is a schematic structural diagram of yet another control device of a smart machine provided by an embodiment of the present invention. Referring to Figure 18, the device may further include:
A fifth obtaining module 313, which may be used to implement the method shown in step 106 of the foregoing method embodiments.
A determining module 314, which may be used to implement the method shown in step 107 of the foregoing method embodiments.
An adjustment module 315, which may be used to implement the method shown in step 108 of the foregoing method embodiments.
Optionally, the control model may include a control submodel for calculating weights and one or more computation submodels for calculating the control parameters. The third processing module 304 may be used to implement the methods shown in steps 1041c to 1044c of the foregoing method embodiments.
Optionally, the new status data and the goal task may be input into an evaluation model to obtain the control effect of the control parameters.
Optionally, the smart machine is an autonomous vehicle or an intelligent robot.
In conclusion, the embodiments of the present invention provide a control device of a smart machine. The device can input the acquired detection data and the goal task into the sensor model to obtain representative detection data associated with the goal task; then input the goal task and the representative detection data into the plan model to obtain target state data; then input the target state data and the representative detection data into the control model to obtain control parameters for controlling the smart machine; and finally control the smart machine based on the control parameters. Because the control model is obtained through initialization based on control theory data, which can directly express and reflect the control laws and principles of the smart machine, compared with directly training on training samples in the related art, this not only reduces the dependence of the control model on training samples and improves training efficiency, but also ensures the control effect of the smart machine.
An embodiment of the present invention further provides a control device of a smart machine. As shown in Figure 19, the device may include a processor 1201 (for example, a CPU), a memory 1202, a network interface 1203, and a bus 1204. The bus 1204 connects the processor 1201, the memory 1202, and the network interface 1203. The memory 1202 may include a random access memory (RAM), and may also include a non-volatile memory, for example, at least one disk memory. A communication connection between the server and a communication device is implemented through the network interface 1203 (which may be wired or wireless). The memory 1202 stores a computer program 12021, and the computer program 12021 is used to implement various application functions. The processor 1201 may be configured to execute the computer program 12021 stored in the memory 1202 to implement the control method of the smart machine provided by the foregoing method embodiments.
An embodiment of the present invention further provides a computer-readable storage medium storing instructions. When the instructions are run on a computer, the computer is caused to execute the control method of the smart machine provided by the foregoing method embodiments.
An embodiment of the present invention further provides a computer program product comprising instructions. When the computer program product is run on a computer, the computer is caused to execute the control method of the smart machine provided by the foregoing method embodiments.
An embodiment of the present invention further provides a smart machine. The smart machine may include the control device shown in any one of Figure 16 to Figure 19.
All or some of the foregoing embodiments may be implemented by software, hardware, firmware, or any combination thereof. When software is used for implementation, the embodiments may be implemented completely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present invention are completely or partially generated. The computer may be a general-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device such as a server or a data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium, or a semiconductor medium (for example, a solid-state drive).
The foregoing descriptions are merely optional embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the protection scope of this application.

Claims (22)

1. A control method of a smart machine, wherein the method comprises:
after an execution instruction for a goal task is received, obtaining detection data, wherein the detection data comprises environmental data of the surroundings of the smart machine and status data of the smart machine;
inputting the detection data and the goal task into a sensor model to obtain representative detection data associated with the goal task;
inputting the goal task and the representative detection data into a plan model to obtain target state data, wherein the target state data indicates a state that the smart machine needs to reach;
inputting the target state data and some or all of the representative detection data into a control model to obtain control parameters for controlling the smart machine; and
controlling, based on the control parameters, the smart machine to execute the goal task;
wherein the control model is obtained through initialization based on control theory data.
2. The method according to claim 1, wherein
the sensor model is obtained through training in a deep-learning manner.
3. The method according to claim 1, wherein
the plan model is obtained through training in a reinforcement-learning manner.
4. The method according to claim 1, wherein
the control model is obtained through training in a reinforcement-learning manner.
5. The method according to claim 2, wherein before the execution instruction for the goal task is received, the method further comprises:
obtaining detection sample data and representative detection sample data associated with a specified task, wherein the detection sample data comprises environmental sample data of the surroundings of the smart machine when the specified task is executed and state sample data of the smart machine; and
training an initial sensor model in a deep-learning manner using the detection sample data, the specified task, and the representative detection sample data, to obtain the sensor model.
6. The method according to claim 3, wherein before the execution instruction for the goal task is received, the method further comprises:
obtaining representative detection sample data and effect value sample data associated with a specified task; and
training an initial plan model in a reinforcement-learning manner using the representative detection sample data, the specified task, and the effect value sample data, to obtain the plan model.
7. The method according to claim 4, wherein before the execution instruction for the goal task is received, the method further comprises:
initializing an initial control model based on the control theory data;
obtaining representative detection sample data, target state sample data, and effect value sample data, some or all of which are associated with a specified task; and
training the initial control model in a reinforcement-learning manner using the obtained representative detection sample data, the target state sample data, and the effect value sample data, to obtain the control model.
8. The method according to claim 4, wherein the control model comprises a control submodel for calculating weights and one or more computation submodels for calculating the control parameters; and before the execution instruction for the goal task is received, the method further comprises:
obtaining representative detection sample data, target state sample data, and effect value sample data, some or all of which are associated with a specified task;
training an initial control submodel in a reinforcement-learning manner using the obtained representative detection sample data, the target state sample data, and the effect value sample data, to obtain the control submodel; and
determining each computation submodel based on the control theory data.
9. The method according to claim 8, wherein the control model comprises a control submodel for calculating weights and one or more computation submodels for calculating the control parameters; and
the inputting the target state data and some or all of the representative detection data into the control model to obtain the control parameters for controlling the smart machine comprises:
obtaining, from the target state data and some or all of the representative detection data, a group of target input data corresponding to each computation submodel;
inputting each group of target input data into the corresponding computation submodel to obtain a control parameter value corresponding to each group of target input data;
inputting the target state data and some or all of the representative detection data into the control submodel to obtain a group of weights; and
determining a target value of the control parameters according to the group of weights and the control parameter value corresponding to each group of target input data.
10. The method according to any one of claims 1 to 9, wherein the method further comprises:
after the smart machine is controlled based on the control parameters, obtaining new status data of the smart machine;
determining a control effect according to the new status data and the goal task; and
adjusting, according to the control effect, parameters of one or more of the sensor model, the plan model, and the control model.
11. The method according to any one of claims 1 to 9, wherein the smart machine is an autonomous vehicle or an intelligent robot.
12. A control device of a smart machine, wherein the device comprises:
a first obtaining module, configured to obtain detection data after an execution instruction for a goal task is received, wherein the detection data comprises environmental data of the surroundings of the smart machine and status data of the smart machine;
a first processing module, configured to input the detection data and the goal task into a sensor model to obtain representative detection data associated with the goal task;
a second processing module, configured to input the goal task and the representative detection data into a plan model to obtain target state data, wherein the target state data indicates a state that the smart machine needs to reach;
a third processing module, configured to input the target state data and some or all of the representative detection data into a control model to obtain control parameters for controlling the smart machine; and
a control module, configured to control, based on the control parameters, the smart machine to execute the goal task;
wherein the control model is obtained through initialization based on control theory data.
13. The device according to claim 12, wherein the device further comprises:
a second obtaining module, configured to obtain, before the execution instruction for the goal task is received, detection sample data and representative detection sample data associated with a specified task, wherein the detection sample data comprises environmental sample data of the surroundings of the smart machine when the specified task is executed and state sample data of the smart machine; and
a first training module, configured to train an initial sensor model in a deep-learning manner using the detection sample data, the specified task, and the representative detection sample data, to obtain the sensor model.
14. The device according to claim 12, wherein the device further comprises:
a third obtaining module, configured to obtain, before the execution instruction for the goal task is received, representative detection sample data and effect value sample data associated with a specified task; and
a second training module, configured to train an initial plan model in a reinforcement-learning manner using the representative detection sample data, the specified task, and the effect value sample data, to obtain the plan model.
15. The device according to claim 12, wherein the device further comprises:
an initialization module, configured to initialize an initial control model based on the control theory data before the execution instruction for the goal task is received;
a fourth obtaining module, configured to obtain representative detection sample data, target state sample data, and effect value sample data, some or all of which are associated with a specified task; and
a third training module, configured to train the initial control model in a reinforcement-learning manner using the obtained representative detection sample data, the target state sample data, and the effect value sample data, to obtain the control model.
16. The device according to claim 12, wherein the device further comprises:
a fourth obtaining module, configured to obtain representative detection sample data, target state sample data, and effect value sample data, some or all of which are associated with a specified task;
a third training module, configured to train an initial control submodel in a reinforcement-learning manner using the obtained representative detection sample data, the target state sample data, and the effect value sample data, to obtain the control submodel; and
an initialization module, configured to determine each computation submodel based on the control theory data.
17. The device according to claim 16, wherein the control model comprises a control submodel for calculating weights and one or more computation submodels for calculating the control parameters; and
the third processing module is configured to:
obtain, from the target state data and some or all of the representative detection data, a group of target input data corresponding to each computation submodel;
input each group of target input data into the corresponding computation submodel to obtain a control parameter value corresponding to each group of target input data;
input the target state data and some or all of the representative detection data into the control submodel to obtain a group of weights; and
determine a target value of the control parameters according to the group of weights and the control parameter value corresponding to each group of target input data.
18. The device according to any one of claims 12 to 17, wherein the device further comprises:
a fifth obtaining module, configured to obtain new status data of the smart machine after the smart machine is controlled based on the control parameters;
a determining module, configured to determine a control effect according to the new status data and the goal task; and
an adjustment module, configured to adjust, according to the control effect, parameters of one or more of the sensor model, the plan model, and the control model.
19. The device according to any one of claims 12 to 17, wherein the adjustment module is configured to:
input the new status data and the goal task into an evaluation model to obtain a control effect of the control parameters.
20. A control device of a smart machine, wherein the device comprises a memory, a processor, and a computer program that is stored in the memory and executable on the processor, wherein when the processor executes the computer program, the control method of the smart machine according to any one of claims 1 to 11 is implemented.
21. A computer-readable storage medium storing instructions, wherein when the instructions are run on a computer, the computer is caused to execute the control method of the smart machine according to any one of claims 1 to 11.
22. A smart machine, wherein the smart machine comprises the device according to any one of claims 12 to 20.
CN201810850160.3A 2018-07-28 2018-07-28 Intelligent device and control method and device thereof Active CN109109863B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810850160.3A CN109109863B (en) 2018-07-28 2018-07-28 Intelligent device and control method and device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810850160.3A CN109109863B (en) 2018-07-28 2018-07-28 Intelligent device and control method and device thereof

Publications (2)

Publication Number Publication Date
CN109109863A true CN109109863A (en) 2019-01-01
CN109109863B CN109109863B (en) 2020-06-16

Family

ID=64863520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810850160.3A Active CN109109863B (en) 2018-07-28 2018-07-28 Intelligent device and control method and device thereof

Country Status (1)

Country Link
CN (1) CN109109863B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105059288A (en) * 2015-08-11 2015-11-18 奇瑞汽车股份有限公司 Lane keeping control system and method
CN105109483A (en) * 2015-08-24 2015-12-02 奇瑞汽车股份有限公司 Driving method and driving system
US20180074493A1 (en) * 2016-09-13 2018-03-15 Toyota Motor Engineering & Manufacturing North America, Inc. Method and device for producing vehicle operational data based on deep learning techniques
CN107270923A (en) * 2017-06-16 2017-10-20 广东欧珀移动通信有限公司 Method, terminal and storage medium that a kind of route is pushed
CN107390682A (en) * 2017-07-04 2017-11-24 安徽省现代农业装备产业技术研究院有限公司 A kind of agri-vehicle automatic Pilot path follower method and system
CN107907886A (en) * 2017-11-07 2018-04-13 广东欧珀移动通信有限公司 Travel conditions recognition methods, device, storage medium and terminal device
CN108297864A (en) * 2018-01-25 2018-07-20 广州大学 The control method and control system of driver and the linkage of vehicle active safety technologies

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109976726A (en) * 2019-03-20 2019-07-05 深圳市赛梅斯凯科技有限公司 Vehicle-mounted Edge intelligence computing architecture, method, system and storage medium
CN110187727A (en) * 2019-06-17 2019-08-30 武汉理工大学 A kind of Glass Furnace Temperature control method based on deep learning and intensified learning
US20220212683A1 (en) * 2019-07-30 2022-07-07 Mazda Motor Corporation Vehicle control system
WO2021036543A1 (en) * 2019-08-29 2021-03-04 南京智慧光信息科技研究院有限公司 Automatic operation method employing big data and artificial intelligence, and robot system
CN111123952A (en) * 2019-12-31 2020-05-08 华为技术有限公司 Trajectory planning method and device
CN111123952B (en) * 2019-12-31 2021-12-31 华为技术有限公司 Trajectory planning method and device
CN111694973A (en) * 2020-06-09 2020-09-22 北京百度网讯科技有限公司 Model training method and device for automatic driving scene and electronic equipment
CN111694973B (en) * 2020-06-09 2023-10-13 阿波罗智能技术(北京)有限公司 Model training method and device for automatic driving scene and electronic equipment
CN113954858A (en) * 2020-07-20 2022-01-21 华为技术有限公司 Method for planning vehicle driving route and intelligent automobile
WO2022016901A1 (en) * 2020-07-20 2022-01-27 华为技术有限公司 Method for planning driving route of vehicle, and intelligent vehicle
CN113077641A (en) * 2021-03-24 2021-07-06 中南大学 Decision mapping method and device for bus on-the-way control and storage medium
WO2022264929A1 (en) * 2021-06-14 2022-12-22 株式会社明電舎 Control device and control method
JP2022190200A (en) * 2021-06-14 2022-12-26 株式会社明電舎 Controller and control method
JP7248053B2 (en) 2021-06-14 2023-03-29 株式会社明電舎 Control device and control method
CN113468307A (en) * 2021-06-30 2021-10-01 网易(杭州)网络有限公司 Text processing method and device, electronic equipment and storage medium
CN113468307B (en) * 2021-06-30 2023-06-30 网易(杭州)网络有限公司 Text processing method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN109109863B (en) 2020-06-16

Similar Documents

Publication Publication Date Title
CN109109863A (en) Smart machine and its control method, device
Rhinehart et al. Precog: Prediction conditioned on goals in visual multi-agent settings
US11537134B1 (en) Generating environmental input encoding for training neural networks
EP3405845B1 (en) Object-focused active three-dimensional reconstruction
US10500721B2 (en) Machine learning device, laminated core manufacturing apparatus, laminated core manufacturing system, and machine learning method for learning operation for stacking core sheets
WO2021178909A1 (en) Learning point cloud augmentation policies
US11472444B2 (en) Method and system for dynamically updating an environmental representation of an autonomous agent
US20230219585A1 (en) Tools for performance testing and/or training autonomous vehicle planners
CN112734808B (en) Trajectory prediction method for vulnerable road users in vehicle driving environment
CN112203916A (en) Method and device for determining lane change related information of target vehicle, method and device for determining vehicle comfort measure for predicting driving maneuver of target vehicle, and computer program
US20230230484A1 (en) Methods for spatio-temporal scene-graph embedding for autonomous vehicle applications
CN114655227A (en) Driving style recognition method, driving assistance method and device
EP4086817A1 (en) Training distilled machine learning models using a pre-trained feature extractor
Mirus et al. An investigation of vehicle behavior prediction using a vector power representation to encode spatial positions of multiple objects and neural networks
Wheeler et al. A probabilistic framework for microscopic traffic propagation
CN116448134B (en) Vehicle path planning method and device based on risk field and uncertain analysis
Ding et al. Capture uncertainties in deep neural networks for safe operation of autonomous driving vehicles
CN114594776B (en) Navigation obstacle avoidance method based on layering and modular learning
Liu et al. Reliability of Deep Neural Networks for an End-to-End Imitation Learning-Based Lane Keeping
Ponda et al. Decentralized information-rich planning and hybrid sensor fusion for uncertainty reduction in human-robot missions
Toubeh et al. Risk-aware planning by confidence estimation using deep learning-based perception
Ge et al. Deep reinforcement learning navigation via decision transformer in autonomous driving
CN116324904A (en) Method and system for annotating sensor data
Nozarian et al. Uncertainty quantification and calibration of imitation learning policy in autonomous driving
Saleem et al. Obstacle-avoidance algorithm using deep learning based on rgbd images and robot orientation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant