CN112613617A - Uncertainty estimation method and device based on regression model - Google Patents

Uncertainty estimation method and device based on regression model

Publication number
CN112613617A
Authority
CN
China
Prior art keywords
regression model
loss function
training
features
probability distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011612532.2A
Other languages
Chinese (zh)
Inventor
周杰
鲁继文
李万华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202011612532.2A
Publication of CN112613617A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/15 Correlation function computation including computation of convolution operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Pure & Applied Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Complex Calculations (AREA)

Abstract

The application provides an uncertainty estimation method and device based on a regression model, and relates to the technical field of machine learning. The method comprises the following steps: acquiring an input sample that has a label value, and extracting probability distribution features of the input sample; sampling T features from the probability distribution features, wherein T is a positive integer; acquiring T loss functions corresponding to the T features, and processing the T loss functions to obtain a training loss function; and inputting the input sample into the regression model to obtain a predicted value, adjusting parameters of the regression model according to the label value and the predicted value through the training loss function to obtain a trained regression model, and inputting data to be processed into the trained regression model to obtain a regression result and a target value. In this way, an uncertainty, namely the target value, can be given for each piece of test data; at the same time, modeling the uncertainty effectively improves the accuracy of the regression result, yielding a regression model with better performance.

Description

Uncertainty estimation method and device based on regression model
Technical Field
The application relates to the technical field of machine learning, in particular to an uncertainty estimation method and device based on a regression model.
Background
In general, a regression problem requires predicting a corresponding target value y for a given datum x. Regression is a fundamental machine learning problem.
Most existing methods solve this problem with deep neural networks: typically, a deep neural network extracts features from the data x, and a regressor then maps the extracted features to specific values. Common regression strategies can be broadly divided into three categories: direct regression-based methods, classification-based methods, and order-based methods. Direct regression-based methods predict target values directly using a regressor, which is trained with the L1 or L2 loss function. Classification-based methods convert the regression problem into a classification problem: the target space is first divided into several sub-categories, and the regressor then learns the resulting classification task. Order-based methods implement a regressor using several binary classifiers, each responsible for one binary classification problem. Most of these methods are implemented with neural networks, which tend to give over-confident predictions.
In practical scenarios, the model is often required to give not only a predicted value but also the confidence of that predicted value. For example, when predicting the distance to a target ahead in autonomous driving, the regressor may give a prediction, but it is also necessary to know to what extent this prediction can be relied upon. Directly adopting all predictions given by the model without considering their confidence is very dangerous; in practice, low-confidence predictions should not be adopted. Therefore, the uncertainty of the model needs to be learned together with the model itself, so that for each datum the model gives both a prediction result and the uncertainty of that prediction result.
Disclosure of Invention
The present application is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, a first objective of the present application is to provide a regression model-based uncertainty estimation method, which can provide uncertainty, i.e. a target value, of each test data, and effectively improve accuracy of a regression result in modeling uncertainty, so as to obtain a regression model with better performance.
A second objective of the present application is to provide an uncertainty estimation device based on a regression model.
In order to achieve the above object, an embodiment of a first aspect of the present application provides a regression model-based uncertainty estimation method, including:
acquiring an input sample, and extracting probability distribution features of the input sample; wherein the input sample has a label value;
sampling T features from the probability distribution features; wherein T is a positive integer;
acquiring T loss functions corresponding to the T characteristics, and processing the T loss functions to acquire a training loss function;
and inputting the input sample into a regression model for processing to obtain a predicted value, adjusting parameters of the regression model according to the label value and the predicted value through the training loss function to obtain a trained regression model, so that the data to be processed is input into the trained regression model to obtain a regression result and a target value.
According to the regression model-based uncertainty estimation method of the embodiments of the present application, an input sample having a label value is acquired, and probability distribution features of the input sample are extracted; T features are sampled from the probability distribution features, wherein T is a positive integer; T loss functions corresponding to the T features are acquired and processed to obtain a training loss function; and the input sample is input into the regression model to obtain a predicted value, parameters of the regression model are adjusted according to the label value and the predicted value through the training loss function to obtain a trained regression model, and data to be processed is input into the trained regression model to obtain a regression result and a target value. In this way, an uncertainty, namely the target value, can be given for each piece of test data; at the same time, modeling the uncertainty effectively improves the accuracy of the regression result, yielding a regression model with better performance.
In an embodiment of the present application, extracting the probability distribution features of the input sample includes:
processing the input sample through two neural networks respectively to obtain the mean and the variance of a high-dimensional Gaussian distribution as the probability distribution features.
In one embodiment of the present application, the formula for sampling T features from the probability distribution features is:
$$z^{(t)} = \mu + \operatorname{diag}(\Sigma)^{1/2} \odot \epsilon^{(t)}, \quad \epsilon^{(t)} \sim \mathcal{N}(0, I), \quad t = 1, \dots, T$$
wherein
$$\mu = f_{\theta_1}(x), \quad \Sigma = f_{\theta_2}(x)$$
x is the input sample, θ1 and θ2 are the parameters of the two neural networks, diag(·) represents taking the diagonal elements of its argument, and t indexes the T sampled features.
In an embodiment of the present application, processing the T loss functions to obtain a training loss function includes:
summing and averaging the T loss functions to obtain an average loss function; and
obtaining an ordered distribution constraint function, and calculating the sum of the average loss function and the ordered distribution constraint function as the training loss function.
In one embodiment of the present application, the training loss function is formulated as:
$$L = \sum_{x \in D} \bar{L}(x) + \alpha L_{Ord}$$
wherein
$$\bar{L}(x) = \frac{1}{T} \sum_{t=1}^{T} L^{(t)}(x)$$
represents the average loss function, D is the training data set, α is a hyperparameter, and $L_{Ord}$ represents the ordered distribution constraint function.
To achieve the above object, a second aspect of the present application provides a regression model-based uncertainty estimation apparatus, including:
the first acquisition module is used for acquiring an input sample; wherein the input sample has a label value;
the extraction module is used for extracting the probability distribution characteristics of the input samples;
a sampling module for sampling T features from the probability distribution features; wherein T is a positive integer;
a second obtaining module, configured to obtain T loss functions corresponding to the T features;
the processing module is used for processing the T loss functions to obtain training loss functions;
and the training estimation module is used for inputting the input sample into a regression model for processing to obtain a predicted value, adjusting the parameters of the regression model according to the label value and the predicted value through the training loss function to obtain a trained regression model, so that the data to be processed is input into the trained regression model to obtain a regression result and a target value.
According to the regression model-based uncertainty estimation device of the embodiments of the present application, an input sample having a label value is acquired, and probability distribution features of the input sample are extracted; T features are sampled from the probability distribution features, wherein T is a positive integer; T loss functions corresponding to the T features are acquired and processed to obtain a training loss function; and the input sample is input into the regression model to obtain a predicted value, parameters of the regression model are adjusted according to the label value and the predicted value through the training loss function to obtain a trained regression model, and data to be processed is input into the trained regression model to obtain a regression result and a target value. In this way, an uncertainty, namely the target value, can be given for each piece of test data; at the same time, modeling the uncertainty effectively improves the accuracy of the regression result, yielding a regression model with better performance.
In an embodiment of the application, the extraction module is specifically configured to:
process the input sample through two neural networks respectively to obtain the mean and the variance of a high-dimensional Gaussian distribution as the probability distribution features.
In one embodiment of the present application, the formula for sampling T features from the probability distribution features is:
$$z^{(t)} = \mu + \operatorname{diag}(\Sigma)^{1/2} \odot \epsilon^{(t)}, \quad \epsilon^{(t)} \sim \mathcal{N}(0, I), \quad t = 1, \dots, T$$
wherein
$$\mu = f_{\theta_1}(x), \quad \Sigma = f_{\theta_2}(x)$$
x is the input sample, θ1 and θ2 are the parameters of the two neural networks, diag(·) represents taking the diagonal elements of its argument, and t indexes the T sampled features.
In an embodiment of the application, the processing module is specifically configured to:
sum and average the T loss functions to obtain an average loss function; and
obtain an ordered distribution constraint function, and calculate the sum of the average loss function and the ordered distribution constraint function as the training loss function.
In one embodiment of the present application, the training loss function is formulated as:
$$L = \sum_{x \in D} \bar{L}(x) + \alpha L_{Ord}$$
wherein
$$\bar{L}(x) = \frac{1}{T} \sum_{t=1}^{T} L^{(t)}(x)$$
represents the average loss function, D is the training data set, α is a hyperparameter, and $L_{Ord}$ represents the ordered distribution constraint function.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flowchart of a regression model-based uncertainty estimation method according to an embodiment of the present disclosure;
FIG. 2 is an exemplary diagram of regression model training in an embodiment of the present application;
FIG. 3 is an exemplary diagram of unordered probabilistic features and ordered probabilistic features according to an embodiment of the application;
FIG. 4 is a schematic structural diagram of an uncertainty estimation apparatus based on a regression model according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
The regression model-based uncertainty estimation method and apparatus of the embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of an uncertainty estimation method based on a regression model according to an embodiment of the present disclosure.
As shown in fig. 1, the regression model-based uncertainty estimation method includes the following steps:
Step 101: acquiring an input sample, and extracting probability distribution features of the input sample, wherein the input sample has a label value.
Step 102: sampling T features from the probability distribution features, wherein T is a positive integer.
Step 103: acquiring T loss functions corresponding to the T features, and processing the T loss functions to obtain a training loss function.
Step 104: inputting the input sample into the regression model to obtain a predicted value, adjusting parameters of the regression model according to the label value and the predicted value through the training loss function to obtain a trained regression model, and inputting data to be processed into the trained regression model to obtain a regression result and a target value.
In the embodiment of the present application, extracting the probability distribution features of the input sample includes: processing the input sample through two neural networks respectively to obtain the mean and the variance of a high-dimensional Gaussian distribution as the probability distribution features.
In an embodiment of the present application, the formula for sampling T features from the probability distribution features is:
$$z^{(t)} = \mu + \operatorname{diag}(\Sigma)^{1/2} \odot \epsilon^{(t)}, \quad \epsilon^{(t)} \sim \mathcal{N}(0, I), \quad t = 1, \dots, T \tag{1}$$
wherein
$$\mu = f_{\theta_1}(x), \quad \Sigma = f_{\theta_2}(x)$$
x is the input sample, θ1 and θ2 are the parameters of the two neural networks, diag(·) represents taking the diagonal elements of its argument, and t indexes the T sampled features.
In an embodiment of the present application, processing the T loss functions to obtain a training loss function includes: summing and averaging the T loss functions to obtain an average loss function; and obtaining an ordered distribution constraint function, and calculating the sum of the average loss function and the ordered distribution constraint function as the training loss function.
In an embodiment of the present application, the training loss function is formulated as:
$$L = \sum_{x \in D} \bar{L}(x) + \alpha L_{Ord}$$
wherein
$$\bar{L}(x) = \frac{1}{T} \sum_{t=1}^{T} L^{(t)}(x)$$
represents the average loss function, D is the training data set, α is a hyperparameter, and $L_{Ord}$ represents the ordered distribution constraint function.
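The combination described above (average the T per-feature losses for each training sample, then add the weighted ordered distribution constraint) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `training_loss` and its argument layout are hypothetical, and aggregating over the training set D by a plain sum (rather than a mean) is an assumption, since the text does not state the normalization.

```python
def training_loss(per_sample_losses, ordinal_constraint, alpha):
    """Combine per-feature losses with the ordered distribution constraint.

    per_sample_losses[i][t] is the loss of the t-th sampled feature of the
    i-th training sample; ordinal_constraint is the already-computed value
    of the ordered distribution constraint; alpha is the hyperparameter.
    """
    # Average over the T sampled features of each sample, then sum over D
    # (summing over D is an assumption of this sketch).
    data_term = sum(sum(losses) / len(losses) for losses in per_sample_losses)
    return data_term + alpha * ordinal_constraint


# Two training samples with T = 2 sampled features each.
total = training_loss([[1.0, 3.0], [2.0, 2.0]], ordinal_constraint=0.5, alpha=2.0)
```

With these toy numbers the two per-sample averages are both 2.0, and the constraint contributes 2.0 × 0.5.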
Specifically, the present application addresses uncertainty estimation for regression problems. The method can be applied to various learning-based regression methods: it improves the performance of the regression method and, at the same time, gives the uncertainty of the model for any test sample. In addition, the method includes an ordered distribution constraint, which aims to preserve in the feature space the ordering that exists in the target space, so that more effective probabilistic features are learned and the performance of the model is improved.
Specifically, the representation of each sample in the feature space is considered to be a high-dimensional Gaussian distribution, z ~ p(z | x). An input sample x is first fed into a neural network to extract features, and probabilistic ordered features are learned; that is, for each sample x, its feature is represented as a probability distribution, and a high-dimensional Gaussian distribution is used for this representation. Two neural networks are therefore used to predict the mean and the variance of the high-dimensional Gaussian distribution, respectively:
$$\mu = f_{\theta_1}(x), \qquad \Sigma = f_{\theta_2}(x)$$
where $f_{\theta_1}$ and $f_{\theta_2}$ represent the two neural networks, with parameters θ1 and θ2 respectively. Thus, for each sample x, the parameters of its probabilistic representation in the feature space are obtained using neural networks. T features are then sampled from this probability distribution. To allow subsequent gradient back-propagation, the reparameterization trick may be used for sampling, as in equation (1).
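As a rough illustration of the sampling step, the following Python sketch draws T features as the mean plus the per-dimension standard deviation times standard normal noise. The helper name `sample_features` and the example values of `mu` and `var` are hypothetical stand-ins for the outputs of the two networks; the sketch uses only the standard library, so it shows the arithmetic of the sampling, not the gradient flow that an automatic differentiation framework would provide.

```python
import math
import random


def sample_features(mu, var, T, rng=None):
    """Draw T samples z = mu + sqrt(var) * eps, with eps ~ N(0, I).

    mu and var are lists holding the mean and the diagonal variance of the
    Gaussian feature distribution for one input sample.
    """
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch repeatable
    std = [math.sqrt(v) for v in var]
    samples = []
    for _ in range(T):
        eps = [rng.gauss(0.0, 1.0) for _ in mu]  # noise independent of mu and var
        samples.append([m + s * e for m, s, e in zip(mu, std, eps)])
    return samples


mu = [0.5, -1.0, 2.0]    # hypothetical output of the mean network
var = [0.04, 0.25, 1.0]  # hypothetical output of the variance network
zs = sample_features(mu, var, T=5)
```

Because the randomness enters only through `eps`, each sample is a deterministic function of the mean and variance, which is what lets gradients reach both networks when the same expression is written in a differentiable framework.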
After the sampled features are obtained, the loss functions of the T features are obtained by applying the sampled features to the different regression methods.
Specifically, the direct regression method predicts a value directly using a regressor, i.e., a regression model, and is trained with the L1 or L2 loss function. Correspondingly, with probabilistic ordered features, this loss function is applied to each of the T sampled features. Taking the L2 loss function as an example:
Figure BDA0002875176610000058
where y represents the prediction result of the regressor and w represents the learnable parameters of the regressor.
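A minimal sketch of applying the L2 loss to each of the T sampled features and averaging the results is given below. The linear regressor (weights `w` and a `bias` term) and the function name `l2_loss_over_samples` are assumptions made for illustration; the text only states that the regressor has learnable parameters w.

```python
def l2_loss_over_samples(samples, w, bias, label):
    """Average squared error of a linear regressor over T sampled features.

    samples is a list of T feature vectors, w and bias are the (assumed
    linear) regressor parameters, and label is the label value of the sample.
    """
    total = 0.0
    for z in samples:
        pred = sum(wi * zi for wi, zi in zip(w, z)) + bias  # prediction for this feature
        total += (pred - label) ** 2
    return total / len(samples)


# Two sampled features; the predictions are 2.0 and 4.0 against a label of 3.0.
loss = l2_loss_over_samples([[1.0, 0.0], [0.0, 1.0]], w=[2.0, 4.0], bias=0.0, label=3.0)
```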
Specifically, the classification-based method discretizes the original regression space into C classes and is usually trained with a cross-entropy loss function. Similarly, with probabilistic ordered features, this loss function is applied to each of the T sampled features, giving the following loss function for training:
Figure BDA0002875176610000061
where C represents the number of all possible classes, c represents the true class label corresponding to sample x, and r is used to enumerate all possible classes.
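The classification-based loss can be sketched as a softmax cross-entropy evaluated on each sampled feature and then averaged over the T features. The linear classifier with one weight row per class (`W`) and the function names are hypothetical choices for this sketch, not details taken from the application.

```python
import math


def softmax_cross_entropy(logits, true_class):
    """Cross-entropy of a softmax over logits, computed stably via log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[true_class]


def classification_loss(samples, W, true_class):
    """Average cross-entropy over T sampled features.

    W holds one weight row per class (an assumed linear classifier), and
    true_class is the 0-indexed class label c of the sample.
    """
    total = 0.0
    for z in samples:
        logits = [sum(wi * zi for wi, zi in zip(row, z)) for row in W]
        total += softmax_cross_entropy(logits, true_class)
    return total / len(samples)


# With all-zero weights every class is equally likely, so the loss is log(C).
W = [[0.0, 0.0] for _ in range(4)]  # C = 4 classes, 2-dimensional features
loss = classification_loss([[1.0, 2.0], [0.5, -0.5]], W, true_class=1)
```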
Specifically, the order-based regression method first discretizes the original regression space into C categories and then uses C-1 binary classifiers, each responsible for predicting whether the label of a sample is greater than a certain rank. During training, C-1 cross-entropy loss functions are used. Similarly, with probabilistic ordered features, these loss functions are applied to each of the T sampled features, giving the following loss function for training:
Figure BDA0002875176610000062
where C represents the number of all possible categories, so that C-1 binary classifiers are used in total; k is an index ranging from 1 to C-1; b_k represents the prediction result of the k-th binary classifier; and r_k represents the corresponding label value of sample x on the k-th binary classifier.
The training of the regression model is shown in FIG. 2.
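The order-based scheme can be illustrated as follows: a class label is expanded into C-1 binary targets, and a binary cross-entropy is summed over the C-1 classifiers and averaged over the T sampled features. The 0-indexed label convention (r_k = 1 when the label exceeds rank k) and all function names are assumptions of this sketch.

```python
import math


def ordinal_targets(label, num_classes):
    """Binary targets r_k = 1 if label > k, for k = 0 .. C-2 (0-indexed ranks)."""
    return [1 if label > k else 0 for k in range(num_classes - 1)]


def binary_cross_entropy(p, target, eps=1e-12):
    """Cross-entropy of one binary classifier output p against a 0/1 target."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def ordinal_loss(sample_probs, label, num_classes):
    """Average over T sampled features of the summed C-1 binary losses.

    sample_probs[t][k] is the output b_k of the k-th binary classifier for
    the t-th sampled feature.
    """
    targets = ordinal_targets(label, num_classes)
    total = 0.0
    for probs in sample_probs:
        total += sum(binary_cross_entropy(p, r) for p, r in zip(probs, targets))
    return total / len(sample_probs)


targets = ordinal_targets(2, 5)            # label 2 out of C = 5 classes
loss = ordinal_loss([[0.5] * 4], 2, 5)     # one sampled feature, uninformative outputs
```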
During training, in addition to the loss functions described above, an ordered distribution constraint is further provided. This constraint reflects the fact that the target space in a regression problem is always ordered, so the learned probabilistic ordered features should preserve the ordering of the target space. In particular, for a triplet $(x_l, x_m, x_n)$ with corresponding labels $(y_l, y_m, y_n)$ and learned probabilistic ordered features $(z_l, z_m, z_n)$, the following constraint is learned:
$$d(z_l, z_m) < d(z_l, z_n) \quad \text{whenever} \quad |y_l - y_m| < |y_l - y_n|$$
where d(·, ·) represents the distance between the distributions. Let $S = \{(l, m, n) \mid |y_l - y_m| < |y_l - y_n|\}$; the ordered distribution constraint is then defined as:
Figure BDA0002875176610000064
To measure the distance between the distributions, two different metric functions can be used, and they yield similar performance: the first is the symmetric KL divergence, and the second is the Wasserstein distance.
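For Gaussian distributions with diagonal covariance, both distances have simple closed forms, sketched below in Python. Two points are assumptions of this sketch rather than statements from the application: the symmetric KL divergence is taken as the sum of the two directed divergences (some definitions halve it), and the triplet constraint is turned into a penalty with a hinge and an optional `margin`, which is one common way to enforce such an inequality. All function names are hypothetical.

```python
import math


def symmetric_kl(mu_p, var_p, mu_q, var_q):
    """KL(p||q) + KL(q||p) for two diagonal Gaussians (the log terms cancel)."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        d2 = (mp - mq) ** 2
        total += 0.5 * ((vp + d2) / vq + (vq + d2) / vp - 2.0)
    return total


def wasserstein2_squared(mu_p, var_p, mu_q, var_q):
    """Squared 2-Wasserstein distance between two diagonal Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += (mp - mq) ** 2 + (math.sqrt(vp) - math.sqrt(vq)) ** 2
    return total


def triplet_hinge(dist, anchor, near, far, margin=0.0):
    """Penalty that is zero when d(anchor, near) + margin <= d(anchor, far).

    anchor, near and far are (mu, var) pairs; near is the sample whose label
    is closer to the anchor's label. The hinge form is an assumption.
    """
    return max(0.0, dist(*anchor, *near) + margin - dist(*anchor, *far))


a = ([0.0], [1.0])
b = ([1.0], [1.0])   # label assumed closer to a's label
c = ([3.0], [1.0])   # label assumed farther from a's label
ok = triplet_hinge(wasserstein2_squared, a, b, c)        # ordering respected
violated = triplet_hinge(wasserstein2_squared, a, c, b)  # ordering violated
```

Both distances are zero for identical distributions and grow with the gap between means and between variances, so either can serve as d in the triplet comparison.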
With the proposed ordered distribution constraint, the unordered probabilistic features in the feature space eventually become ordered probabilistic features, as shown in FIG. 3.
Finally, during training, the ordered distribution constraint is applied together with the loss function described above. For the direct regression method, the final loss function is therefore:
Figure BDA0002875176610000065
where D represents the training data set and α is a hyperparameter. For the classification-based approach, the final loss function is:
Figure BDA0002875176610000071
Figure BDA0002875176610000072
For the order-based method, the final loss function is:
Figure BDA0002875176610000073
It should be noted that other regression methods can be given similar forms, so that training with probabilistic ordered features yields a regression model with higher performance. Meanwhile, the uncertainty of the data is modeled by the variance term of the probabilistic ordered features: for each sample, the harmonic mean of the predicted variance term diag(Σ(x)) is calculated to represent the uncertainty of that sample.
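The uncertainty score described here reduces to a harmonic mean over the per-dimension variances. A minimal sketch follows; the function name `harmonic_mean` and the example variances are hypothetical, and the variances are assumed to be strictly positive.

```python
def harmonic_mean(values):
    """Harmonic mean of strictly positive values: n / sum(1 / v)."""
    return len(values) / sum(1.0 / v for v in values)


variances = [1.0, 0.5, 0.25]            # hypothetical diag(Sigma(x)) for one test sample
uncertainty = harmonic_mean(variances)  # larger value means a less reliable prediction
```

Because the harmonic mean is dominated by the smallest entries, a sample receives a high uncertainty score only when the variance is large across most feature dimensions.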
Therefore, one probability distribution can be used in the feature space to model uncertainty, which can be applied to various learning-based regression methods and ultimately improve the performance of the method. Meanwhile, the uncertainty of each test sample can be calculated according to the variance item in the probability ordered features, so that an uncertainty index is provided in the deployment of a real scene.
In order to implement the above embodiments, the present application further provides an uncertainty estimation device based on a regression model.
Fig. 4 is a schematic structural diagram of an uncertainty estimation apparatus based on a regression model according to an embodiment of the present disclosure.
As shown in fig. 4, the regression model-based uncertainty estimation apparatus includes: a first acquisition module 410, an extraction module 420, a sampling module 430, a second acquisition module 440, a processing module 450, and a training estimation module 460.
The first acquisition module 410 is configured to acquire an input sample, wherein the input sample has a label value.
The extraction module 420 is configured to extract probability distribution features of the input sample.
The sampling module 430 is configured to sample T features from the probability distribution features, wherein T is a positive integer.
The second acquisition module 440 is configured to acquire T loss functions corresponding to the T features.
The processing module 450 is configured to process the T loss functions to obtain a training loss function.
The training estimation module 460 is configured to input the input sample into a regression model to obtain a predicted value, and to adjust parameters of the regression model according to the label value and the predicted value through the training loss function to obtain a trained regression model, so that data to be processed is input into the trained regression model to obtain a regression result and a target value.
In the embodiment of the present application, the extraction module 420 is specifically configured to:
process the input sample through two neural networks respectively to obtain the mean and the variance of a high-dimensional Gaussian distribution as the probability distribution features.
In one embodiment of the present application, the formula for sampling T features from the probability distribution features is:
$$z^{(t)} = \mu + \operatorname{diag}(\Sigma)^{1/2} \odot \epsilon^{(t)}, \quad \epsilon^{(t)} \sim \mathcal{N}(0, I), \quad t = 1, \dots, T$$
wherein
$$\mu = f_{\theta_1}(x), \quad \Sigma = f_{\theta_2}(x)$$
x is the input sample, θ1 and θ2 are the parameters of the two neural networks, diag(·) represents taking the diagonal elements of its argument, and t indexes the T sampled features.
In one embodiment of the present application, the processing module 450 is specifically configured to:
sum and average the T loss functions to obtain an average loss function; and
obtain an ordered distribution constraint function, and calculate the sum of the average loss function and the ordered distribution constraint function as the training loss function.
In one embodiment of the present application, the training loss function is formulated as:
$$L = \sum_{x \in D} \bar{L}(x) + \alpha L_{Ord}$$
wherein
$$\bar{L}(x) = \frac{1}{T} \sum_{t=1}^{T} L^{(t)}(x)$$
represents the average loss function, D is the training data set, α is a hyperparameter, and $L_{Ord}$ represents the ordered distribution constraint function.
It should be noted that the explanation of the embodiment of the uncertainty estimation method based on the regression model is also applicable to the uncertainty estimation device based on the regression model of the embodiment, and is not repeated here.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (10)

1. An uncertainty estimation method based on a regression model is characterized by comprising the following steps:
acquiring an input sample, and extracting probability distribution features of the input sample, wherein the input sample has a label value;
sampling T features from the probability distribution features, wherein T is a positive integer;
acquiring T loss functions corresponding to the T features, and processing the T loss functions to obtain a training loss function;
and inputting the input sample into a regression model for processing to obtain a predicted value, and adjusting parameters of the regression model according to the label value and the predicted value through the training loss function to obtain a trained regression model, so that data to be processed can be input into the trained regression model to obtain a regression result and a target value.
2. The method of claim 1, wherein said extracting probability distribution features of the input samples comprises:
processing the input sample through two neural networks respectively to obtain the mean and the variance of a high-dimensional Gaussian distribution as the probability distribution features.
3. The method of claim 2, wherein the formula for sampling T features from the probability distribution features is:
Figure FDA0002875176600000011
wherein,
Figure FDA0002875176600000012
x is the input sample, θ1 and θ2 are the parameters of the two neural networks, diag(·) denotes taking the diagonal elements of its argument, and t is the sampling index (t = 1, ..., T).
4. The method of claim 1, wherein said processing the T loss functions to obtain a training loss function comprises:
summing and averaging the T loss functions to obtain an average loss function;
and obtaining an ordered distribution constraint function, and calculating the sum of the average loss function and the ordered distribution constraint function as the training loss function.
5. The method of claim 1, wherein the training loss function is formulated as:
Figure FDA0002875176600000013
wherein,
Figure FDA0002875176600000014
denotes the average loss function, D is the training data set, α is a hyperparameter, and L_Ord denotes the ordered distribution constraint function.
6. A regression model-based uncertainty estimation apparatus, comprising:
a first acquisition module, configured to acquire an input sample, wherein the input sample has a label value;
an extraction module, configured to extract probability distribution features of the input sample;
a sampling module, configured to sample T features from the probability distribution features, wherein T is a positive integer;
a second acquisition module, configured to acquire T loss functions corresponding to the T features;
a processing module, configured to process the T loss functions to obtain a training loss function;
and a training estimation module, configured to input the input sample into a regression model for processing to obtain a predicted value, and to adjust parameters of the regression model according to the label value and the predicted value through the training loss function to obtain a trained regression model, so that data to be processed can be input into the trained regression model to obtain a regression result and a target value.
7. The apparatus of claim 6, wherein the extraction module is specifically configured to:
process the input sample through two neural networks respectively to obtain the mean and the variance of a high-dimensional Gaussian distribution as the probability distribution features.
8. The apparatus of claim 7, wherein the formula for sampling T features from the probability distribution features is:
Figure FDA0002875176600000021
wherein,
Figure FDA0002875176600000022
x is the input sample, θ1 and θ2 are the parameters of the two neural networks, diag(·) denotes taking the diagonal elements of its argument, and t is the sampling index (t = 1, ..., T).
9. The apparatus of claim 6, wherein the processing module is specifically configured to:
sum and average the T loss functions to obtain an average loss function;
and obtain an ordered distribution constraint function, and calculate the sum of the average loss function and the ordered distribution constraint function as the training loss function.
10. The apparatus of claim 6, wherein the training loss function is formulated as:
Figure FDA0002875176600000023
wherein,
Figure FDA0002875176600000024
denotes the average loss function, D is the training data set, α is a hyperparameter, and L_Ord denotes the ordered distribution constraint function.
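The feature extraction recited in claims 2 and 7, in which two neural networks produce the mean and the variance of a Gaussian distribution, might look roughly like the sketch below. Everything concrete here is hypothetical: the one-hidden-layer networks, the layer sizes, the helper names (`mlp`, `init`, `extract_distribution`), and the softplus used to keep the variance positive are illustrative choices, not details taken from the application.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """One-hidden-layer network with tanh activation (architecture is an assumption)."""
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

def softplus(a):
    """Numerically stable log(1 + exp(a)); always positive."""
    return np.logaddexp(0.0, a)

def extract_distribution(x, theta1, theta2):
    """Return (mean, variance): theta1 parameterizes the mean network, theta2 the variance network."""
    mu = mlp(x, *theta1)
    var = softplus(mlp(x, *theta2))  # softplus keeps each variance entry positive (assumption)
    return mu, var

def init(rng, d_in, d_hidden, d_out):
    """Small random weights and zero biases for one hypothetical network."""
    return (rng.standard_normal((d_in, d_hidden)) * 0.1, np.zeros(d_hidden),
            rng.standard_normal((d_hidden, d_out)) * 0.1, np.zeros(d_out))

rng = np.random.default_rng(0)
theta1 = init(rng, 4, 8, 3)  # parameters of the mean network
theta2 = init(rng, 4, 8, 3)  # parameters of the variance network
x = rng.standard_normal(4)   # stand-in for an input sample
mu, var = extract_distribution(x, theta1, theta2)
```

Predicting a per-dimension variance vector corresponds to a diagonal covariance, which is one reading of the diag(·) operator named in claims 3 and 8.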
CN202011612532.2A 2020-12-30 2020-12-30 Uncertainty estimation method and device based on regression model Pending CN112613617A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011612532.2A CN112613617A (en) 2020-12-30 2020-12-30 Uncertainty estimation method and device based on regression model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011612532.2A CN112613617A (en) 2020-12-30 2020-12-30 Uncertainty estimation method and device based on regression model

Publications (1)

Publication Number Publication Date
CN112613617A true CN112613617A (en) 2021-04-06

Family

ID=75249755

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011612532.2A Pending CN112613617A (en) 2020-12-30 2020-12-30 Uncertainty estimation method and device based on regression model

Country Status (1)

Country Link
CN (1) CN112613617A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113127806A (en) * 2021-04-19 2021-07-16 上海工程技术大学 Regression analysis model selection method based on machine learning
CN113506328A (en) * 2021-07-16 2021-10-15 北京地平线信息技术有限公司 Method and device for generating sight line estimation model and method and device for estimating sight line
CN115565611A (en) * 2022-09-28 2023-01-03 广州译码基因科技有限公司 Biological regression prediction method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112613617A (en) Uncertainty estimation method and device based on regression model
US6397200B1 (en) Data reduction system for improving classifier performance
EP3796228A1 (en) Device and method for generating a counterfactual data sample for a neural network
CN109271958B (en) Face age identification method and device
CN111325260B (en) Data processing method and device, electronic equipment and computer readable medium
CN116610092A (en) Method and system for vehicle analysis
WO2021079442A1 (en) Estimation program, estimation method, information processing device, relearning program, and relearning method
CN113406623A (en) Target identification method, device and medium based on radar high-resolution range profile
CN113762005B (en) Feature selection model training and object classification methods, devices, equipment and media
CN112115996B (en) Image data processing method, device, equipment and storage medium
CN113095351A (en) Method for generating marked data by means of an improvement of the initial marking
CN109743200B (en) Resource feature-based cloud computing platform computing task cost prediction method and system
JP6398991B2 (en) Model estimation apparatus, method and program
CN111814813A (en) Neural network training and image classification method and device
CN113723431B (en) Image recognition method, apparatus and computer readable storage medium
Alfaz et al. A deep convolutional neural network based approach to classify and detect crack in concrete surface using xception
US20210319269A1 (en) Apparatus for determining a classifier for identifying objects in an image, an apparatus for identifying objects in an image and corresponding methods
CN115565115A (en) Outfitting intelligent identification method and computer equipment
KR20190134380A (en) A Method of Association Learning for Domain Invariant Human Classifier with Convolutional Neural Networks and the method thereof
JP7306460B2 (en) Adversarial instance detection system, method and program
CN112446428A (en) Image data processing method and device
CN112508080A (en) Vehicle model identification method, device, equipment and medium based on experience playback
CN113469176A (en) Target detection model training method, target detection method and related equipment thereof
CN111160419A (en) Electronic transformer data classification prediction method and device based on deep learning
EP3940601A1 (en) Information processing apparatus, information processing method, and information program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210406