CN115828101A - Feature screening method and device, storage medium and computer equipment - Google Patents

Publication number
CN115828101A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211573770.6A
Other languages
Chinese (zh)
Inventor
杨逸飞
陈飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Weride Technology Co Ltd
Original Assignee
Guangzhou Weride Technology Co Ltd

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The feature screening method, device, storage medium and computer equipment first obtain a feature data set of a model to be trained. The feature data set comprises a plurality of labeled data, which include the items of vehicle feature data recorded when a plurality of unmanned vehicles had accidents at different moments within a preset historical time period, together with result statistical data on the number of accidents at those moments. The vehicle feature data and the result statistical data are sent to a front-end page for display, where a user can combine items of vehicle feature data in pairs, and/or combine items of vehicle feature data with the result statistical data in pairs, to form combined data. This strengthens the interaction between the user and the feature screening process and yields a screening result that meets the user's requirements.

Description

Feature screening method and device, storage medium and computer equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for feature screening, a storage medium, and a computer device.
Background
An unmanned vehicle generates a large amount of data during daily driving and testing, and developers can use this data for analysis, for example training a model on the vehicle feature data recorded when unmanned vehicles had accidents within the past three months. The vehicle feature data include, but are not limited to, general information such as the date, timestamp and hour, as well as driving-related information such as the number of abnormal conditions, the number of intersections passed, the number of turns, the mileage and the number of obstacles.
At present, when a model is trained on vehicle feature data, the full set of vehicle feature data may contain redundant and low-value data. Before training, the feature dimension of the data set is therefore usually reduced by selecting high-quality features and deleting irrelevant and redundant ones, which improves classification efficiency and accuracy. Existing feature screening approaches are mainly the filter method, based on weight scoring, and the wrapper method, based on the evaluation function of the final trained model. Both approaches screen purely from the data dimension, interact only weakly with the user, and cannot screen features according to the user's requirements.
Disclosure of Invention
The present application aims to solve at least one of the above technical defects, in particular the defect that prior-art feature screening methods interact only weakly with the user and cannot screen features according to the user's requirements.
The application provides a feature screening method, which comprises the following steps:
acquiring a characteristic data set of a model to be trained, wherein the characteristic data set comprises a plurality of marking data, and the marking data comprise various vehicle characteristic data recorded when a plurality of unmanned vehicles have accidents at different moments in a preset historical time period and result statistical data of accident occurrence times at different moments;
sending each item of vehicle characteristic data and the result statistical data to a front-end page for displaying, and receiving a confirmation instruction returned by a user, wherein the confirmation instruction comprises a plurality of combined data obtained by combining each item of vehicle characteristic data in pairs and/or combining each item of vehicle characteristic data and the result statistical data in pairs;
fitting each combined data according to a preset fitting strategy, evaluating each fitting result to obtain a plurality of evaluation results, sending each fitting result and the corresponding evaluation result to the front-end page for displaying, and receiving a first selection result of each fitting result by a user;
and screening the vehicle characteristic data in the characteristic data set based on the first selection result, and training the model to be trained according to the screened characteristic data set.
Optionally, the sending the various vehicle feature data and the result statistical data to a front-end page for displaying, and receiving a confirmation instruction returned by the user includes:
sending to a front-end page, for display, a plurality of combined data obtained by combining the items of vehicle characteristic data in pairs and/or combining each item of vehicle characteristic data with the result statistical data in pairs, and receiving a confirmation instruction returned after the user confirms the plurality of combined data, wherein the number of combined data contained in the confirmation instruction is not more than the number of combined data displayed on the front-end page;
or sending each item of vehicle characteristic data and the result statistical data to a front-end page for displaying, and receiving a confirmation instruction returned by a user after the user combines each item of vehicle characteristic data and/or combines each item of vehicle characteristic data and the result statistical data.
Optionally, the fitting is performed on each combined data according to a preset fitting strategy, so as to obtain a plurality of fitting results, including:
judging whether a confirmation instruction returned by a user contains a custom fitting mode;
if yes, fitting each combined data according to the custom fitting mode to obtain a plurality of fitting results;
and if not, fitting each combined data according to a default fitting mode to obtain a plurality of fitting results.
Optionally, the fitting is performed on each combined data according to a default fitting manner, so as to obtain a plurality of fitting results, including:
sending a default fitting mode to the front-end page for displaying, wherein the default fitting mode at least comprises linear fitting and nonlinear fitting;
receiving a second selection result of the user on the default fitting mode, and fitting each combined data based on a fitting formula of the linear fitting to obtain a plurality of fitting results when the second selection result is the linear fitting;
and when the second selection result is the nonlinear fitting, acquiring nonlinear fitting parameters determined by the user in the second selection result, and fitting each combined data according to the nonlinear fitting parameters to obtain a plurality of fitting results.
Optionally, the evaluating each fitting result to obtain a plurality of evaluation results includes:
for each fitting result:
evaluating the fitting result by using at least one evaluation index, and obtaining the evaluation result of the fitting result under each evaluation index.
Optionally, the evaluation index at least comprises a correlation coefficient evaluation index and a mean square error evaluation index;
the evaluating the fitting result by using at least one evaluation index and obtaining the evaluation result of the fitting result under each evaluation index comprises:
evaluating the fitting result by using the correlation coefficient evaluation index, and obtaining a correlation coefficient of the fitting result under the correlation coefficient evaluation index;
and/or evaluating the fitting result by using the mean square error evaluation index, and obtaining the mean square error of the fitting result under the mean square error evaluation index.
Optionally, the sending each fitting result and the corresponding evaluation result to the front-end page for displaying includes:
sorting the fitting results according to their corresponding evaluation results, and sending the sorted fitting results and the corresponding evaluation results to the front-end page for displaying.
Optionally, the generating process of the model to be trained includes:
acquiring a code file uploaded by a user or provided through a cloud storage link;
and generating a model to be trained according to the code file.
The application also provides a feature screening device, comprising:
the data acquisition module is used for acquiring a characteristic data set of the model to be trained, wherein the characteristic data set comprises a plurality of marking data, and the marking data comprise various vehicle characteristic data recorded when a plurality of unmanned vehicles have accidents at different moments in a preset historical time period and result statistical data of accident occurrence times at different moments;
the characteristic combination module is used for sending each item of vehicle characteristic data and the result statistical data to a front-end page for displaying and receiving a confirmation instruction returned by a user, wherein the confirmation instruction comprises a plurality of combined data obtained by pairwise combination of each item of vehicle characteristic data and/or pairwise combination of each item of vehicle characteristic data and the result statistical data;
the combined evaluation module is used for respectively fitting each combined data according to a preset fitting strategy, evaluating each fitting result to obtain a plurality of evaluation results, sending each fitting result and the corresponding evaluation result to the front-end page for displaying, and receiving a first selection result of each fitting result by a user;
and the characteristic screening module is used for screening the vehicle characteristic data in the characteristic data set based on the first selection result and training the model to be trained according to the screened characteristic data set.
The present application also provides a storage medium having stored therein computer readable instructions, which, when executed by one or more processors, cause the one or more processors to perform the steps of the feature screening method as described in any one of the above embodiments.
The present application further provides a computer device, comprising: one or more processors, and a memory;
the memory has stored therein computer readable instructions which, when executed by the one or more processors, perform the steps of the feature screening method as in any one of the above embodiments.
According to the technical scheme, the embodiment of the application has the following advantages:
The feature screening method, device, storage medium and computer equipment provided by the application first obtain the feature data set of the model to be trained. The feature data set comprises a plurality of labeled data, which include the items of vehicle feature data recorded when a plurality of unmanned vehicles had accidents at different moments within a preset historical time period, together with result statistical data on the number of accidents at those moments. After obtaining the vehicle feature data and the result statistical data, the application can send them to a front-end page for display. In the front-end page, the user can combine items of vehicle feature data in pairs, and/or combine items of vehicle feature data with the result statistical data in pairs, to form combined data; alternatively, a plurality of combined data can be obtained according to the default combination mode of the application. Once the combined data determined by the user are obtained, each combined data can be fitted according to a preset fitting strategy and each fitting result evaluated to obtain a plurality of evaluation results. The fitting results and their corresponding evaluation results are then sent to the front-end page for display, so that the user can screen the fitting results according to the user's own requirements and the evaluation results. After receiving the user's first selection result for the fitting results, the application can screen the vehicle feature data in the feature data set according to the first selection result and train the model to be trained on the screened feature data set. In this way, interaction with the user is strengthened during feature screening, and a screening result that meets the user's requirements is obtained.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a feature screening method according to an embodiment of the present application;
Fig. 2 is a diagram showing fitting results in a front-end page according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a feature screening apparatus according to an embodiment of the present application;
Fig. 4 is a schematic internal structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
At present, when a model is trained on vehicle feature data, the full set of vehicle feature data may contain redundant and low-value data. Before training, the feature dimension of the data set is therefore usually reduced by selecting high-quality features and deleting irrelevant and redundant ones, which improves classification efficiency and accuracy. Existing feature screening approaches are mainly the filter method, based on weight scoring, and the wrapper method, based on the evaluation function of the final trained model. Both approaches screen purely from the data dimension, interact only weakly with the user, and cannot screen features according to the user's requirements. Based on this, the present application proposes the following technical solutions.
In an embodiment, as shown in Fig. 1, which is a schematic flow chart of a feature screening method provided in an embodiment of the present application, the application provides a feature screening method, which may comprise the following steps:
S110: Acquiring a feature data set of the model to be trained.
In this step, when performing the feature screening, a feature data set of the model to be trained may be obtained, and the feature data in the feature data set is screened.
Specifically, the model to be trained in the present application may be a linear regression model or any other network model that needs to be trained, which is not limited herein. The feature data set refers to the training data used to train the model to be trained. It comprises a plurality of labeled data, which mainly include the items of vehicle feature data recorded when a plurality of unmanned vehicles had accidents at different moments within a preset historical time period, together with result statistical data on the number of accidents at those moments. The vehicle feature data include, but are not limited to, general information such as the date, timestamp and hour, as well as driving-related information such as the number of abnormal conditions, the number of intersections passed, the number of turns, the mileage and the number of obstacles. Each value in the result statistical data, that is, the number of accidents at a given moment, may be 0 or any positive integer.
For example, the labeled data of the present application can be expressed as follows:
{"obs_cnt_front":{"0":10,"1":27,"2":22,"3":13,"4":25,"5":27,"6":13},
"obs_cnt_left":{"0":11,"1":41,"2":22,"3":25,"4":4,"5":25,"6":3,"7":3},
"hour":{"0":903171,"1":903172,"2":903184,"3":903215,"4":903175,"5":903366,"6":903180,"7":903181},
"incident_count":{"0":0,"1":0,"2":1,"3":0,"4":1,"5":0,"6":0,"7":4}}
Each piece of labeled data represents one type of data and includes both a feature label and feature data: "obs_cnt_front", "obs_cnt_left", "hour" and "incident_count" above are feature labels, and the values inside each { } are the feature data recorded at different moments in chronological order. The labeled data can be divided into two types according to the cause and the outcome of an accident. One type is vehicle feature data, which mainly record the driving conditions of the unmanned vehicle when the accident occurred, such as the number of intersections passed, the number of turns, the mileage and the number of obstacles. The other type is result statistical data, obtained mainly by counting the number of accidents of the plurality of unmanned vehicles at different moments within the preset historical time period. Once the labeled data are divided into these two types, it is possible to analyze both whether different items of vehicle feature data are correlated with each other and whether the vehicle feature data are correlated with the result statistical data, which further improves the accuracy of feature screening.
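The split into vehicle feature data and result statistical data described above can be sketched as follows. This is a minimal illustration, not the application's implementation: it assumes the labeled data arrive as JSON text in the column-oriented layout shown above, and that the result statistics can be recognized by their label. The function name `split_labeled_data` and the constant `RESULT_LABELS` are hypothetical.

```python
import json

# First three records of three of the example columns above, kept short for illustration.
raw = '''{
  "obs_cnt_front": {"0": 10, "1": 27, "2": 22},
  "hour": {"0": 903171, "1": 903172, "2": 903184},
  "incident_count": {"0": 0, "1": 0, "2": 1}
}'''

# Assumption: result statistical data are identified by their feature label.
RESULT_LABELS = {"incident_count"}


def split_labeled_data(text):
    """Return (features, results); each maps a label to its values in index order."""
    columns = json.loads(text)
    features, results = {}, {}
    for label, by_index in columns.items():
        # Keys are string indices ("0", "1", ...); sort numerically to keep time order.
        series = [by_index[key] for key in sorted(by_index, key=int)]
        target = results if label in RESULT_LABELS else features
        target[label] = series
    return features, results


features, results = split_labeled_data(raw)
print(sorted(features))
print(results["incident_count"])
```

Keeping the two groups in separate mappings makes it straightforward to build feature-feature pairs and feature-result pairs in the later steps.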
S120: Sending each item of vehicle feature data and the result statistical data to a front-end page for display, and receiving a confirmation instruction returned by the user.
In this step, after the feature data set of the model to be trained is obtained in S110, the application may send each item of vehicle feature data and the result statistical data in the feature data set to a front-end page for display, so that the user can, according to the user's own needs, combine items of vehicle feature data in pairs and/or combine items of vehicle feature data with the result statistical data in pairs. Alternatively, the application can itself combine the items of vehicle feature data in pairs, and/or combine each item of vehicle feature data with the result statistical data in pairs, and display the resulting combined data on the front-end page for the user to screen and confirm. After the user confirms the combined data, the application receives the confirmation instruction returned by the user and performs the subsequent analysis on the combined data contained in that instruction.
S130: Fitting each combined data according to a preset fitting strategy, evaluating each fitting result to obtain a plurality of evaluation results, sending each fitting result and the corresponding evaluation result to the front-end page for display, and receiving the user's first selection result for the fitting results.
In this step, after S120 has sent each item of vehicle feature data and the result statistical data to the front-end page for display and the confirmation instruction returned by the user has been received, the application can fit each combined data contained in the confirmation instruction according to a preset fitting strategy and evaluate each fitting result to obtain a plurality of evaluation results. The combined data are those obtained by combining items of vehicle feature data in pairs and/or combining items of vehicle feature data with the result statistical data in pairs.
In a specific implementation, after the plurality of combined data is obtained, each combined data can be fitted either in a preset default fitting mode or in a fitting mode customized by the user. The fitting mode may be linear or nonlinear and can be chosen according to the actual situation, which is not limited herein. After fitting, the fitting result of each combined data can be evaluated to determine the degree of association between the two groups of feature data in that combined data; the user can then select fitting results according to the evaluation results and thereby pick out valuable feature data.
It should be noted that each fitting result in the present application corresponds to at least one evaluation result, and may specifically be determined according to the number of evaluation indexes during evaluation, for example, when two or three evaluation indexes are provided in the present application, the fitting result may be evaluated by each evaluation index, and then a plurality of evaluation results are obtained, so that a user may select the fitting result comprehensively according to the plurality of evaluation results, and further improve the accuracy of feature screening.
Furthermore, when the fitting results and their corresponding evaluation results are sent to the front-end page for display, the fitting results can be sorted by the evaluation results under a single evaluation index, or sorted comprehensively by the evaluation results under several evaluation indexes. The sort order may be descending or ascending by value and can be chosen according to the actual situation, which is not limited herein.
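Sorting by a single evaluation index could look like the sketch below. The record layout (a `pair` plus an `evaluation` mapping), the index names, the feature labels `turn_cnt` and `mileage`, and all numeric values are invented purely for illustration; the application does not specify them.

```python
def sort_fit_results(fit_results, index, descending=True):
    """Order fitting results by their evaluation result under one evaluation index."""
    return sorted(fit_results, key=lambda r: r["evaluation"][index], reverse=descending)


# Hypothetical fitting results with made-up evaluation values.
fit_results = [
    {"pair": ("turn_cnt", "incident_count"), "evaluation": {"correlation": 0.42, "mse": 1.8}},
    {"pair": ("obs_cnt_front", "incident_count"), "evaluation": {"correlation": 0.77, "mse": 0.9}},
    {"pair": ("turn_cnt", "mileage"), "evaluation": {"correlation": 0.10, "mse": 3.2}},
]

# Larger correlation first; smaller error first.
by_correlation = sort_fit_results(fit_results, "correlation", descending=True)
by_mse = sort_fit_results(fit_results, "mse", descending=False)
print([r["pair"] for r in by_correlation])
```

A comprehensive ordering over several indexes would need a rule for combining them, for example a tuple key or a weighted score; the application leaves that choice open.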
S140: Screening the vehicle feature data in the feature data set based on the first selection result, and training the model to be trained according to the screened feature data set.
In this step, after the first selection result is obtained, the application can screen the vehicle feature data in the feature data set according to the first selection result to obtain a screened feature data set, and then train the model to be trained on the screened feature data set, which improves both training efficiency and training effect.
Specifically, when the vehicle feature data in the feature data set are screened according to the first selection result, the combined data corresponding to each fitting result in the first selection result are determined first. The vehicle feature data in each of these combined data are then extracted and deduplicated to obtain the screened vehicle feature data, which can be used as input data of the model to be trained.
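The extract-and-deduplicate step can be sketched as follows. The sketch assumes that the first selection result has already been mapped back to its combined data, represented here as pairs of feature labels, and that the result statistical data stay in the data set as the training target. Both assumptions, the function names, and the labels `turn_cnt` and `mileage` are illustrative only.

```python
def screen_features(selected_pairs, result_labels):
    """Collect the vehicle feature labels from the selected pairs, without duplicates."""
    kept, seen = [], set()
    for pair in selected_pairs:
        for label in pair:
            # Skip result statistics and labels that were already collected.
            if label in result_labels or label in seen:
                continue
            seen.add(label)
            kept.append(label)
    return kept


def filter_data_set(data_set, kept_features, result_labels):
    """Keep only the screened features plus the result columns (assumed training target)."""
    keep = set(kept_features) | set(result_labels)
    return {label: values for label, values in data_set.items() if label in keep}


selected = [
    ("turn_cnt", "incident_count"),
    ("turn_cnt", "mileage"),
    ("obs_cnt_front", "incident_count"),
]
kept = screen_features(selected, {"incident_count"})

data_set = {
    "turn_cnt": [3, 5], "mileage": [12, 20], "obs_cnt_front": [10, 27],
    "obs_cnt_left": [11, 41], "incident_count": [0, 1],
}
screened = filter_data_set(data_set, kept, {"incident_count"})
print(kept, sorted(screened))
```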
In the above embodiment, a feature data set of the model to be trained is obtained first. The feature data set comprises a plurality of labeled data, which include the items of vehicle feature data recorded when a plurality of unmanned vehicles had accidents at different moments within a preset historical time period, together with result statistical data on the number of accidents at those moments. After obtaining the vehicle feature data and the result statistical data, the application can send them to a front-end page for display. In the front-end page, the user can combine items of vehicle feature data in pairs, and/or combine items of vehicle feature data with the result statistical data in pairs, to form combined data; alternatively, a plurality of combined data can be obtained according to the default combination mode of the application. Once the combined data determined by the user are obtained, each combined data can be fitted according to a preset fitting strategy and each fitting result evaluated to obtain a plurality of evaluation results, after which the fitting results and their corresponding evaluation results are sent to the front-end page for display. After receiving the user's first selection result for the fitting results, the application can screen the vehicle feature data in the feature data set according to the first selection result and train the model to be trained on the screened feature data set, so that interaction with the user is strengthened during feature screening and a screening result that meets the user's requirements is obtained.
In one embodiment, the sending, in S120, each item of vehicle feature data and the result statistical data to a front-end page for displaying, and receiving a confirmation instruction returned by the user may include:
S121: Sending to a front-end page, for display, a plurality of combined data obtained by combining the items of vehicle feature data in pairs and/or combining each item of vehicle feature data with the result statistical data in pairs, and receiving a confirmation instruction returned after the user confirms the plurality of combined data, wherein the number of combined data contained in the confirmation instruction is not greater than the number of combined data displayed on the front-end page.
S122: Alternatively, sending each item of vehicle feature data and the result statistical data to a front-end page for display, and receiving a confirmation instruction returned after the user combines items of vehicle feature data in pairs and/or combines items of vehicle feature data with the result statistical data.
In this embodiment, after the feature data set of the model to be trained is obtained, each item of vehicle feature data and the result statistical data in the feature data set may be sent to a front-end page for display, so that the user can combine items of vehicle feature data in pairs, and/or combine items of vehicle feature data with the result statistical data in pairs, according to the user's own needs. For example, after the front-end page shows vehicle feature data such as the number of intersections passed, the number of turns, the mileage and the number of obstacles, together with the result statistical data on the number of accidents of the plurality of unmanned vehicles at different moments within the preset historical time period, the user can combine the number of turns with the number of accidents, the number of accidents with the number of obstacles, the number of turns with the mileage, and so on, thereby obtaining a plurality of combined data.
Alternatively, the application can itself combine the items of vehicle feature data in pairs, and/or combine each item of vehicle feature data with the result statistical data in pairs, and display the resulting combined data on the front-end page. The user can either accept all of the default combined data or screen out the combined data that meet the user's own requirements; after the user confirms the final combined data, a confirmation instruction is returned.
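The default pairwise combination of S121, and the constraint that the confirmed set is no larger than the displayed set, can be sketched as below. The function names and the labels `turn_cnt` and `mileage` are hypothetical; the application does not prescribe how the pairs are enumerated.

```python
from itertools import combinations


def default_combinations(feature_labels, result_labels):
    """Enumerate feature-feature pairs, then feature-result pairs."""
    feature_pairs = list(combinations(feature_labels, 2))
    feature_result_pairs = [(f, r) for f in feature_labels for r in result_labels]
    return feature_pairs + feature_result_pairs


def confirm(displayed, selected):
    """Keep only selections that were actually displayed, so the confirmed
    set can never be larger than the displayed set."""
    shown = set(displayed)
    return [pair for pair in selected if pair in shown]


displayed = default_combinations(
    ["turn_cnt", "mileage", "obs_cnt_front"], ["incident_count"]
)
confirmed = confirm(displayed, [("turn_cnt", "incident_count"), ("turn_cnt", "mileage")])
print(len(displayed), confirmed)
```

With n feature items and one result item, this yields n(n-1)/2 feature-feature pairs plus n feature-result pairs, which is why letting the user prune the list on the front-end page matters as n grows.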
In one embodiment, the fitting each combined data according to a preset fitting strategy in S130 to obtain a plurality of fitting results may include:
S131: Judging whether the confirmation instruction returned by the user contains a custom fitting mode.
S132: If so, fitting each combined data according to the custom fitting mode to obtain a plurality of fitting results.
S133: If not, fitting each combined data according to a default fitting mode to obtain a plurality of fitting results.
In this embodiment, when fitting each piece of combined data, it may be determined whether a user-defined fitting manner is included in a confirmation instruction returned by the user, if so, each piece of combined data is fitted according to the user-defined fitting manner, and a plurality of fitting results are obtained, and if not, each piece of combined data is fitted according to a default fitting manner, and a plurality of fitting results are obtained.
When fitting the combined data, the application can accept a fitting mode customized by the user and fit accordingly, which meets the user's specific requirements. When the user does not provide a custom fitting mode, the combined data are fitted in a preset default fitting mode, so that the degree of association within each combined data can still be judged from the fitting result.
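The branch in S131 to S133 amounts to a simple dispatch. In the sketch below the confirmation instruction is modeled as a dictionary with hypothetical keys `pairs` and `custom_fit`, and the default mode is assumed to be a least-squares straight line; none of these details are specified by the application.

```python
def default_fit(xs, ys):
    """Assumed default mode: least-squares line, returned as (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_all(confirmation, data):
    """Fit every confirmed pair with the custom mode if one is supplied, else the default."""
    fit = confirmation.get("custom_fit") or default_fit
    return {pair: fit(data[pair[0]], data[pair[1]]) for pair in confirmation["pairs"]}


data = {"a": [1, 2, 3, 4], "b": [3, 5, 7, 9]}

# No custom mode in the instruction: the default line fit is used.
default_results = fit_all({"pairs": [("a", "b")]}, data)

# A user-supplied mode (here a trivial constant model) takes precedence.
custom_results = fit_all(
    {"pairs": [("a", "b")], "custom_fit": lambda xs, ys: ("mean", sum(ys) / len(ys))},
    data,
)
print(default_results, custom_results)
```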
In one embodiment, the fitting each combined data according to a default fitting manner in S133 to obtain a plurality of fitting results may include:
S1331: Sending the default fitting modes to the front-end page for display, wherein the default fitting modes at least comprise linear fitting and nonlinear fitting.
S1332: Receiving a second selection result of the user on the default fitting modes, and, when the second selection result is linear fitting, fitting each combined data based on the fitting formula of the linear fitting to obtain a plurality of fitting results.
S1333: When the second selection result is nonlinear fitting, acquiring the nonlinear fitting parameters determined by the user in the second selection result, and fitting each combined data according to the nonlinear fitting parameters to obtain a plurality of fitting results.
In this embodiment, when each combination data is fitted according to a default fitting manner, in order to enhance interactivity with a user, the default fitting manner may be sent to the front-end page for display, a second selection result of the user on the default fitting manner is received, and then the combination data is fitted according to the second selection result.
Specifically, the present application may provide several default fitting modes for the user to choose from, for example linear fitting and nonlinear fitting. When the user selects linear fitting, each combined data can be fitted according to a preset linear fitting formula to obtain a plurality of fitting results; when the user selects nonlinear fitting, the nonlinear fitting parameters determined by the user in the second selection result can be acquired, and each combined data is fitted according to those parameters to obtain a plurality of fitting results.
It is to be understood that both linear fitting and nonlinear fitting in this application are forms of curve fitting. In linear fitting, x and y are both observed quantities and y is modeled as a function of x: y = f(x) = a + bx. Curve fitting then finds the optimal estimates of the parameters a and b from the observed values of x and y, which gives the optimal theoretical curve y = f(x) = a + bx. Nonlinear fitting may be implemented by polynomial fitting, which requires the user to provide an order; when the user does not provide one, a preset default order may be used, which is not limited herein.
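As an illustration of the two default modes, the sketch below fits a polynomial of a given order by ordinary least squares; with order 1 it reduces to the linear fit y = a + bx. The function name `polynomial_fit`, the default order of 2, and the choice of solving the normal equations directly are assumptions made for this example and are not specified by the present application.

```python
from typing import List, Sequence


def polynomial_fit(x: Sequence[float], y: Sequence[float], order: int = 2) -> List[float]:
    """Least-squares polynomial fit via the normal equations.

    Returns [c0, c1, ..., c_order] for y ~ c0 + c1*x + ... + c_order*x**order.
    With order=1 the result is [a, b] of the linear fit y = a + b*x.
    The default order of 2 is a hypothetical preset.
    """
    m = order + 1
    if len(x) != len(y) or len(x) < m:
        raise ValueError("need equal-length series with at least order + 1 points")
    # Augmented normal-equation matrix: sum_j (sum x^(i+j)) c_j = sum y * x^i.
    aug = [
        [sum(xi ** (i + j) for xi in x) for j in range(m)]
        + [sum(yi * xi ** i for xi, yi in zip(x, y))]
        for i in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; x values may be degenerate")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(m):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [rv - factor * cv for rv, cv in zip(aug[row], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]


# Linear fit: data generated from y = 1 + 2x.
linear = polynomial_fit([0, 1, 2, 3], [1, 3, 5, 7], order=1)

# Nonlinear (polynomial) fit: data generated from y = 1 + 2x^2.
quadratic = polynomial_fit([-1, 0, 1, 2], [3, 1, 3, 9], order=2)
```

Solving the normal equations directly is adequate for low orders and small data sets; for higher orders a numerically more stable solver from a numerical library would normally be preferred.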
In one embodiment, the evaluating the fitting results in S130 to obtain a plurality of evaluation results, which may include:
S310: for each fitting result, evaluating the fitting result by using at least one evaluation index, and obtaining the evaluation result of the fitting result under each evaluation index.
In this embodiment, when each fitting result is evaluated to obtain a plurality of evaluation results, at least one evaluation index may be used to evaluate the fitting result, so as to obtain an evaluation result of the fitting result under each evaluation index.
In one embodiment, the evaluation index includes at least a correlation coefficient evaluation index and a mean square error evaluation index; in S310, evaluating the fitting result by using at least one evaluation indicator, and obtaining an evaluation result of the fitting result under each evaluation indicator, may include:
S311: evaluating the fitting result by using the correlation coefficient evaluation index, and obtaining the correlation coefficient of the fitting result under the correlation coefficient evaluation index.
S312: and/or evaluating the fitting result by using the mean square error evaluation index, and obtaining the mean square error of the fitting result under the mean square error evaluation index.
In this embodiment, when future data is predicted by a statistical algorithm, different evaluation indexes are often used to evaluate the quality of the prediction result, including mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), bias (BIAS), correlation coefficient (CORR), accuracy (ACCURATE), and the like. Therefore, when the present application evaluates the quality of a fitting result, one or more of these evaluation indexes can be selected.
Specifically, the fitting result can be evaluated by using the correlation coefficient evaluation index and/or the mean square error evaluation index to obtain the corresponding evaluation result. Of course, other evaluation indexes such as mean absolute error, bias and accuracy may also be selected according to actual conditions, which is not limited herein.
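The two named evaluation indexes can be computed as in the following sketch. The present application does not state which quantities the correlation coefficient is taken between; this example assumes the Pearson correlation between the observed values and the fitted values, and the function names are hypothetical.

```python
import math
from typing import Sequence


def mean_squared_error(observed: Sequence[float], fitted: Sequence[float]) -> float:
    """MSE: average squared difference between observed and fitted values."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed)


def correlation_coefficient(observed: Sequence[float], fitted: Sequence[float]) -> float:
    """Pearson correlation between observed and fitted values (assumed definition)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(fitted) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, fitted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in fitted)
    if var_o == 0 or var_f == 0:
        return 0.0  # assumption: report 0 when either series is constant
    return cov / math.sqrt(var_o * var_f)


observed = [1.0, 2.0, 3.0]
shifted = [2.0, 3.0, 4.0]   # same shape, offset by 1
reversed_ = [3.0, 2.0, 1.0]  # perfectly anti-correlated

mse_shifted = mean_squared_error(observed, shifted)
corr_shifted = correlation_coefficient(observed, shifted)
corr_reversed = correlation_coefficient(observed, reversed_)
```

The shifted series shows why more than one index can be useful: its correlation with the observations is perfect while its mean square error is not zero.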
In one embodiment, the sending each fitting result and the corresponding evaluation result to the front-end page for displaying in S130 may include:
sorting the fitting results according to the evaluation results corresponding to the fitting results, and sending the sorted fitting results and the corresponding evaluation results to the front-end page for display.
In this embodiment, when each fitting result and the corresponding evaluation result are sent to the front-end page for display, each fitting result may be sorted according to the evaluation result corresponding to each fitting result, and the sorted fitting results and the corresponding evaluation results are sent to the front-end page for display.
Further, the fitting results can be sorted according to the evaluation results under a single evaluation index, or sorted comprehensively according to the evaluation results under a plurality of evaluation indexes. The values may be sorted from large to small or from small to large, as selected according to actual conditions, which is not limited herein.
Schematically, fig. 2 is a diagram showing fitting results in a front-end page provided by an embodiment of the present application. In fig. 2, the result statistical data is selected to be combined with each item of vehicle characteristic data, each fitting result is evaluated through a correlation coefficient (CORR) evaluation index and a mean square error (MSE) evaluation index, and the fitting results are sorted from high to low according to the evaluation result under the CORR evaluation index. The resulting ranking includes the combined data of the number of intersections passed and the number of accident occurrences, the combined data of the mileage and the number of accident occurrences, and other combined data. After fitting and evaluation, the combined data of the number of intersections passed and the number of accident occurrences has a correlation coefficient of 0.3206 and a mean square error of 1.01, and the combined data of the mileage and the number of accident occurrences has a correlation coefficient of 0.2913 and a mean square error of 1.03. The user can select the combined data that meets the user's needs from the ranking.
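The sorting step can be sketched with the two rows reported for fig. 2. The record layout, the field names `combination`, `corr` and `mse`, and the function name `rank_fits` are assumptions made for this example; only the four numeric values come from the description.

```python
from typing import Dict, List


def rank_fits(results: List[Dict], metric: str = "corr", descending: bool = True) -> List[Dict]:
    """Return the fitting results ordered by one evaluation index."""
    return sorted(results, key=lambda r: r[metric], reverse=descending)


# Values taken from the fig. 2 example; the labels are hypothetical.
results = [
    {"combination": "mileage vs. accident count", "corr": 0.2913, "mse": 1.03},
    {"combination": "intersections passed vs. accident count", "corr": 0.3206, "mse": 1.01},
]

by_corr = rank_fits(results, metric="corr", descending=True)   # high to low, as in fig. 2
by_mse = rank_fits(results, metric="mse", descending=False)    # low error first
```

Sorting by a single index is shown here; a comprehensive ranking over several indexes would need a combining rule, which the description leaves open.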
In one embodiment, the generating process of the model to be trained may include:
S111: acquiring a code file uploaded by a user or provided through a cloud storage link.
S112: generating a model to be trained according to the code file.
In this embodiment, the feature data set of the model to be trained is acquired in advance, but the model to be trained itself has not yet been acquired. Therefore, before the model to be trained is trained with the screened feature data set, a code file uploaded by the user or provided through a cloud storage link can be acquired, and the model to be trained is generated according to the code file.
Furthermore, the present application can train the model to be trained with the feature data sets before and after screening respectively, score the trained models, and judge the effect of the feature screening according to the scoring results.
The feature screening apparatus provided in the embodiments of the present application is described below, and the feature screening apparatus described below and the feature screening method described above may be referred to in correspondence with each other.
In one embodiment, as shown in fig. 3, fig. 3 is a schematic structural diagram of a feature screening apparatus provided in an embodiment of the present application. The present application further provides a feature screening apparatus, which may include a data acquisition module 210, a feature combination module 220, a combination evaluation module 230, and a feature screening module 240, specifically as follows:
The data acquisition module 210 is configured to acquire a feature data set of the model to be trained, where the feature data set includes a plurality of labeled data, and the labeled data includes various vehicle characteristic data recorded when a plurality of unmanned vehicles have accidents at different times within a preset historical time period, and result statistical data of the number of accident occurrences at different times.
The feature combination module 220 is configured to send each item of vehicle characteristic data and the result statistical data to a front-end page for display, and receive a confirmation instruction returned by a user, where the confirmation instruction includes a plurality of combined data obtained by combining each item of vehicle characteristic data in pairs and/or combining each item of vehicle characteristic data with the result statistical data in pairs.
The combination evaluation module 230 is configured to fit each combined data according to a preset fitting strategy, evaluate each fitting result to obtain a plurality of evaluation results, send each fitting result and the corresponding evaluation result to the front-end page for display, and receive a first selection result of the user on the fitting results.
The feature screening module 240 is configured to screen the vehicle characteristic data in the feature data set based on the first selection result, and train the model to be trained according to the screened feature data set.
In the above embodiment, a feature data set of the model to be trained is first acquired. The feature data set includes a plurality of labeled data, and the labeled data includes various vehicle characteristic data recorded when a plurality of unmanned vehicles have accidents at different times within a preset historical time period, and result statistical data of the number of accident occurrences at different times. After the vehicle characteristic data and the result statistical data are acquired, they can be sent to a front-end page for display. In the front-end page, the user can select items of vehicle characteristic data to combine in pairs, and/or select items of vehicle characteristic data to combine in pairs with the result statistical data, to form combined data; alternatively, a plurality of combined data can be obtained according to a default combination mode of the present application. After the plurality of combined data determined by the user is obtained, each combined data can be fitted according to a preset fitting strategy, each fitting result is evaluated to obtain a plurality of evaluation results, and each fitting result and the corresponding evaluation result are sent to the front-end page for display. After a first selection result of the user on the fitting results is received, the vehicle characteristic data in the feature data set can be screened based on the first selection result, and the model to be trained is trained according to the screened feature data set. In this way, the interactivity between the user and the feature screening is enhanced, and a screening result meeting the user's requirements can be obtained.
In one embodiment, the present application further provides a storage medium having stored therein computer-readable instructions, which, when executed by one or more processors, cause the one or more processors to perform the steps of the feature screening method as described in any one of the above embodiments.
In one embodiment, the present application further provides a computer device comprising: one or more processors, and a memory.
The memory has stored therein computer readable instructions which, when executed by the one or more processors, perform the steps of the feature screening method of any one of the above embodiments.
Fig. 4 is a schematic diagram illustrating an internal structure of a computer device according to an embodiment of the present disclosure, and the computer device 300 may be provided as a server, as shown in fig. 4. Referring to fig. 4, the computer device 300 includes a processing component 302 that further includes one or more processors and memory resources, represented by memory 301, for storing instructions, such as application programs, that are executable by the processing component 302. The application programs stored in memory 301 may include one or more modules that each correspond to a set of instructions. Further, the processing component 302 is configured to execute instructions to perform the feature screening method of any of the embodiments described above.
The computer device 300 may also include a power component 303 configured to perform power management of the computer device 300, a wired or wireless network interface 304 configured to connect the computer device 300 to a network, and an input/output (I/O) interface 305. The computer device 300 may operate based on an operating system stored in memory 301, such as Windows Server, Mac OS X™, Unix, Linux, FreeBSD™, or the like.
Those skilled in the art will appreciate that the architecture shown in fig. 4 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, or may combine certain components, or have a different arrangement of components.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in the process, method, article, or apparatus that comprises the element.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, the embodiments may be combined as needed, and the same and similar parts may be referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

1. A method of feature screening, the method comprising:
acquiring a characteristic data set of a model to be trained, wherein the characteristic data set comprises a plurality of marking data, and the marking data comprise various vehicle characteristic data recorded when a plurality of unmanned vehicles have accidents at different moments in a preset historical time period and result statistical data of accident occurrence times at different moments;
sending each item of vehicle characteristic data and the result statistical data to a front-end page for displaying, and receiving a confirmation instruction returned by a user, wherein the confirmation instruction comprises a plurality of combined data obtained by combining each item of vehicle characteristic data in pairs and/or combining each item of vehicle characteristic data and the result statistical data in pairs;
fitting each combined data according to a preset fitting strategy, evaluating each fitting result to obtain a plurality of evaluation results, sending each fitting result and the corresponding evaluation result to the front-end page for displaying, and receiving a first selection result of each fitting result by a user;
and screening the vehicle characteristic data in the characteristic data set based on the first selection result, and training the model to be trained according to the screened characteristic data set.
2. The feature screening method according to claim 1, wherein the sending each item of vehicle characteristic data and the result statistical data to a front-end page for displaying, and receiving a confirmation instruction returned by a user, comprises:
combining each item of vehicle characteristic data in pairs, and/or sending a plurality of combined data obtained by combining each item of vehicle characteristic data and the result statistical data in pairs to a front-end page for displaying, and receiving a confirmation instruction returned after a user confirms the plurality of combined data; the number of the combined data contained in the confirmation instruction is not more than the number of the combined data displayed by the front-end page;
or sending each item of vehicle characteristic data and the result statistical data to a front-end page for displaying, and receiving a confirmation instruction returned by a user after the user combines each item of vehicle characteristic data and/or combines each item of vehicle characteristic data and the result statistical data.
3. The feature screening method according to claim 1, wherein the fitting each combined data according to a preset fitting strategy to obtain a plurality of fitting results comprises:
judging whether a confirmation instruction returned by a user contains a custom fitting mode;
if yes, fitting each combined data according to the self-defined fitting mode to obtain a plurality of fitting results;
and if not, fitting each combined data according to a default fitting mode to obtain a plurality of fitting results.
4. The feature screening method according to claim 3, wherein the fitting each of the combined data according to a default fitting manner to obtain a plurality of fitting results comprises:
sending a default fitting mode to the front-end page for displaying, wherein the default fitting mode at least comprises linear fitting and nonlinear fitting;
receiving a second selection result of the user on the default fitting mode, and fitting each combined data based on a fitting formula of the linear fitting to obtain a plurality of fitting results when the second selection result is the linear fitting;
and when the second selection result is the nonlinear fitting, acquiring nonlinear fitting parameters determined by the user in the second selection result, and fitting each combined data according to the nonlinear fitting parameters to obtain a plurality of fitting results.
5. The method of claim 1, wherein the evaluating each of the fitting results to obtain a plurality of evaluation results comprises:
for each fitting result:
evaluating the fitting result by using at least one evaluation index, and obtaining the evaluation result of the fitting result under each evaluation index.
6. The feature screening method according to claim 5, wherein the evaluation index includes at least a correlation coefficient evaluation index and a mean square error evaluation index;
the evaluating the fitting result by using at least one evaluation index and obtaining the evaluation result of the fitting result under each evaluation index comprises:
evaluating the fitting result by using the correlation coefficient evaluation index, and obtaining a correlation coefficient of the fitting result under the correlation coefficient evaluation index;
and/or evaluating the fitting result by using the mean square error evaluation index, and obtaining the mean square error of the fitting result under the mean square error evaluation index.
7. The feature screening method according to any one of claims 1 to 6, wherein the sending each fitting result and the corresponding evaluation result to the front end page for display comprises:
sorting the fitting results according to the evaluation results corresponding to the fitting results, and sending the sorted fitting results and the corresponding evaluation results to the front-end page for displaying.
8. The feature screening method according to any one of claims 1 to 6, wherein the generation process of the model to be trained comprises:
acquiring a code file uploaded by a user or provided by a cloud storage link;
and generating a model to be trained according to the code file.
9. A feature screening apparatus, comprising:
the data acquisition module is used for acquiring a characteristic data set of the model to be trained, wherein the characteristic data set comprises a plurality of marking data, and the marking data comprise various vehicle characteristic data recorded when a plurality of unmanned vehicles have accidents at different moments in a preset historical time period and result statistical data of accident occurrence times at different moments;
the characteristic combination module is used for sending each item of vehicle characteristic data and the result statistical data to a front-end page for displaying and receiving a confirmation instruction returned by a user, wherein the confirmation instruction comprises a plurality of combined data obtained by pairwise combination of each item of vehicle characteristic data and/or pairwise combination of each item of vehicle characteristic data and the result statistical data;
the combined evaluation module is used for respectively fitting each combined data according to a preset fitting strategy, evaluating each fitting result to obtain a plurality of evaluation results, sending each fitting result and the corresponding evaluation result to the front-end page for displaying, and receiving a first selection result of each fitting result by a user;
and the characteristic screening module is used for screening the vehicle characteristic data in the characteristic data set based on the first selection result and training the model to be trained according to the screened characteristic data set.
10. A storage medium, characterized by: the storage medium having stored therein computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the feature screening method of any one of claims 1 to 8.
11. A computer device, comprising: one or more processors, and a memory;
the memory has stored therein computer readable instructions which, when executed by the one or more processors, perform the steps of the feature screening method of any one of claims 1 to 8.
CN202211573770.6A 2022-12-08 2022-12-08 Feature screening method and device, storage medium and computer equipment Pending CN115828101A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211573770.6A CN115828101A (en) 2022-12-08 2022-12-08 Feature screening method and device, storage medium and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211573770.6A CN115828101A (en) 2022-12-08 2022-12-08 Feature screening method and device, storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN115828101A true CN115828101A (en) 2023-03-21

Family

ID=85544697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211573770.6A Pending CN115828101A (en) 2022-12-08 2022-12-08 Feature screening method and device, storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN115828101A (en)

Similar Documents

Publication Publication Date Title
CN111241154B (en) Storage battery fault early warning method and system based on big data
CN110807930B (en) Dangerous vehicle early warning method and device
KR101617349B1 (en) Diagnostic system and method for the analysis of driving behavior
US20140081675A1 (en) Systems, methods, and apparatus for optimizing claim appraisals
CN102881162A (en) Data processing and fusion method for large-scale traffic information
CN110738523B (en) Maintenance order quantity prediction method and device
CN112102518A (en) Synchronization method and device of vehicle maintenance reminding information, server and storage medium
CN110991999A (en) Method and device for improving law enforcement amount cutting efficiency, computer equipment and storage medium
CN110619482A (en) Driving behavior scoring method based on logistic regression and single-level analysis weighting method
CN105654574A (en) Vehicle equipment-based driving behavior evaluation method and vehicle equipment-based driving behavior evaluation device
US11120308B2 (en) Vehicle damage detection method based on image analysis, electronic device and storage medium
CN110807493A (en) Optimization method and equipment of vehicle classification model
CN116935659B (en) High-speed service area bayonet vehicle auditing system and method thereof
CN115151920A (en) Driving evaluation method and device and non-transitory computer readable storage medium
CN115828101A (en) Feature screening method and device, storage medium and computer equipment
CN114066288B (en) Intelligent data center-based emergency detection method and system for operation road
CN115439263A (en) Accurate vehicle insurance premium evaluation method based on automatic driving behavior assets
CN115496440A (en) Method and device for determining second-hand car inventory
Huang et al. Towards automated model calibration and validation in rail transit simulation
US20110213747A1 (en) System and method for aggregating information
CN112381560B (en) Shared equipment product market prediction system and method
CN113660461B (en) Method, device and equipment for evaluating performance of video passing equipment
CN113112160B (en) Diagnostic data processing method, diagnostic data processing device and electronic equipment
Shen et al. Real-time traffic prediction using GPS data with low sampling rates: a hybrid approach
CN115375978B (en) Behavior information determination method and apparatus, storage medium, and electronic apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination