CN116451125A - New energy vehicle owner identification method, device, equipment and storage medium - Google Patents

Info

Publication number
CN116451125A
Authority
CN
China
Prior art keywords
feature
model
data
initial
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310649173.5A
Other languages
Chinese (zh)
Inventor
孙澄澄
陈煦
朱磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202310649173.5A priority Critical patent/CN116451125A/en
Publication of CN116451125A publication Critical patent/CN116451125A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/60 Other road transportation technologies with climate change mitigation effect
    • Y02T10/70 Energy storage systems for electromobility, e.g. batteries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a method, device, equipment and storage medium for identifying new energy vehicle owners, belonging to the technical field of artificial intelligence and the field of financial property insurance. In the method, features are extracted from training data and the resulting initial features are input into an initial recognition model comprising a feature crossing unit and a feature prediction unit. The feature crossing unit calculates feature cross terms among the initial features; the initial features and the feature cross terms are then used as input data to train the feature prediction unit, which outputs a prediction result. Model parameters are adjusted until the model is fitted, yielding the recognition model. Vehicle data to be identified are then imported into the recognition model, which outputs a vehicle owner identification result. By calculating the feature cross terms, the recognition model can learn low-order features and high-order features at the same time without a large amount of complicated manual feature engineering, so that recognition accuracy is ensured while heavy manual effort is avoided and cost is saved.

Description

New energy vehicle owner identification method, device, equipment and storage medium
Technical Field
The application belongs to the technical field of artificial intelligence and the field of financial property insurance, and particularly relates to a method, a device, equipment and a storage medium for identifying a new energy vehicle owner.
Background
A new energy automobile is an automobile that uses a new type of energy in place of traditional fuel, and includes electric automobiles, hybrid electric automobiles, fuel cell automobiles and the like. New energy automobiles are a clear trend with promising prospects; their development will continue to be driven by factors such as technology, market and application, making them an important direction for the transformation, upgrading and sustainable development of the automobile industry. As environmental protection and energy security gain importance worldwide, new energy automobiles are becoming the trend of future automobile development.
Sales of new energy automobiles have increased rapidly over the past few years, and insurance companies that underwrite car insurance need to deal with the opportunities and challenges presented by this new form of business. Because the products and sales models of new energy automobiles have changed, their customer base differs structurally from that of traditional fuel automobiles; for example, the proportion of women among new energy vehicle owners is higher, and so is the proportion of owners who have multiple vehicles. At the same time, the overall number of vehicle owners and the sales space for car insurance products have expanded further, but an insurance company generally has no access to the sales data of vehicle sellers, so the potential customer group for car insurance products is difficult to determine. With the number of new energy vehicle owners rising year by year, insurance companies need to actively explore and mine customer features and quickly and accurately identify potential customer groups, so as to find a suitable outreach and conversion approach to cope with market competition.
Traditional new energy vehicle owner identification schemes often build screening rules on direct usage behaviors such as charging. However, the usable data are limited to what the insurance company's own vehicle service platform collects, and most of the usage behaviors come from customers who have already converted, so identification efficiency is low and coverage is insufficient. For data collected by other service platforms, such as data generated on some vehicle service platforms, processing often requires mining not only the features themselves but also the correlations and implicit information among the data and the features. Doing so with a traditional deep learning model requires a large amount of complex feature engineering, which incurs a huge labor cost.
Disclosure of Invention
The embodiments of the application aim to provide a new energy vehicle owner identification method, a new energy vehicle owner identification device, computer equipment and a storage medium, so as to solve the technical problem that existing new energy vehicle owner identification schemes, which rely on a traditional deep learning model, require a large amount of labor for feature engineering and are therefore costly and inefficient.
In order to solve the technical problems, the embodiment of the application provides a new energy vehicle owner identification method, which adopts the following technical scheme:
A new energy vehicle owner identification method comprises the following steps:
pre-training data are acquired and preprocessed, wherein the pre-training data are historical vehicle data collected in advance;
extracting features of the preprocessed pre-training data to obtain initial features, and inputting the initial features into a preset initial recognition model, wherein the initial recognition model comprises a feature crossing unit and a feature prediction unit;
calculating feature cross terms between the initial features by the feature cross unit;
taking the initial features and the feature cross terms as input data of the feature prediction unit, training the feature prediction unit by utilizing the input data, and outputting a prediction result;
adjusting model parameters of the initial recognition model according to the prediction result until the model is fitted, and generating a new energy vehicle owner recognition model;
and receiving a vehicle owner identification instruction, acquiring vehicle data to be identified, importing the vehicle data to be identified into the new energy vehicle owner identification model, and outputting a vehicle owner identification result.
Further, the preprocessing includes data screening, data cleaning, standardization and data set division, and the acquiring the pre-training data and preprocessing the pre-training data specifically includes:
Importing the pre-training data from a preset database;
performing data screening on the pre-training data according to a preset data screening rule;
performing data cleaning and data standardization processing on the screened pre-training data;
and carrying out data set division on the pre-training data subjected to data cleaning and data standardization processing to obtain a training set, a verification set and a test set.
Further, before the initial feature is input to a preset initial recognition model, the method further comprises:
classifying the initial features to obtain discrete features and continuous features;
carrying out one-hot coding on the discrete features and carrying out normalization processing on the continuous features;
and combining the discrete features after the one-hot encoding and the continuous features after the normalization processing to obtain feature combinations.
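The classification, encoding and combination steps above can be sketched as follows. This is a minimal illustration rather than the applicants' implementation: the column names (`gender`, `charge_type`, `age`, `mileage_km`), the helper names, and the choice of min-max scaling for the normalization step are all assumptions, since the text does not specify them.

```python
def one_hot(value, categories):
    """Return a one-hot list for `value` over the ordered `categories`."""
    return [1.0 if value == c else 0.0 for c in categories]


def min_max(values):
    """Scale numbers to [0, 1]; a constant column maps to 0.0 (assumed convention)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def build_feature_combinations(rows, discrete_cols, continuous_cols):
    """One-hot encode discrete columns, normalize continuous ones, then concatenate."""
    # Category vocabulary per discrete column, sorted so the layout is deterministic.
    vocab = {c: sorted({r[c] for r in rows}) for c in discrete_cols}
    scaled = {c: min_max([r[c] for r in rows]) for c in continuous_cols}
    combined = []
    for i, r in enumerate(rows):
        vec = []
        for c in discrete_cols:
            vec.extend(one_hot(r[c], vocab[c]))
        for c in continuous_cols:
            vec.append(scaled[c][i])
        combined.append(vec)
    return combined


# Hypothetical owner records, for illustration only.
rows = [
    {"gender": "F", "charge_type": "fast", "age": 30, "mileage_km": 12000},
    {"gender": "M", "charge_type": "slow", "age": 50, "mileage_km": 20000},
    {"gender": "F", "charge_type": "slow", "age": 40, "mileage_km": 16000},
]
features = build_feature_combinations(
    rows, ["gender", "charge_type"], ["age", "mileage_km"]
)
```

Each resulting vector holds the one-hot blocks first and the normalized continuous values last; any fixed ordering would do as long as training and inference use the same one.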
Further, the initial recognition model is a DeepFM model comprising an FM sub-model and a Deep sub-model, the feature crossing unit being the FM sub-model and the feature prediction unit being the Deep sub-model, and calculating feature cross terms between the initial features by the feature crossing unit includes:
selecting any one of the initial features as a target initial feature;
calculating the inner products of the target initial feature with each of the remaining initial features to obtain first-order cross terms of the target initial feature;
obtaining the latent vector of the FM sub-model, and multiplying the first-order cross terms by the latent vector to obtain second-order cross terms of the target initial feature;
and combining the first-order cross terms and the second-order cross terms to obtain the feature cross terms.
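The text does not give a formula for the cross terms. For orientation, the sketch below shows the pairwise interaction as it is usually defined for a factorization machine, where each feature i has a latent (hidden) vector `V[i]` and the interaction between features i and j is `<V[i], V[j]> * x[i] * x[j]`. It illustrates the standard FM technique, not necessarily the exact procedure claimed above; the function names and all numeric values are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def fm_cross_terms(x, V):
    """Per-pair FM interaction terms <V[i], V[j]> * x[i] * x[j] for all i < j."""
    n = len(x)
    return [
        dot(V[i], V[j]) * x[i] * x[j]
        for i in range(n)
        for j in range(i + 1, n)
    ]


def fm_pairwise_sum_fast(x, V):
    """Sum of all pairwise terms in O(n*k) using the usual FM identity:
    0.5 * sum_f ((sum_i V[i][f] x[i])^2 - sum_i (V[i][f] x[i])^2).
    """
    k = len(V[0])
    total = 0.0
    for f in range(k):
        weighted = [V[i][f] * x[i] for i in range(len(x))]
        total += sum(weighted) ** 2 - sum(w * w for w in weighted)
    return 0.5 * total


# Three features with 2-dimensional latent vectors (illustrative values).
x = [1.0, 0.0, 0.5]
V = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]]

cross_terms = fm_cross_terms(x, V)          # one term per feature pair
pairwise_sum = fm_pairwise_sum_fast(x, V)   # equals sum(cross_terms)
```

The per-pair list is the form that can be concatenated with the initial features as extra model input, while the fast sum is the form normally used when the FM part produces a single score.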
Further, the Deep sub-model is an MLP model comprising a plurality of fully connected layers and an output layer, and taking the initial features and the feature cross terms as input data of the feature prediction unit, training the feature prediction unit by utilizing the input data, and outputting a prediction result specifically includes:
splicing the initial features and the feature cross terms to obtain a full feature vector;
sequentially importing the full feature vector into the plurality of fully connected layers for feature processing to obtain a feature fitting vector;
and importing the feature fitting vector into the output layer to obtain the prediction result.
Further, sequentially importing the full feature vector into the plurality of fully connected layers for feature processing to obtain the feature fitting vector specifically includes:
importing the full feature vector into a first fully connected layer to obtain a first output vector;
performing a nonlinear transformation on the first output vector;
importing the nonlinearly transformed first output vector into a second fully connected layer to obtain a second output vector;
sequentially importing the output vector of each previous fully connected layer into the next fully connected layer until the feature fitting vector output by the final fully connected layer is obtained;
wherein, before the output vector of each previous fully connected layer is imported into the next fully connected layer, the method further comprises:
performing a nonlinear transformation on the output vector of the previous fully connected layer.
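The layer-by-layer processing described above can be sketched as a plain forward pass. This is an illustrative sketch under assumptions: ReLU as the nonlinear transformation and a sigmoid on the output layer are choices made here, not stated in the text, and the weights are arbitrary example values.

```python
import math


def dense(v, W, b):
    """Fully connected layer: W @ v + b, with W given as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + bias for row, bias in zip(W, b)]


def relu(v):
    """Nonlinear transformation applied between layers (assumed to be ReLU)."""
    return [max(0.0, a) for a in v]


def mlp_forward(initial_features, cross_terms, hidden_layers, output_layer):
    # Splice initial features and cross terms into the full feature vector.
    h = list(initial_features) + list(cross_terms)
    for index, (W, b) in enumerate(hidden_layers):
        if index > 0:
            # Transform the previous layer's output before the next layer.
            h = relu(h)
        h = dense(h, W, b)
    fitting_vector = h  # output of the final fully connected layer
    W_out, b_out = output_layer
    logit = dense(fitting_vector, W_out, b_out)[0]
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid output is an assumption


hidden_layers = [
    ([[1.0, -1.0, 0.0], [0.5, 0.5, 0.5]], [0.0, 0.0]),  # first layer: 3 -> 2
    ([[1.0, 1.0], [-1.0, 0.0]], [0.0, 0.0]),            # second layer: 2 -> 2
]
output_layer = ([[1.0, 1.0]], [0.0])

prediction = mlp_forward([1.0, 0.5], [0.25], hidden_layers, output_layer)
```

With these weights the first layer yields [0.5, 0.875], the second yields [1.375, -0.5], and the output logit is 0.875, so `prediction` is the sigmoid of 0.875.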
Further, the model parameters of the initial recognition model are adjusted according to the prediction result until the model is fitted, and a new energy vehicle owner recognition model is generated, which specifically comprises:
calculating an error between the prediction result and a preset standard result based on a loss function of the MLP model to obtain a prediction error;
comparing the prediction error with a preset error threshold;
and if the prediction error is greater than the error threshold, continuously adjusting the model parameters of the initial recognition model until the prediction error is less than or equal to the error threshold, and generating a new energy vehicle owner recognition model.
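The stopping rule above (keep adjusting parameters until the prediction error falls to or below a preset threshold) can be sketched with a deliberately small stand-in model. The assumptions are: a single-layer logistic model replaces the full DeepFM network, mean log loss is used as the loss function, plain gradient descent adjusts the parameters, and the threshold of 0.2 is a hypothetical value. A `max_epochs` cap is added as a safeguard that the text does not mention.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def mean_log_loss(w, b, X, y):
    """Prediction error between model outputs and the labelled standard results."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict(w, b, xi)
        total -= yi * math.log(p + eps) + (1 - yi) * math.log(1 - p + eps)
    return total / len(X)


def train_until_threshold(X, y, error_threshold, lr=0.5, max_epochs=10000):
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(max_epochs):
        if mean_log_loss(w, b, X, y) <= error_threshold:
            break  # error is at or below the threshold: the model is accepted
        # Otherwise adjust the parameters with one gradient descent step.
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            diff = predict(w, b, xi) - yi
            grad_w = [g + diff * xj for g, xj in zip(grad_w, xi)]
            grad_b += diff
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b, mean_log_loss(w, b, X, y)


# Toy, linearly separable data: label 1 marks a new energy vehicle owner.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
w, b, final_error = train_until_threshold(X, y, error_threshold=0.2)
```

In practice the error would normally be measured on the verification set rather than on the training data, so that reaching the threshold reflects generalization rather than memorization.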
In order to solve the technical problem, the embodiment of the application also provides a new energy vehicle owner identification device, which adopts the following technical scheme:
a new energy vehicle owner identification device, comprising:
the data processing module is used for acquiring pre-training data and preprocessing the pre-training data, wherein the pre-training data is historical vehicle data collected in advance;
the feature extraction module is used for extracting features of the preprocessed pre-training data to obtain initial features, and inputting the initial features into a preset initial recognition model, wherein the initial recognition model comprises a feature crossing unit and a feature prediction unit;
a cross calculation module for calculating feature cross terms between the initial features by the feature crossing unit;
the feature learning module is used for taking the initial features and the feature cross terms as input data of the feature prediction unit, training the feature prediction unit by utilizing the input data, and outputting a prediction result;
the model adjustment module is used for adjusting model parameters of the initial recognition model according to the prediction result until the model is fitted, so as to generate a new energy vehicle owner recognition model;
The vehicle owner identification module is used for receiving a vehicle owner identification instruction, acquiring vehicle data to be identified, importing the vehicle data to be identified into the new energy vehicle owner identification model, and outputting a vehicle owner identification result.
In order to solve the above technical problems, the embodiments of the present application further provide a computer device, which adopts the following technical schemes:
a computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which when executed by the processor implement the steps of the new energy vehicle owner identification method of any of the above.
In order to solve the above technical problems, embodiments of the present application further provide a computer readable storage medium, which adopts the following technical solutions:
a computer readable storage medium having stored thereon computer readable instructions which when executed by a processor implement the steps of the new energy vehicle owner identification method of any of the above.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
The application discloses a method, device, equipment and storage medium for identifying new energy vehicle owners, belonging to the technical field of artificial intelligence and the field of financial property insurance. In the method, the pre-training data are preprocessed, and features are extracted from the preprocessed pre-training data to obtain initial features. The initial features are input into an initial recognition model comprising a feature crossing unit and a feature prediction unit. The feature crossing unit calculates feature cross terms among the initial features; the initial features and the feature cross terms are used as input data of the feature prediction unit, the feature prediction unit is trained with the input data, and a prediction result is output. Model parameters of the initial recognition model are adjusted according to the prediction result until the model is fitted, generating a new energy vehicle owner recognition model. When a vehicle owner identification instruction is received, vehicle data to be identified are obtained and imported into the new energy vehicle owner recognition model, and a vehicle owner identification result is output.
The method adopts a vehicle owner recognition model structure comprising a feature crossing unit and a feature prediction unit. The feature crossing unit calculates the feature cross terms, yielding the low-order features and high-order features used for model training, and the feature prediction unit learns from the low-order and high-order features simultaneously. A large amount of complicated manual feature engineering is therefore unnecessary, so that model recognition accuracy is ensured while heavy manual effort is avoided and cost is saved, and identifying new energy vehicle owners helps improve the purchase rate of insurance products related to new energy vehicles.
Drawings
For a clearer description of the solution in the present application, a brief description will be given below of the drawings that are needed in the description of the embodiments of the present application, it being obvious that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 illustrates an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 illustrates a flow chart of one embodiment of a new energy vehicle owner identification method in accordance with the present application;
FIG. 3 illustrates a flow chart of one embodiment of step S203 in FIG. 2;
FIG. 4 illustrates a flow chart of one embodiment of step S204 in FIG. 2;
FIG. 5 is a schematic diagram illustrating the construction of one embodiment of a new energy vehicle owner identification device according to the present application;
FIG. 6 shows a schematic structural diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the description of the figures above are intended to cover non-exclusive inclusions. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
For a better understanding of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings.
As shown in FIG. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
The server 105 may be a server that provides various services, such as a background server that provides support for pages displayed on the terminal devices 101, 102, 103, and may be a stand-alone server, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
It should be noted that, the method for identifying the owner of the new energy vehicle provided by the embodiment of the application is generally executed by the server, and correspondingly, the device for identifying the owner of the new energy vehicle is generally arranged in the server.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow chart of one embodiment of a new energy vehicle owner identification method according to the present application is shown. The embodiments of the application can acquire and process the relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operating/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
Traditional new energy vehicle owner identification schemes often build screening rules on direct usage behaviors such as charging. However, the usable data are limited to what the insurance company's own vehicle service platform collects, and most of the usage behaviors come from customers who have already converted, so identification efficiency is low and coverage is insufficient. For data collected by other service platforms, such as data generated on some vehicle service platforms, processing often requires mining not only the features themselves but also the correlations and implicit information among the data and the features. Doing so with a traditional deep learning model requires a large amount of complex feature engineering, which incurs a huge labor cost.
To solve the above technical problems, the application discloses a new energy vehicle owner identification method, device, equipment and storage medium, belonging to the technical field of artificial intelligence and the field of financial property insurance. The application adopts a vehicle owner recognition model structure comprising a feature crossing unit and a feature prediction unit: the feature crossing unit calculates feature cross terms, yielding the low-order features and high-order features used for model training, and the feature prediction unit learns from the low-order and high-order features simultaneously.
In a new energy vehicle owner identification project, some basic information about the vehicle owner (such as gender, age, income and occupation) can be used as input features, and a DeepFM algorithm is then used to train a model, thereby realizing vehicle owner identification.
DeepFM is a model that combines a deep neural network with a factorization machine. Its main advantages are that it learns high-order and low-order features automatically and that it requires less manual feature engineering.
Specifically, the DeepFM algorithm combines the second-order cross terms of the factorization machine with the higher-order cross terms of the deep neural network. The factorization machine captures the second-order cross relations between the input features, the deep neural network learns the higher-order feature cross relations, and the outputs of the two sub-models can be added to obtain the final prediction result. The DeepFM algorithm can therefore capture low-order and high-order feature cross relations at the same time, which improves prediction accuracy.
In a new energy vehicle owner identification project, basic information about the vehicle owner can be used as input features, and the DeepFM algorithm is then used to train a model. Training requires a large amount of vehicle owner data so that the model can better learn the relationships between features. After training is completed, the model can be used to identify new vehicle owner data accurately and quickly. Because the DeepFM algorithm does not require complex feature engineering, labor cost can be greatly reduced and the development of new energy vehicle owner identification projects can be completed more efficiently.
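The combination described above (factorization machine output plus deep network output) is commonly written as follows. This is the standard DeepFM formulation, given here for orientation; the present text does not spell out its own formulas. The symbols are assumptions of this sketch: $x \in \mathbb{R}^{d}$ is the input feature vector, $w$ the first-order weights, $V_i$ the latent vector of feature $i$, and $a^{(l)}$, $W^{(l)}$, $b^{(l)}$ the activations, weights and biases of layer $l$ of a deep network with $H$ hidden layers and activation function $g$.

```latex
\begin{aligned}
\hat{y} &= \operatorname{sigmoid}\left( y_{\mathrm{FM}} + y_{\mathrm{DNN}} \right) \\
y_{\mathrm{FM}} &= \langle w, x \rangle
  + \sum_{i=1}^{d} \sum_{j=i+1}^{d} \langle V_i, V_j \rangle \, x_i x_j \\
a^{(l+1)} &= g\left( W^{(l)} a^{(l)} + b^{(l)} \right), \qquad l = 0, \dots, H-1 \\
y_{\mathrm{DNN}} &= W^{(H)} a^{(H)} + b^{(H)}
\end{aligned}
```

The first term of $y_{\mathrm{FM}}$ captures first-order feature importance, the double sum captures the second-order cross relations, and $y_{\mathrm{DNN}}$ supplies the higher-order relations learned by the deep network.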
The new energy vehicle owner identification method comprises the following steps:
S201: pre-training data are obtained and preprocessed, wherein the pre-training data are historical vehicle data collected in advance.
In the above embodiment, the server imports the pre-training data from the database, and performs pre-processing on the pre-training data, where the pre-training data is historical vehicle data collected in advance, and the pre-processing includes at least data screening, data cleaning, standardization and data set division.
Further, acquiring the pre-training data and preprocessing the pre-training data specifically comprises the following steps:
importing the pre-training data from a preset database;
performing data screening on the pre-training data according to a preset data screening rule;
performing data cleaning and data standardization processing on the screened pre-training data;
and carrying out data set division on the pre-training data subjected to data cleaning and data standardization processing to obtain a training set, a verification set and a test set.
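The four preprocessing steps above can be sketched end to end as below. This is a minimal sketch under assumptions: the record fields (`vehicle_id`, `mileage_km`, `year`), the screening rule, the cleaning policy (drop records with missing values and exact duplicates), z-score standardization, and the 70/15/15 split ratio are all hypothetical choices, since the text leaves them open.

```python
import random
import statistics


def preprocess(records, keep, numeric_cols, ratios=(0.7, 0.15, 0.15), seed=0):
    # 1. Data screening with a preset rule.
    screened = [r for r in records if keep(r)]

    # 2. Data cleaning: drop records with missing numeric values and duplicates.
    seen, cleaned = set(), []
    for r in screened:
        if any(r.get(c) is None for c in numeric_cols):
            continue
        key = tuple(sorted(r.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(dict(r))  # copy so the caller's records stay untouched

    # 3. Standardization: z-score each numeric column.
    for c in numeric_cols:
        values = [r[c] for r in cleaned]
        mean = statistics.fmean(values)
        std = statistics.pstdev(values) or 1.0  # guard against constant columns
        for r in cleaned:
            r[c] = (r[c] - mean) / std

    # 4. Data set division into training, verification and test sets.
    random.Random(seed).shuffle(cleaned)
    n_train = int(len(cleaned) * ratios[0])
    n_val = int(len(cleaned) * ratios[1])
    return (
        cleaned[:n_train],
        cleaned[n_train:n_train + n_val],
        cleaned[n_train + n_val:],
    )


# Hypothetical historical vehicle records.
records = [
    {"vehicle_id": i, "mileage_km": 1000.0 * i, "year": 2010 + i % 12}
    for i in range(1, 21)
]
records.append(dict(records[4]))  # an exact duplicate
records.append({"vehicle_id": 99, "mileage_km": None, "year": 2020})  # missing value

train_set, val_set, test_set = preprocess(
    records, keep=lambda r: r["year"] >= 2015, numeric_cols=["mileage_km"]
)
```

The sketch standardizes before splitting, following the order of the steps in the text. A common alternative is to compute the standardization statistics on the training set only and reuse them for the other two sets, which avoids leaking information from the verification and test data.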
In the above embodiment, when the DeepFM model is used for new energy vehicle owner identification, preprocessing and cleaning of the data are critical. The basic steps are as follows:
Data acquisition: collect relevant data about vehicle owners, including driving records, charging records, vehicle information and the like. Different data types require different preprocessing operations; for example, timestamp data can be converted into a specific time format, and vehicle information data can be normalized so that vehicle data from different brands is converted into a uniform format.
Data screening: data screening is required to obtain features that have an important impact on vehicle owner identification. For example, for the driving record data, the characteristics of average speed, driving time, driving distance and the like can be extracted according to different road sections and time periods; for the charging record data, characteristics such as charging frequency, charging electric quantity and the like can be extracted according to charging time and charging pile type. When the feature selection is performed, a feature selection algorithm can be used to screen out features which have an important influence on the identification of the vehicle owner.
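As an illustration of the driving-record features mentioned above, the following minimal sketch aggregates trips by time period and derives driving distance, driving time and average speed. The field names (`start_hour`, `distance_km`, `duration_h`) and the period boundaries in `time_period` are assumptions made for the example and are not specified in the application.

```python
from collections import defaultdict

def time_period(hour):
    """Hypothetical bucketing of the hour of day into coarse periods."""
    if 7 <= hour < 10 or 17 <= hour < 20:
        return "peak"
    if 10 <= hour < 17:
        return "daytime"
    return "night"

def driving_features(trips):
    """Sum distance and duration per period, then derive the average speed."""
    totals = defaultdict(lambda: {"distance_km": 0.0, "duration_h": 0.0})
    for trip in trips:
        bucket = totals[time_period(trip["start_hour"])]
        bucket["distance_km"] += trip["distance_km"]
        bucket["duration_h"] += trip["duration_h"]
    features = {}
    for period, bucket in totals.items():
        avg = bucket["distance_km"] / bucket["duration_h"] if bucket["duration_h"] > 0 else 0.0
        features[period] = {**bucket, "avg_speed_kmh": avg}
    return features

trips = [
    {"start_hour": 8, "distance_km": 30.0, "duration_h": 1.0},
    {"start_hour": 18, "distance_km": 10.0, "duration_h": 1.0},
    {"start_hour": 23, "distance_km": 60.0, "duration_h": 1.0},
]
features = driving_features(trips)
```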
In a specific embodiment of the present application, data screening is performed on the Spark platform, which integrates more than 200 underlying factors in two major categories, namely basic vehicle data factors and consumption business factors. For example, some basic vehicle data factors are as follows:
user information: including user ID, gender, age, region, model, etc.
Travel information: including mileage, time, start and end points, travel time, etc.
Vehicle information: including vehicle model, brand, color, age, etc.
Charging information: including charging time, charging location, charging type, etc.
Weather information: including temperature, humidity, wind speed, etc.
Some of the consumption business factors are as follows:
Online behavior: the tracking-event type, page type, trigger count, trigger time, dwell time and the like of vehicle-related events.
Vehicle-related consumption records: consumption category, consumption amount, consumption time, consumption frequency, etc.
Automobile loan records: the number of loans, loan amount, loan time, loan product type, etc.
Information on the insured person: gender, age, annual income level, occupation type, number of historical insurance policies, number of historically insured objects, customer value tier, online content preferences, and the like.
Information on the insured object: the initial registration date of the vehicle, the power type of the vehicle, the price of the vehicle, the historical claim frequency, the historical claim payout amount, etc.
The above factors are only examples and are not all of the underlying factors of the new energy vehicle owner identification project; a project may include other factors, which are determined according to its actual requirements. When Spark is used for underlying factor integration, data can be read, processed and converted through APIs such as Spark SQL or the Spark DataFrame API. For example, data screening and joining are performed through operations such as select and join in Spark SQL, data aggregation and statistics are performed through operations such as groupBy and agg on a Spark DataFrame, and the resulting underlying factor data is used for training and prediction with the DeepFM model.
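The select, join and group-by aggregation flow described above can be sketched in plain Python. This is not Spark code; it only mirrors the logic of those operations on in-memory records so that it runs without a cluster. The table contents and column names (`user_id`, `region`, `kwh`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical base-vehicle records (one row per user) and charging records (many rows per user).
users = [
    {"user_id": "u1", "region": "Shenzhen", "power_type": "EV"},
    {"user_id": "u2", "region": "Shanghai", "power_type": "ICE"},
]
charges = [
    {"user_id": "u1", "kwh": 30.0},
    {"user_id": "u1", "kwh": 20.0},
    {"user_id": "u2", "kwh": 0.0},
]

def aggregate_charges(rows):
    """Group charging rows by user and compute count and total energy (group-by plus agg)."""
    stats = defaultdict(lambda: {"charge_count": 0, "charge_kwh": 0.0})
    for row in rows:
        entry = stats[row["user_id"]]
        entry["charge_count"] += 1
        entry["charge_kwh"] += row["kwh"]
    return dict(stats)

def integrate_factors(user_rows, charge_rows):
    """Select the needed user columns and left-join the aggregated charging factors."""
    agg = aggregate_charges(charge_rows)
    empty = {"charge_count": 0, "charge_kwh": 0.0}
    return [
        {"user_id": u["user_id"], "region": u["region"], **agg.get(u["user_id"], empty)}
        for u in user_rows
    ]

factors = integrate_factors(users, charges)
```

On a real Spark deployment the same steps would be expressed with the DataFrame operations named in the text; the plain-Python form is used here only to keep the sketch self-contained.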
Data cleaning: after the data is collected, it needs to be cleaned to remove incomplete, duplicate and erroneous records. For example, for the driving record data, records with a negative speed, repeated records and the like may be removed; for the charging record data, records with a charging time of 0 may be removed.
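The cleaning rules just described can be sketched as follows. The record layout (`user_id`, `timestamp`, `speed_kmh`, `charge_minutes`) is an assumption made for illustration; the application does not define field names.

```python
def clean_driving_records(records):
    """Drop incomplete, erroneous (negative speed) and duplicate driving records."""
    required = ("user_id", "timestamp", "speed_kmh")
    seen = set()
    cleaned = []
    for rec in records:
        # Incomplete: any required field missing or None.
        if any(rec.get(key) is None for key in required):
            continue
        # Erroneous: a negative speed cannot be a valid measurement.
        if rec["speed_kmh"] < 0:
            continue
        # Duplicate: same values for all required fields as an earlier record.
        fingerprint = tuple(rec[key] for key in required)
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(rec)
    return cleaned

def clean_charging_records(records):
    """Drop charging records whose charging time is 0 or missing."""
    return [rec for rec in records if (rec.get("charge_minutes") or 0) > 0]
```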
Feature scaling and normalization: different types of features require different scaling and normalization processes. For example, min-max normalization may be applied to numerical data such as driving distance and charging amount, and one-hot encoding may be applied to discrete data such as vehicle brand and color.
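A minimal sketch of the two operations named above, min-max normalization for a numerical column and one-hot encoding for a discrete column. The function names and the choice to sort categories alphabetically are illustrative assumptions.

```python
def min_max_normalize(values):
    """Rescale a numerical column linearly so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot_encode(values):
    """Return the sorted category list and one 0/1 indicator row per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded
```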
Data set partitioning: the cleaned and converted data set is divided into a training set, a validation set and a test set. When dividing the data set, care must be taken to avoid overfitting and underfitting and to ensure that the distributions of the training set, the validation set and the test set are similar.
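One common way to keep the three subsets similarly distributed is a stratified split, which partitions each label group separately. The application does not name a splitting technique, so the sketch below is an assumption; the 80/10/10 percentages and the `label_of` callback are likewise illustrative.

```python
import random
from collections import defaultdict

def stratified_split(samples, label_of, percents=(80, 10, 10), seed=0):
    """Split samples into train/validation/test while preserving label proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample in samples:
        by_label[label_of(sample)].append(sample)
    train, val, test = [], [], []
    for group in by_label.values():
        rng.shuffle(group)
        # Integer arithmetic keeps the subset sizes exact.
        n_train = len(group) * percents[0] // 100
        n_val = len(group) * percents[1] // 100
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]
    return train, val, test
```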
Further, before the initial feature is input into the preset initial recognition model, the method further comprises:
classifying the initial features to obtain discrete features and continuous features;
carrying out one-hot coding on the discrete features and carrying out normalization processing on the continuous features;
and combining the discrete features after the one-hot encoding and the continuous features after the normalization processing to obtain feature combinations.
In this embodiment, the initial features are classified into discrete features and continuous features, the discrete features are one-hot encoded, the continuous features are normalized, and the one-hot encoded discrete features and the normalized continuous features are combined to obtain a feature combination.
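The classify, encode and combine sequence can be sketched for a single sample as follows. Treating string values as discrete and numeric values as continuous is an assumption made for the example, as are the `vocab` (category list per discrete feature) and `ranges` (minimum and maximum per continuous feature) arguments, which would be fitted on the training set.

```python
def classify_features(sample):
    """Split a sample into discrete (string) and continuous (numeric) features."""
    discrete = {k: v for k, v in sample.items() if isinstance(v, str)}
    continuous = {
        k: v for k, v in sample.items()
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    }
    return discrete, continuous

def combine_features(sample, vocab, ranges):
    """One-hot encode discrete features, min-max normalize continuous ones, and concatenate."""
    discrete, continuous = classify_features(sample)
    vector = []
    for name in sorted(discrete):
        vector += [1.0 if discrete[name] == category else 0.0 for category in vocab[name]]
    for name in sorted(continuous):
        lo, hi = ranges[name]
        vector.append((continuous[name] - lo) / (hi - lo) if hi > lo else 0.0)
    return vector
```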
For discrete features, an Embedding technique is used to map each discrete feature into a low-dimensional dense vector. The purpose of this mapping is to enable the model to automatically learn the relationships between features, since it is difficult to find the correlations between features directly in a high-dimensional sparse feature space. The Embedding layer maps each discrete feature into a $d$-dimensional dense vector, where $d$ is a user-defined hyper-parameter.
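An Embedding layer can be viewed as a lookup table with one $d$-dimensional row per category, and looking up a row gives the same result as multiplying the one-hot vector by the table. The sketch below illustrates this under assumed names; the small Gaussian initialization is a common convention and is not taken from the application, and in a real model the table entries would be learned during training.

```python
import random

class Embedding:
    """Lookup table mapping each category index to a d-dimensional dense vector."""

    def __init__(self, num_categories, d, seed=0):
        rng = random.Random(seed)
        # Small random initial values; training would update these entries.
        self.table = [[rng.gauss(0.0, 0.01) for _ in range(d)] for _ in range(num_categories)]

    def lookup(self, category_index):
        return self.table[category_index]

def one_hot_times_table(one_hot, table):
    """Multiply a one-hot row vector by the table; equivalent to a row lookup."""
    d = len(table[0])
    return [sum(one_hot[i] * table[i][j] for i in range(len(table))) for j in range(d)]

emb = Embedding(num_categories=3, d=4)
dense = emb.lookup(2)
```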
For continuous features, a normalization operation is employed to scale each feature into [0,1]. Specifically, each continuous feature can first be standardized to a distribution with a mean of 0 and a variance of 1, and then mapped into [0,1] by a Sigmoid function; if a Tanh function is used instead, its output lies in (-1,1) and must be further shifted and rescaled to fall within [0,1].
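The two-stage scaling of a continuous feature, standardization followed by a Sigmoid squashing, can be sketched as below. Using the population variance (dividing by the number of values) is an assumption; the application does not say which estimator is used.

```python
import math

def standardize(values):
    """Shift and scale a column to mean 0 and variance 1 (population variance)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def normalize_continuous(values):
    """Standardize, then squash each value into the open interval (0, 1)."""
    return [sigmoid(z) for z in standardize(values)]

scaled = normalize_continuous([100.0, 200.0, 300.0])
```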
In the above embodiment, by classifying the initial features and processing the discrete features and the continuous features separately, both kinds of features can be used for prediction tasks in the DeepFM model, and the Embedding and normalization operations make the interactions between features more accurate and effective.
S202, extracting features of the preprocessed pre-training data to obtain initial features, and inputting the initial features into a preset initial recognition model, wherein the initial recognition model comprises a feature crossing unit and a feature prediction unit.
In the above embodiment, the initial recognition model is a Deep FM model, the Deep FM model includes an FM sub-model and a Deep sub-model, the feature intersection unit is an FM sub-model, the feature prediction unit is a Deep sub-model, and the Deep sub-model is an MLP model.
The DeepFM model is a deep learning model that combines a neural network with a factorization machine (Factorization Machine, FM). It combines the FM model with a multi-layer perceptron (MLP), so that cross information between features can be fully mined and nonlinear representations of high-order features can be learned, thereby improving the prediction performance of the model.
The core idea of the DeepFM model is to combine the feature-crossing portion of the FM model and the feature-embedding portion of the MLP model to form an end-to-end trainable model. Specifically, the DeepFM model first calculates the cross terms of all first-order and second-order features using the FM model, and then inputs these cross terms into the MLP model along with the original feature vectors for training. During training, the model learns the interaction relationships and nonlinear representations among different features so as to improve its prediction accuracy.
S203, calculating feature cross items among the initial features through a feature cross unit.
In this embodiment, feature cross terms between the initial features are calculated by the FM sub-model, the feature cross terms including first order cross terms and second order cross terms.
Further, referring to fig. 3, the feature cross terms between the initial features are calculated by the feature cross unit, which specifically includes:
S301, selecting any initial feature as a target initial feature;
S302, calculating the inner products of the target initial feature with each of the remaining initial features to obtain first-order cross terms of the target initial feature;
S303, obtaining the hidden vector of the FM sub-model, and multiplying the first-order cross terms by the hidden vector to obtain second-order cross terms of the target initial feature;
S304, combining the first-order cross terms and the second-order cross terms to obtain the characteristic cross terms.
In this embodiment, the feature cross terms of each initial feature are calculated through the FM sub-model: any one initial feature is selected as the target initial feature, the inner products of the target initial feature with each of the remaining initial features are calculated to obtain the first-order cross terms of the target initial feature, the hidden vector V of the FM sub-model is obtained, and the first-order cross terms are multiplied by the hidden vector V to obtain the second-order cross terms of the target initial feature.
The DeepFM model computes second-order cross information between features by means of the hidden vector V. Specifically, the DeepFM model first calculates the cross terms of all first-order and second-order features through the FM model, where the second-order cross terms are calculated by multiplying the hidden vector V of the feature by the first-order cross term. The hidden vector V can be regarded as a low-dimensional representation of the feature vector learned by the model, and can therefore better capture interactions between features.
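For reference, a factorization machine is usually written in the form below. This is the standard FM formulation from the literature rather than a formula stated in this application. Here $x_i$ is the value of the $i$-th of $n$ features, $w_0$ and $w_i$ are the bias and first-order weights, and $v_i$ is the $k$-dimensional hidden vector of feature $i$ (one row of V):

$$\hat{y}_{FM}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle \, x_i x_j$$

The pairwise sum can be evaluated in time linear in $n$ using the identity

$$\sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle \, x_i x_j = \frac{1}{2}\sum_{f=1}^{k}\left[\left(\sum_{i=1}^{n} v_{i,f}\, x_i\right)^{2} - \sum_{i=1}^{n} v_{i,f}^{2}\, x_i^{2}\right]$$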
For a sample with n features, the DeepFM model randomly initializes a k-dimensional hidden vector V for the feature vector in each feature dimension and takes it as a parameter of the FM model. Trained on the data, the DeepFM model then automatically learns the hidden vector V of each feature vector and uses these vectors to calculate the cross information between features. During training, the DeepFM model not only learns the interaction relationships between features, but can also capture higher-order feature interactions by using a deeper neural network structure.
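The role of the randomly initialized k-dimensional hidden vectors can be illustrated with the standard factorization-machine second-order term, in which the interaction weight of features i and j is the inner product of their hidden vectors. This is the textbook FM computation and is offered as an assumption about how the cross information might be computed, not as the exact procedure of steps S301 to S304. The function names are hypothetical; both a direct pairwise version and the usual linear-time rewrite are shown.

```python
import random

def init_hidden_vectors(n_features, k, seed=0):
    """Randomly initialize one k-dimensional hidden vector per feature."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.01) for _ in range(k)] for _ in range(n_features)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def second_order_naive(x, V):
    """Sum <V[i], V[j]> * x[i] * x[j] over all feature pairs i < j."""
    n = len(x)
    return sum(dot(V[i], V[j]) * x[i] * x[j] for i in range(n) for j in range(i + 1, n))

def second_order_fast(x, V):
    """Same value via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    total = 0.0
    for f in range(len(V[0])):
        weighted = [V[i][f] * x[i] for i in range(len(x))]
        total += sum(weighted) ** 2 - sum(w * w for w in weighted)
    return 0.5 * total
```

The fast form avoids the double loop over feature pairs, which matters when the one-hot encoded input has many dimensions.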
S204, taking the initial characteristics and the characteristic cross items as input data of a characteristic prediction unit, training the characteristic prediction unit by utilizing the input data, and outputting a prediction result.
In this embodiment, the initial feature and the feature cross term are spliced to obtain a full feature vector, and the feature prediction unit is trained by using the full feature vector to output a prediction result.
Further, referring to fig. 4, the MLP model includes a plurality of fully connected layers and an output layer, and taking the initial features and the feature cross terms as input data of the feature prediction unit, training the feature prediction unit by using the input data, and outputting a prediction result specifically includes:
S401, splicing the initial features and the feature cross terms to obtain a full feature vector;
S402, sequentially importing the full feature vector into the plurality of fully connected layers for feature processing to obtain a feature fitting vector;
S403, importing the feature fitting vector into the output layer to obtain the prediction result.
Further, the full feature vector is sequentially led into a plurality of full connection layers for feature processing to obtain a feature fitting vector, which specifically comprises the following steps:
importing the full feature vector into a first fully connected layer to obtain a first output vector;
performing nonlinear transformation on the first output vector;
importing the nonlinearly transformed first output vector into a second fully connected layer to obtain a second output vector;
sequentially importing the output vector of each previous fully connected layer into the next fully connected layer until the feature fitting vector output by the last fully connected layer is obtained;
before sequentially importing the output vector of the previous fully connected layer into the next fully connected layer until the feature fitting vector output by the last fully connected layer is obtained, the method further comprises:
performing nonlinear transformation on the output vector of the previous fully connected layer.
In this embodiment, the MLP model calculation process may be divided into the following steps:
1. splicing the input feature vectors to obtain a vector containing all the features;
2. inputting the spliced feature vector into the first fully connected layer to obtain a new set of feature vectors;
3. applying a nonlinear transformation to the new feature vector, generally implemented with an activation function such as ReLU;
4. inputting the transformed feature vector into the next fully connected layer to obtain a new set of feature vectors again;
5. repeating steps 3 and 4 until the final feature vector, namely the feature fitting vector, is obtained;
6. inputting the feature fitting vector into the output layer to obtain the prediction result.
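The six steps above can be sketched as a forward pass in plain Python. Two details are assumptions rather than statements from the application: ReLU is applied after every hidden fully connected layer, and the output layer uses a Sigmoid so that the prediction can be read as a probability. Weights are passed in explicitly; a trained model would supply learned values.

```python
import math

def dense(v, W, b):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(initial_features, cross_terms, hidden_layers, output_layer):
    """Steps 1 to 6: splice inputs, run hidden layers with ReLU, then the output layer."""
    h = list(initial_features) + list(cross_terms)   # step 1: full feature vector
    for W, b in hidden_layers:                       # steps 2 to 5
        h = relu(dense(h, W, b))
    W_out, b_out = output_layer                      # step 6
    return sigmoid(dense(h, W_out, b_out)[0])

# Hand-picked illustrative weights: 3 inputs -> 2 hidden neurons -> 1 output.
hidden = [([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]], [0.0, 0.0])]
output = ([[1.0, 1.0]], [-1.0])
prediction = mlp_forward([1.0, 2.0], [0.5], hidden, output)
```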
In the above embodiment, each fully connected layer uses a plurality of neurons to transform the input feature vector and extract features from it; through multiple fully connected layers and nonlinear transformations, the MLP model can learn higher-level features and thereby improve the performance of the model. Meanwhile, in the DeepFM model, the MLP model and the FM model are trained jointly, so that second-order cross information and high-order nonlinear information can be fully utilized and the prediction accuracy of the model is improved.
S205, adjusting model parameters of the initial recognition model according to the prediction result until the model is fitted, and generating a new energy vehicle owner recognition model.
In this embodiment, a prediction error is calculated according to a prediction result, and model parameters of an initial recognition model are adjusted through the prediction error until the model is fitted, so as to generate a new energy vehicle owner recognition model.
Further, according to the prediction result, the model parameters of the initial recognition model are adjusted until the model is fitted, and a new energy vehicle owner recognition model is generated, which specifically comprises the following steps:
calculating an error between a prediction result and a preset standard result based on a loss function of the MLP model to obtain a prediction error;
Comparing the prediction error with a preset error threshold;
and if the prediction error is greater than the error threshold, continuously adjusting the model parameters of the initial recognition model until the prediction error is less than or equal to the error threshold, and generating a new energy vehicle owner recognition model.
In this embodiment, the error between the prediction result and a preset standard result is calculated based on the loss function of the MLP model to obtain the prediction error, where the preset standard result is the pre-training data carrying the initial features. The prediction error is propagated through the network layers of the DeepFM model by a back propagation algorithm, and the prediction error of each network layer is compared with a preset error threshold. If the prediction error of any network layer is greater than the error threshold, the model parameters of the initial recognition model are updated by gradient descent, and the adjustment continues until the prediction errors of all network layers are less than or equal to the error threshold, at which point the new energy vehicle owner recognition model is generated.
In the above embodiment, the model parameters include the weight parameters W and the bias term b of the MLP part, and the embedding matrix and the bias term of the FM part, and the model parameters are continuously adjusted by updating with a gradient descent method until the model converges, so that the new energy vehicle owner identification model can be obtained.
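The adjust-until-threshold loop can be sketched with a deliberately small stand-in model. The example below trains a single-layer logistic model by gradient descent and stops once the mean log loss falls to the error threshold. It simplifies the description in two labeled ways: one overall loss is compared with the threshold instead of a per-layer error, and the model has no hidden layers or FM part. The learning rate, threshold and step limit are illustrative values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def mean_log_loss(w, b, X, y):
    """Average binary cross-entropy between predictions and labels."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict(w, b, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)

def train_until_threshold(X, y, error_threshold=0.1, lr=0.5, max_steps=10000):
    """Gradient descent that stops once the loss is at or below the threshold."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(max_steps):
        if mean_log_loss(w, b, X, y) <= error_threshold:
            break
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi          # gradient of the loss w.r.t. the logit
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b, mean_log_loss(w, b, X, y)
```

The step limit guards against the case where the threshold is never reached, which a loop that only tests the error would not handle.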
S206, receiving a vehicle owner identification instruction, acquiring vehicle data to be identified, importing the vehicle data to be identified into a new energy vehicle owner identification model, and outputting a vehicle owner identification result.
In this embodiment, after the training of the vehicle owner identification model of the new energy vehicle is completed, when receiving the vehicle owner identification instruction, the vehicle data to be identified can be obtained, the vehicle data to be identified is imported into the vehicle owner identification model of the new energy vehicle, and the vehicle owner identification result is output.
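At identification time the same preprocessing used during training has to be applied before the record is scored. The sketch below shows that flow with hypothetical stand-ins: `toy_preprocess` and `toy_model` are placeholders for the fitted preprocessing and the trained recognition model, and the 0.5 decision threshold is an assumption, since the application does not state how the model output is turned into an identification result.

```python
def identify_owner(vehicle_record, preprocess, model, threshold=0.5):
    """Apply the training-time preprocessing, score the record, and threshold the score."""
    features = preprocess(vehicle_record)
    score = model(features)
    return {"score": score, "is_new_energy_owner": score >= threshold}

# Stand-ins for illustration only: a trivial preprocessing step and a fixed scoring rule.
def toy_preprocess(record):
    return [
        1.0 if record["power_type"] == "EV" else 0.0,
        min(record["charge_count"] / 10.0, 1.0),
    ]

def toy_model(features):
    return 0.7 * features[0] + 0.3 * features[1]

result = identify_owner({"power_type": "EV", "charge_count": 5}, toy_preprocess, toy_model)
```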
In this embodiment, the electronic device (for example, the server shown in fig. 1) on which the new energy vehicle owner identification method operates may receive the owner identification instruction through a wired connection or a wireless connection. It should be noted that the wireless connection may include, but is not limited to, 3G/4G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra wideband) connections, and other now known or later developed wireless connection means.
In the above embodiment, the application discloses a new energy vehicle owner identification method, which belongs to the technical field of artificial intelligence and the field of financial property insurance. The method preprocesses the pre-training data and performs feature extraction on the preprocessed pre-training data to obtain initial features; inputs the initial features into an initial recognition model comprising a feature crossing unit and a feature prediction unit; calculates feature cross terms among the initial features through the feature crossing unit; takes the initial features and the feature cross terms as input data of the feature prediction unit, trains the feature prediction unit with the input data, and outputs a prediction result; adjusts model parameters of the initial recognition model according to the prediction result until the model is fitted, generating a new energy vehicle owner recognition model; and, on receiving a vehicle owner identification instruction, obtains vehicle data to be identified, imports it into the new energy vehicle owner recognition model, and outputs a vehicle owner identification result.
By adopting a vehicle owner recognition model structure comprising a feature crossing unit and a feature prediction unit, the feature cross terms are calculated by the feature crossing unit, so that the low-order features and high-order features used for model training are obtained, and the low-order and high-order features are learned simultaneously by the feature prediction unit. A large amount of tedious manual feature engineering is therefore unnecessary, which preserves model recognition accuracy while avoiding heavy manual effort and saving cost. Identifying the owners of new energy vehicles in turn helps promote sales of new energy vehicle insurance products and improve their purchase rate.
It should be emphasized that, to further ensure the privacy and security of the vehicle data to be identified, the vehicle data to be identified may also be stored in a node of a blockchain.
The blockchain referred to in the application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
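The way each block is cryptographically tied to its predecessor can be illustrated with a single-process hash chain. This sketch covers only the linking and validity check; it has no distribution, consensus mechanism or encryption, and the block layout (`index`, `prev_hash`, `payload`, `hash`) is an assumption made for the example.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64

def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps({"index": index, "prev_hash": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_block(chain, payload):
    """Create a block that references the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    }
    chain.append(block)
    return block

def verify_chain(chain):
    """Check every stored hash and every link to the previous block."""
    prev_hash = GENESIS_PREV_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["prev_hash"], block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

Changing the payload of any earlier block changes its hash, so the check fails at that block unless every later block is recomputed as well.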
Those skilled in the art will appreciate that all or part of the processes of the methods of the embodiments described above may be implemented by computer readable instructions stored on a computer readable storage medium, which, when executed, may comprise the processes of the embodiments of the methods described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk or a read-only memory (Read-Only Memory, ROM), or may be a random access memory (Random Access Memory, RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
With further reference to fig. 5, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a new energy vehicle owner identification device, where the embodiment of the device corresponds to the embodiment of the method shown in fig. 2, and the device may be specifically applied to various electronic devices.
As shown in fig. 5, the new energy vehicle owner identification device 500 according to the present embodiment includes:
the data processing module 501 is configured to obtain pre-training data, and pre-process the pre-training data, where the pre-training data is pre-collected historical vehicle data;
The feature extraction module 502 is configured to perform feature extraction on the preprocessed pre-training data to obtain initial features, and input the initial features into a preset initial recognition model, where the initial recognition model includes a feature intersection unit and a feature prediction unit;
a cross calculation module 503 for calculating feature cross items between the initial features by the feature cross unit;
the feature learning module 504 is configured to take the initial feature and the feature cross term as input data of the feature prediction unit, train the feature prediction unit by using the input data, and output a prediction result;
the model adjustment module 505 is configured to adjust model parameters of the initial recognition model according to the prediction result until the model is fitted, so as to generate a new energy vehicle owner recognition model;
the vehicle owner identification module 506 is configured to receive a vehicle owner identification instruction, obtain vehicle data to be identified, import the vehicle data to be identified into a new energy vehicle owner identification model, and output a vehicle owner identification result.
Further, preprocessing includes data screening, data cleansing, normalization, and data set partitioning, and the data processing module 501 specifically includes:
the data importing unit is used for importing the pre-training data from a preset database;
The data screening unit is used for carrying out data screening on the pre-training data according to a preset data screening rule;
the data preprocessing unit is used for carrying out data cleaning and data standardization processing on the screened pre-training data;
the data dividing unit is used for dividing the data set of the pre-training data subjected to data cleaning and data standardization processing to obtain a training set, a verification set and a test set.
Further, the new energy vehicle owner identification device 500 further includes:
the feature classification module is used for classifying the initial features to obtain discrete features and continuous features;
the feature coding module is used for carrying out one-hot coding on the discrete features and carrying out normalization processing on the continuous features;
and the feature normalization module is used for combining the discrete features after the one-hot encoding and the continuous features after the normalization processing to obtain feature combinations.
Further, the initial recognition model is a Deep FM model, the Deep FM model includes an FM sub-model and a Deep sub-model, the feature intersection unit is an FM sub-model, the feature prediction unit is a Deep sub-model, and the intersection calculation module 503 specifically includes:
a feature selection unit for selecting any one of the initial features as a target initial feature;
The first-order intersection unit is used for respectively calculating the inner products of the target initial characteristics and the rest initial characteristics to obtain first-order intersection items of the target initial characteristics;
the second-order intersection unit is used for acquiring hidden vectors of the FM submodel and multiplying the first-order intersection item and the hidden vectors to obtain a second-order intersection item of the initial characteristic of the target;
and the cross item combination unit is used for combining the first-order cross item and the second-order cross item to obtain the characteristic cross item.
Further, the Deep sub-model is an MLP model, and the MLP model includes a plurality of fully connected layers and an output layer, and the feature learning module 504 specifically includes:
the feature splicing unit is used for splicing the initial feature and the feature cross item to obtain a full feature vector;
the feature fitting unit is used for sequentially guiding the full feature vectors into a plurality of full connection layers to perform feature processing to obtain feature fitting vectors;
and the result output unit is used for guiding the characteristic fitting vector into the output layer to obtain a prediction result.
Further, the feature fitting unit specifically includes:
the first vector input subunit is used for guiding the full feature vector into a first full connection layer to obtain a first output vector;
a nonlinear transformation subunit, configured to perform nonlinear transformation on the first output vector;
A second vector input subunit, configured to guide the first output vector after nonlinear transformation into a second full-connection layer to obtain a second output vector;
the cyclic input subunit is used for sequentially importing the output vector of each previous fully connected layer into the next fully connected layer until the feature fitting vector output by the last fully connected layer is obtained;
the feature fitting unit further includes:
and the cyclic conversion subunit is used for performing nonlinear transformation on the output vector of the previous fully connected layer.
Further, the model adjustment module 505 specifically includes:
the error calculation unit is used for calculating an error between a prediction result and a preset standard result based on a loss function of the MLP model to obtain a prediction error;
the error comparison unit is used for comparing the prediction error with a preset error threshold value;
and the parameter adjusting unit is used for continuously adjusting the model parameters of the initial recognition model when the prediction error is larger than the error threshold value until the prediction error is smaller than or equal to the error threshold value, so as to generate a new energy vehicle owner recognition model.
In the above embodiment, the application discloses a new energy vehicle owner identification device, which belongs to the technical field of artificial intelligence and the field of financial property insurance. The application preprocesses the pre-training data and performs feature extraction on the preprocessed pre-training data to obtain initial features; inputs the initial features into an initial recognition model comprising a feature crossing unit and a feature prediction unit; calculates feature cross terms among the initial features through the feature crossing unit; takes the initial features and the feature cross terms as input data of the feature prediction unit, trains the feature prediction unit with the input data, and outputs a prediction result; adjusts model parameters of the initial recognition model according to the prediction result until the model is fitted, generating a new energy vehicle owner recognition model; and, on receiving a vehicle owner identification instruction, obtains vehicle data to be identified, imports it into the new energy vehicle owner recognition model, and outputs a vehicle owner identification result. By adopting a vehicle owner recognition model structure comprising a feature crossing unit and a feature prediction unit, the feature cross terms are calculated by the feature crossing unit, so that the low-order features and high-order features used for model training are obtained, and the low-order and high-order features are learned simultaneously by the feature prediction unit. A large amount of tedious manual feature engineering is therefore unnecessary, which preserves model recognition accuracy while avoiding heavy manual effort and saving cost.
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 6, fig. 6 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 6 comprises a memory 61, a processor 62 and a network interface 63 communicatively connected to each other via a system bus. It is noted that only the computer device 6 having components 61-63 is shown in the figure, but it should be understood that not all of the illustrated components are required and that more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device herein is a device capable of automatically performing numerical calculations and/or information processing in accordance with predetermined or stored instructions, the hardware of which includes, but is not limited to, microprocessors, application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), digital signal processors (Digital Signal Processor, DSP), embedded devices, etc.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 61 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 61 may be an internal storage unit of the computer device 6, such as a hard disk or a memory of the computer device 6. In other embodiments, the memory 61 may also be an external storage device of the computer device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card (Flash Card) or the like provided on the computer device 6. Of course, the memory 61 may also comprise both an internal storage unit of the computer device 6 and an external storage device. In this embodiment, the memory 61 is generally used to store an operating system and various application software installed on the computer device 6, such as computer readable instructions of the new energy vehicle owner identification method. Further, the memory 61 may be used to temporarily store various types of data that have been output or are to be output.
The processor 62 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 62 is typically used to control the overall operation of the computer device 6. In this embodiment, the processor 62 is configured to execute computer readable instructions stored in the memory 61 or process data, such as computer readable instructions for executing the method for identifying a vehicle owner of the new energy vehicle.
The network interface 63 may comprise a wireless network interface or a wired network interface, which network interface 63 is typically used for establishing a communication connection between the computer device 6 and other electronic devices.
In the above embodiment, the application discloses a computer device, which belongs to the technical field of artificial intelligence and the field of financial production. The method preprocesses the pre-training data and performs feature extraction on the preprocessed pre-training data to obtain initial features. The initial features are input into an initial recognition model, which comprises a feature crossing unit and a feature prediction unit. The feature crossing unit calculates the feature cross terms between the initial features; the initial features and the feature cross terms are used as input data of the feature prediction unit, which is trained on this input data and outputs a prediction result. The model parameters of the initial recognition model are adjusted according to the prediction result until the model is fitted, generating a new energy vehicle owner recognition model. When a vehicle owner identification instruction is received, the vehicle data to be identified is acquired and imported into the new energy vehicle owner recognition model, and the vehicle owner identification result is output. Because the vehicle owner recognition model comprises a feature crossing unit and a feature prediction unit, the feature cross terms calculated by the feature crossing unit supply both the low-order features and the high-order features used for model training, and the feature prediction unit learns from the low-order and high-order features at the same time. No large amount of tedious manual feature engineering is needed, so the recognition accuracy of the model is ensured while a large manpower investment is avoided and costs are saved.
The present application also provides another embodiment, namely, a computer readable storage medium, where computer readable instructions are stored, where the computer readable instructions are executable by at least one processor, so that the at least one processor performs the steps of the method for identifying a vehicle owner of a new energy vehicle as described above.
In the above embodiment, the present application discloses a storage medium, which belongs to the technical field of artificial intelligence and the field of financial production. The method preprocesses the pre-training data and performs feature extraction on the preprocessed pre-training data to obtain initial features. The initial features are input into an initial recognition model, which comprises a feature crossing unit and a feature prediction unit. The feature crossing unit calculates the feature cross terms between the initial features; the initial features and the feature cross terms are used as input data of the feature prediction unit, which is trained on this input data and outputs a prediction result. The model parameters of the initial recognition model are adjusted according to the prediction result until the model is fitted, generating a new energy vehicle owner recognition model. When a vehicle owner identification instruction is received, the vehicle data to be identified is acquired and imported into the new energy vehicle owner recognition model, and the vehicle owner identification result is output. Because the vehicle owner recognition model comprises a feature crossing unit and a feature prediction unit, the feature cross terms calculated by the feature crossing unit supply both the low-order features and the high-order features used for model training, and the feature prediction unit learns from the low-order and high-order features at the same time. No large amount of tedious manual feature engineering is needed, so the recognition accuracy of the model is ensured while a large manpower investment is avoided and costs are saved.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk) and comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present application.
The subject application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
It is apparent that the embodiments described above are only some, not all, of the embodiments of the present application. The drawings show preferred embodiments of the present application but do not limit its patent scope. This application may be embodied in many different forms; these embodiments are provided so that the disclosure will be understood more thoroughly. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or substitute equivalents for some of their technical features. Any equivalent structure made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the protection scope of the present application.

Claims (10)

1. A new energy vehicle owner identification method, characterized by comprising the following steps:
acquiring pre-training data and preprocessing the pre-training data, wherein the pre-training data is historical vehicle data collected in advance;
extracting features from the preprocessed pre-training data to obtain initial features, and inputting the initial features into a preset initial recognition model, wherein the initial recognition model comprises a feature crossing unit and a feature prediction unit;
calculating feature cross terms between the initial features by the feature crossing unit;
taking the initial features and the feature cross terms as input data of the feature prediction unit, training the feature prediction unit using the input data, and outputting a prediction result;
adjusting model parameters of the initial recognition model according to the prediction result until the model is fitted, and generating a new energy vehicle owner recognition model;
receiving a vehicle owner identification instruction, acquiring vehicle data to be identified, importing the vehicle data to be identified into the new energy vehicle owner recognition model, and outputting a vehicle owner identification result.
2. The new energy vehicle owner identification method according to claim 1, wherein the preprocessing comprises data screening, data cleaning, data standardization and data set division, and acquiring the pre-training data and preprocessing the pre-training data specifically comprises:
importing the pre-training data from a preset database;
performing data screening on the pre-training data according to a preset data screening rule;
performing data cleaning and data standardization on the screened pre-training data;
performing data set division on the pre-training data that has undergone data cleaning and data standardization to obtain a training set, a verification set and a test set.
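The preprocessing order in claim 2 (screen, clean, standardize, divide) can be sketched as follows. This is only an illustrative sketch: the screening rule, the use of z-score standardization, the 80/10/10 split ratios and all function names are assumptions, since the claim does not specify them.

```python
import random
import statistics

def screen(rows, rule):
    """Data screening: keep only rows accepted by a preset rule (a predicate)."""
    return [r for r in rows if rule(r)]

def clean(rows):
    """Data cleaning (assumed here to mean dropping rows with missing values)."""
    return [r for r in rows if all(v is not None for v in r)]

def standardize(rows):
    """Z-score each column: (x - mean) / std; constant columns map to 0.0."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(r, means, stds)]
        for r in rows
    ]

def split(rows, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle, then divide into training, verification and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * ratios[0])
    n_val = int(len(rows) * ratios[1])
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

# Hypothetical raw data: 20 valid rows, one row with a missing value,
# and one malformed row that the screening rule rejects.
raw = [[float(i), float(i * 2)] for i in range(20)] + [[None, 1.0], [5.0]]
screened = screen(raw, rule=lambda r: len(r) == 2)
train, val, test = split(standardize(clean(screened)))
```

The sketch standardizes before dividing because that is the order the claim gives; an implementation that computes the standardization statistics on the training set only would be an equally plausible reading.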
3. The new energy vehicle owner identification method according to claim 1, further comprising, before inputting the initial features into the preset initial recognition model:
classifying the initial features to obtain discrete features and continuous features;
performing one-hot encoding on the discrete features and normalization on the continuous features;
combining the one-hot encoded discrete features and the normalized continuous features to obtain a feature combination.
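As a minimal sketch of the encoding in claim 3, the snippet below one-hot encodes a discrete feature, min-max normalizes a continuous feature, and concatenates the two. The field names (`fuel_type`, `annual_mileage_km`), the category list, the value range and the choice of min-max normalization are hypothetical; the claim names none of them.

```python
def one_hot(value, categories):
    """One-hot encode a discrete value against a fixed category list."""
    return [1.0 if value == c else 0.0 for c in categories]

def min_max(value, lo, hi):
    """Normalize a continuous value into [0, 1] given its known range."""
    return (value - lo) / (hi - lo) if hi > lo else 0.0

def encode(record, categories, ranges):
    """Build the feature combination: one-hot parts first, then normalized parts."""
    vec = []
    for name, cats in categories.items():
        vec.extend(one_hot(record[name], cats))
    for name, (lo, hi) in ranges.items():
        vec.append(min_max(record[name], lo, hi))
    return vec

# Hypothetical schema and record, for illustration only.
categories = {"fuel_type": ["petrol", "electric", "hybrid"]}
ranges = {"annual_mileage_km": (0.0, 50000.0)}
record = {"fuel_type": "electric", "annual_mileage_km": 12500.0}

features = encode(record, categories, ranges)
print(features)  # [0.0, 1.0, 0.0, 0.25]
```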
4. The new energy vehicle owner identification method according to claim 1, wherein the initial recognition model is a DeepFM model, the DeepFM model comprises an FM sub-model and a Deep sub-model, the feature crossing unit is the FM sub-model, the feature prediction unit is the Deep sub-model, and calculating the feature cross terms between the initial features by the feature crossing unit specifically comprises:
selecting any one of the initial features as a target initial feature;
calculating the inner products of the target initial feature with each of the remaining initial features to obtain first-order cross terms of the target initial feature;
obtaining the hidden vector of the FM sub-model, and multiplying the first-order cross terms by the hidden vector to obtain second-order cross terms of the target initial feature;
combining the first-order cross terms and the second-order cross terms to obtain the feature cross terms.
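For orientation, the pairwise interaction that a factorization machine (FM) conventionally computes can be sketched as below. Note that this is the standard FM second-order term, the sum over feature pairs i < j of <v_i, v_j> * x_i * x_j, and not a literal transcription of the claim wording; treating the claim's "hidden vector" as the per-feature latent vectors v_i is an assumption, as are the function names and example values.

```python
def fm_pairwise(x, v):
    """Direct form: sum over i < j of <v_i, v_j> * x_i * x_j."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            total += dot * x[i] * x[j]
    return total

def fm_fast(x, v):
    """Equivalent form that is linear in the number of features:
    0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]."""
    total = 0.0
    for f in range(len(v[0])):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += s * s - s_sq
    return 0.5 * total

# Three features, each with a 2-dimensional latent vector (made-up values).
x = [1.0, 0.0, 2.0]
v = [[0.5, 1.0], [1.0, -1.0], [2.0, 0.5]]

cross_direct = fm_pairwise(x, v)
cross_fast = fm_fast(x, v)
```

Both functions give the same value; the second is the rearrangement usually used in practice because it avoids the explicit loop over all feature pairs.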
5. The new energy vehicle owner identification method according to claim 4, wherein the Deep sub-model is an MLP model, the MLP model comprises a plurality of fully connected layers and an output layer, and taking the initial features and the feature cross terms as input data of the feature prediction unit, training the feature prediction unit using the input data, and outputting a prediction result specifically comprises:
concatenating the initial features and the feature cross terms to obtain a full feature vector;
sequentially importing the full feature vector into the plurality of fully connected layers for feature processing to obtain a feature fitting vector;
importing the feature fitting vector into the output layer to obtain the prediction result.
6. The new energy vehicle owner identification method according to claim 5, wherein sequentially importing the full feature vector into the plurality of fully connected layers for feature processing to obtain the feature fitting vector specifically comprises:
importing the full feature vector into a first fully connected layer to obtain a first output vector;
performing nonlinear transformation on the first output vector;
importing the nonlinearly transformed first output vector into a second fully connected layer to obtain a second output vector;
sequentially importing the output vector of the previous fully connected layer into the next fully connected layer until the feature fitting vector output by the final fully connected layer is obtained;
wherein, before the output vector of the previous fully connected layer is imported into the next fully connected layer, the method further comprises:
performing nonlinear transformation on the output vector of the previous fully connected layer.
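The forward pass described in claims 5 and 6 can be sketched as a small multilayer perceptron. The claims do not name the nonlinear transformation or the output activation, so ReLU and a sigmoid are assumptions here, as are the function names and the toy weights; applying ReLU after the final hidden layer as well is also a simplification of this sketch.

```python
import math

def dense(vec, weights, bias):
    """One fully connected layer: weights @ vec + bias."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def relu(vec):
    """Nonlinear transformation applied between layers (assumed to be ReLU)."""
    return [max(0.0, x) for x in vec]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(full_vector, hidden_layers, output_layer):
    """Pass the full feature vector through each fully connected layer in turn,
    then map the resulting feature fitting vector to a probability."""
    h = full_vector
    for weights, bias in hidden_layers:
        h = relu(dense(h, weights, bias))
    out_weights, out_bias = output_layer
    return sigmoid(dense(h, [out_weights], [out_bias])[0])

# Toy parameters: one hidden layer with two units, then the output layer.
full_vector = [1.0, 2.0]  # stands in for initial features + cross terms
hidden_layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])]
output_layer = ([2.0, 1.0], -1.5)

prediction = forward(full_vector, hidden_layers, output_layer)
print(prediction)  # 0.5
```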
7. The new energy vehicle owner identification method according to claim 5, wherein adjusting the model parameters of the initial recognition model according to the prediction result until the model is fitted, and generating the new energy vehicle owner recognition model, specifically comprises:
calculating the error between the prediction result and a preset standard result based on the loss function of the MLP model to obtain a prediction error;
comparing the prediction error with a preset error threshold;
if the prediction error is greater than the error threshold, continuing to adjust the model parameters of the initial recognition model until the prediction error is less than or equal to the error threshold, and generating the new energy vehicle owner recognition model.
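The stopping rule in claim 7 (keep adjusting parameters while the prediction error exceeds a preset threshold) can be illustrated with a deliberately small stand-in model. The sketch below trains a single logistic unit by gradient descent rather than a full MLP; the binary cross-entropy loss, the learning rate, the threshold value, the epoch cap and the toy data are all assumptions made for illustration.

```python
import math

def predict(w, b, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(w, b, data):
    """Mean binary cross-entropy between predictions and the standard results."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = min(max(predict(w, b, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

def train(data, threshold=0.1, lr=0.5, max_epochs=10000):
    """Adjust parameters until the prediction error is <= threshold
    (or until max_epochs, a safety cap the claim does not mention)."""
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(max_epochs):
        if log_loss(w, b, data) <= threshold:
            break
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in data:
            diff = predict(w, b, x) - y
            grad_w = [g + diff * xi for g, xi in zip(grad_w, x)]
            grad_b += diff
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(data)
    return w, b, log_loss(w, b, data)

# Toy data: the label equals the first feature, so the classes are separable.
data = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
w, b, final_error = train(data)
```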
8. A new energy vehicle owner identification device, characterized by comprising:
a data processing module, used for acquiring pre-training data and preprocessing the pre-training data, wherein the pre-training data is historical vehicle data collected in advance;
a feature extraction module, used for extracting features from the preprocessed pre-training data to obtain initial features, and inputting the initial features into a preset initial recognition model, wherein the initial recognition model comprises a feature crossing unit and a feature prediction unit;
a cross calculation module, used for calculating feature cross terms between the initial features by the feature crossing unit;
a feature learning module, used for taking the initial features and the feature cross terms as input data of the feature prediction unit, training the feature prediction unit using the input data, and outputting a prediction result;
a model adjustment module, used for adjusting model parameters of the initial recognition model according to the prediction result until the model is fitted, so as to generate a new energy vehicle owner recognition model;
a vehicle owner identification module, used for receiving a vehicle owner identification instruction, acquiring vehicle data to be identified, importing the vehicle data to be identified into the new energy vehicle owner recognition model, and outputting a vehicle owner identification result.
9. A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which when executed by the processor implement the steps of the new energy vehicle owner identification method of any one of claims 1 to 7.
10. A computer readable storage medium having stored thereon computer readable instructions which when executed by a processor perform the steps of the new energy vehicle owner identification method of any of claims 1 to 7.
CN202310649173.5A 2023-06-02 2023-06-02 New energy vehicle owner identification method, device, equipment and storage medium Pending CN116451125A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310649173.5A CN116451125A (en) 2023-06-02 2023-06-02 New energy vehicle owner identification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116451125A true CN116451125A (en) 2023-07-18

Family

ID=87124009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310649173.5A Pending CN116451125A (en) 2023-06-02 2023-06-02 New energy vehicle owner identification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116451125A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118013060A (en) * 2024-03-19 2024-05-10 腾讯科技(深圳)有限公司 Data processing method, device, equipment, storage medium and product

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111311372A (en) * 2020-03-04 2020-06-19 支付宝(杭州)信息技术有限公司 User identification method and device
US20220188366A1 (en) * 2020-12-15 2022-06-16 NantMedia Holdings, LLC Combined Wide And Deep Machine Learning Models For Automated Database Element Processing Systems, Methods And Apparatuses
CN114971675A (en) * 2022-04-06 2022-08-30 北京科技大学 Second-hand car price evaluation method based on deep FM model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
高华玲等: "《推荐算法及应用》", 31 January 2021, 北京邮电大学出版社, pages: 73 - 74 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination