CN113516205A - Data classification method, device, equipment and storage medium based on artificial intelligence - Google Patents


Info

Publication number
CN113516205A
CN113516205A (application CN202111029679.3A)
Authority
CN
China
Prior art keywords
data
feature
result
sample
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111029679.3A
Other languages
Chinese (zh)
Other versions
CN113516205B (en)
Inventor
任杰 (Ren Jie)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202111029679.3A priority Critical patent/CN113516205B/en
Publication of CN113516205A publication Critical patent/CN113516205A/en
Application granted granted Critical
Publication of CN113516205B publication Critical patent/CN113516205B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to artificial intelligence and provides a data classification method, apparatus, device, and storage medium based on artificial intelligence. The method obtains a plurality of initial samples, each comprising a sample user's sample values over a plurality of data features and a user result; performs discretization on the sample values to obtain a discrete result; performs kernel density estimation analysis on the discrete result to obtain feature distribution information for each data feature; extracts information from each piece of feature distribution information to obtain a training sample; determines a feature weight for each data feature from the discrete result and the user result; generates a sample result from the training sample and the feature weights; adjusts a preset network based on the training sample and the sample result to obtain a classification model; obtains request features; and processes the request features with the classification model to obtain a classification result. The invention can improve the accuracy of the classification result. In addition, the invention also relates to blockchain technology: the classification result can be stored in a blockchain.

Description

Data classification method, device, equipment and storage medium based on artificial intelligence
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a data classification method, a data classification device, data classification equipment and a storage medium based on artificial intelligence.
Background
Evaluating employee stability is a key problem in enterprise employee relationship management research; accurate stability evaluation helps an enterprise better realize the value of its employees.
At present, schemes for classifying user stability generally train a classification model directly on acquired data samples and then evaluate user stability with that model. However, because the data samples used for training are collected directly from real scenarios, they suffer from imbalanced sample distribution, so the accuracy of the trained classification model is low and employee stability cannot be evaluated accurately.
Disclosure of Invention
In view of the above, it is desirable to provide a data classification method, apparatus, device and storage medium based on artificial intelligence, which can improve the accuracy of the classification result.
In one aspect, the present invention provides an artificial intelligence based data classification method, including:
obtaining a plurality of initial samples, wherein each initial sample comprises sample values of a sample user in a plurality of data characteristics and a user result corresponding to the sample user;
performing discrete processing on the sample value to obtain a discrete result;
performing kernel density estimation analysis on the discrete result based on the plurality of data characteristics to obtain characteristic distribution information of each data characteristic;
extracting information in each feature distribution information to obtain a training sample;
determining a feature weight of each data feature according to the discrete result and the user result;
generating a sample result of the training sample according to the training sample and the feature weight;
adjusting a preset network based on the training sample and the sample result to obtain a classification model;
when a classification request is received, acquiring request characteristics according to the classification request;
and processing the request characteristics according to the classification model to obtain a classification result of the classification request.
According to a preferred embodiment of the present invention, the discretizing the sample values to obtain a discretization result includes:
detecting a data type of the sample value;
selecting, from the sample values, those whose data type is numerical as first numerical values, and those whose data type is not numerical as information to be processed;
acquiring data characteristics to which the information to be processed belongs as target characteristics, and acquiring a plurality of preset ranges of the target characteristics;
discretizing the information to be processed into the numerical value corresponding to the preset range in which it falls, to obtain a second numerical value;
determining the first numerical value and the second numerical value as the discrete result.
According to a preferred embodiment of the present invention, the performing a kernel density estimation analysis on the discrete result based on the plurality of data features to obtain feature distribution information of each data feature includes:
for each data feature, acquiring an attribute feature of the data feature;
selecting a kernel function corresponding to the attribute characteristics from preset functions;
calculating the feature distribution information of the data feature according to the following formula based on the kernel function and the discrete result:
f̂(x) = (1/(n·h)) · Σ_{i=1}^{n} K((x − x_i)/h) ;
where f̂(x) refers to the feature distribution information, n refers to the number of discrete results, h refers to the bandwidth, x_i refers to the i-th discrete result, and K refers to the kernel function.
According to a preferred embodiment of the present invention, the extracting information in each of the feature distribution information to obtain a training sample includes:
randomly selecting any feature distribution information from the plurality of feature distribution information as target feature distribution information, and determining the plurality of feature distribution information except the target feature distribution information as the rest feature distribution information;
randomly extracting any feature data from the target feature distribution information, and extracting initial feature data from the rest feature distribution information;
calculating the coexistence probability of the arbitrary feature data and the initial feature data;
determining the initial characteristic data with the coexistence probability larger than a preset probability threshold value as target characteristic data;
and determining the combination of the arbitrary characteristic data and each target characteristic data as the training sample.
According to a preferred embodiment of the present invention, the calculating the coexistence probability of the arbitrary feature data and the initial feature data in the remaining feature distribution information includes:
calculating the information correlation degree of the target characteristic distribution information and the rest characteristic distribution information;
calculating a first data probability of the arbitrary feature data in the target feature distribution information, and calculating a second data probability of the initial feature data in the rest feature distribution information;
and calculating the product of the information correlation degree, the first data probability and the second data probability to obtain the coexistence probability.
According to the preferred embodiment of the present invention, the determining the feature weight of each data feature according to the discrete result and the user result includes:
discretizing the user result to obtain a labeling result;
based on the labeling result and the discrete result, calculating the feature weight according to the following formula:
y = w_1·x_1 + w_2·x_2 + … + w_k·x_k ;
where y refers to the labeling result, w_1, w_2, …, w_k refer to the feature weights, and x_1, x_2, …, x_k refer to the discrete results corresponding to the respective data features.
According to a preferred embodiment of the present invention, the adjusting the preset network based on the training samples and the sample results to obtain the classification model includes:
determining classification scenes corresponding to the plurality of initial samples;
acquiring a network matched with the classification scene from a classification network library as the preset network;
inputting the training sample into the preset network to obtain a prediction result;
calculating a network loss value of the preset network based on the prediction result and the sample result;
and adjusting the network parameters of the preset network according to the network loss value until the network loss value is not reduced any more, and obtaining the classification model.
On the other hand, the invention also provides a data classification device based on artificial intelligence, which comprises:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a plurality of initial samples, and each initial sample comprises a sample value of a sample user in a plurality of data characteristics and a user result corresponding to the sample user;
the discrete unit is used for carrying out discrete processing on the sample value to obtain a discrete result;
the analysis unit is used for performing kernel density estimation analysis on the discrete result based on the plurality of data features to obtain feature distribution information of each data feature;
the extraction unit is used for extracting information in each feature distribution information to obtain a training sample;
a determining unit, configured to determine a feature weight of each data feature according to the discrete result and the user result;
the generating unit is used for generating a sample result of the training sample according to the training sample and the characteristic weight;
the adjusting unit is used for adjusting a preset network based on the training sample and the sample result to obtain a classification model;
the obtaining unit is further used for obtaining request characteristics according to the classification request when the classification request is received;
and the processing unit is used for processing the request characteristics according to the classification model to obtain a classification result of the classification request.
In another aspect, the present invention further provides an electronic device, including:
a memory storing computer readable instructions; and
a processor executing computer readable instructions stored in the memory to implement the artificial intelligence based data classification method.
In another aspect, the present invention also provides a computer-readable storage medium having computer-readable instructions stored therein, which are executed by a processor in an electronic device to implement the artificial intelligence based data classification method.
According to the above technical scheme, performing kernel density estimation analysis on the discrete results allows the feature distribution information to be distributed evenly over the data features, which improves the balance of the training samples and therefore the robustness of the classification model. Because the sample results can be generated from the training samples and the feature weights, the training samples do not need to be labeled repeatedly, which improves the efficiency of generating sample results; at the same time, analyzing the sample results from the data according to the sample probability and the feature weights improves the accuracy of the sample results, which in turn improves the accuracy of the classification model and thus the accuracy of the determined classification results.
Drawings
FIG. 1 is a flow chart of the data classification method based on artificial intelligence according to the preferred embodiment of the invention.
FIG. 2 is a functional block diagram of a preferred embodiment of the data classification apparatus based on artificial intelligence according to the present invention.
FIG. 3 is a schematic structural diagram of an electronic device implementing an artificial intelligence-based data classification method according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
FIG. 1 is a flow chart of a preferred embodiment of the data classification method based on artificial intelligence according to the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
The artificial intelligence based data classification method can acquire and process related data based on artificial intelligence technology. Artificial Intelligence (AI) refers to the theories, methods, techniques, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The data classification method based on artificial intelligence is applied to one or more electronic devices, which are devices capable of automatically performing numerical calculation and/or information processing according to computer readable instructions set or stored in advance. The hardware of the electronic devices includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a user, for example, a Personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an interactive Internet Protocol Television (IPTV), a smart wearable device, and the like.
The electronic device may include a network device and/or a user device. Wherein the network device includes, but is not limited to, a single network electronic device, an electronic device group consisting of a plurality of network electronic devices, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network electronic devices.
The network in which the electronic device is located includes, but is not limited to: the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
S10, obtaining a plurality of initial samples, wherein each initial sample comprises sample values of a sample user in a plurality of data characteristics and a user result corresponding to the sample user.
In at least one embodiment of the invention, the sample user may be any employee of the enterprise.
The plurality of data features may include the number of clients contacted, the number of meetings attended, the frequency of company APP usage, job age, educational background, age, the capacity of the group the user belongs to, capacity ranking, and the like.
Accordingly, the sample value refers to information corresponding to the sample user in the plurality of data features.
The user result refers to the stability situation of the sample user; for example, the user result may be "high stability".
And S11, performing discrete processing on the sample values to obtain discrete results.
In at least one embodiment of the present invention, the discretization result includes information obtained by discretizing a non-numerical sample value. The discrete result also includes a sample value of a numerical type.
In at least one embodiment of the present invention, the electronic device performs a discretization process on the sample values, and obtaining a discretization result includes:
detecting a data type of the sample value;
selecting, from the sample values, those whose data type is numerical as first numerical values, and those whose data type is not numerical as information to be processed;
acquiring data characteristics to which the information to be processed belongs as target characteristics, and acquiring a plurality of preset ranges of the target characteristics;
discretizing the information to be processed into the numerical value corresponding to the preset range in which it falls, to obtain a second numerical value;
determining the first numerical value and the second numerical value as the discrete result.
Wherein the data types include: character type, numerical type, etc.
The preset ranges are generated according to a plurality of feature values of the target feature; for example, the preset ranges may include [primary school, junior college] and [bachelor's degree, graduate degree]. The plurality of feature values refer to the values the target feature can take; for example, if the target feature is educational background, the plurality of feature values may include primary school, bachelor's degree, and the like.
Discretizing only the information to be processed, without processing the first numerical values, improves discretization efficiency; at the same time, discretizing the information to be processed turns the information into data, improving analysis accuracy.
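As an illustrative sketch (not part of the patent text), the discretization step S11 can be expressed as follows; the feature values, range boundaries, and scores below are assumptions for demonstration only.

```python
# Preset ranges for a non-numerical target feature: each range maps a
# set of feature values to one discrete score (values are illustrative).
EDUCATION_RANGES = [
    ({"primary school", "junior college"}, 1.0),
    ({"bachelor's degree", "graduate degree"}, 2.0),
]

def discretize(value, preset_ranges):
    """Numerical sample values pass through unchanged (first numerical
    value); non-numerical values are mapped to the score of the preset
    range they fall in (second numerical value)."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return float(value)
    for levels, score in preset_ranges:
        if value in levels:
            return score
    raise ValueError(f"no preset range covers {value!r}")

print(discretize(29, EDUCATION_RANGES))                   # 29.0
print(discretize("bachelor's degree", EDUCATION_RANGES))  # 2.0
```

Together, the unchanged numerical values and the mapped scores form the discrete result.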
And S12, performing kernel density estimation analysis on the discrete result based on the plurality of data characteristics to obtain characteristic distribution information of each data characteristic.
In at least one embodiment of the present invention, the characteristic distribution information may be a distribution curve of the discrete results on the data characteristic, for example, the characteristic distribution information may be a normal distribution curve.
In at least one embodiment of the present invention, the performing, by the electronic device, a kernel density estimation analysis on the discrete result based on the plurality of data features, and obtaining feature distribution information of each data feature includes:
for each data feature, acquiring an attribute feature of the data feature;
selecting a kernel function corresponding to the attribute characteristics from preset functions;
calculating the feature distribution information of the data feature according to the following formula based on the kernel function and the discrete result:
f̂(x) = (1/(n·h)) · Σ_{i=1}^{n} K((x − x_i)/h) ;
where f̂(x) refers to the feature distribution information, n refers to the number of discrete results, h refers to the bandwidth, x_i refers to the i-th discrete result, and K refers to the kernel function.
The attribute feature refers to a characteristic of the data feature. For example, if the data feature is age, the attribute feature is a distribution characteristic conforming to a normal distribution; if the data feature is a sales performance feature, the attribute feature is a fixed-price summation characteristic.
The preset functions may include a Gaussian function, a triangular function, and the like.
The kernel function is the preset function matched to the attribute feature. For example, if the data feature is age and the attribute feature is a distribution characteristic conforming to a normal distribution, the kernel function is a Gaussian function; if the data feature is a sales performance feature and the attribute feature is a fixed-price summation characteristic, the kernel function is a triangular function.
An appropriate kernel function can thus be accurately selected for each data feature through its attribute feature, improving the accuracy of the feature distribution information.
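The kernel density estimation of S12 can be sketched as follows, assuming the standard estimator f̂(x) = (1/(n·h)) · Σ K((x − x_i)/h); the bandwidth and the sample ages are illustrative choices, not values from the patent.

```python
import numpy as np

def gaussian_kernel(u):
    """Gaussian kernel, used for features with a normal-distribution attribute."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def triangle_kernel(u):
    """Triangular kernel, an alternative preset function."""
    return np.maximum(1.0 - np.abs(u), 0.0)

def kde(x, discrete_results, h, kernel):
    """Feature distribution value at x computed from the discrete results."""
    xi = np.asarray(discrete_results, dtype=float)
    n = xi.size
    return kernel((x - xi) / h).sum() / (n * h)

ages = [22, 25, 25, 28, 30, 41, 58]   # discrete results for the "age" feature
density = kde(27.0, ages, h=5.0, kernel=gaussian_kernel)
print(round(float(density), 4))
```

Evaluating this estimator over a grid of x values yields the feature distribution curve from which training data is later drawn.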
And S13, extracting the information in each feature distribution information to obtain a training sample.
In at least one embodiment of the present invention, the training sample contains information drawn from the feature distribution information.
Specifically, the training sample includes the arbitrary feature data and each of the target feature data.
In at least one embodiment of the present invention, the extracting, by the electronic device, information in each of the feature distribution information to obtain a training sample includes:
randomly selecting any feature distribution information from the plurality of feature distribution information as target feature distribution information, and determining the plurality of feature distribution information except the target feature distribution information as the rest feature distribution information;
randomly extracting any feature data from the target feature distribution information, and extracting initial feature data from the rest feature distribution information;
calculating the coexistence probability of the arbitrary feature data and the initial feature data;
determining the initial characteristic data with the coexistence probability larger than a preset probability threshold value as target characteristic data;
and determining the combination of the arbitrary characteristic data and each target characteristic data as the training sample.
Wherein the plurality of feature distribution information includes the target feature distribution information and the remaining feature distribution information.
The arbitrary feature data refers to any piece of information in the target feature distribution information; for example, if the target feature distribution information is the distribution of ages from 0 to 100, the arbitrary feature data may be 8 years old.
The initial feature data refers to any piece of information in the remaining feature distribution information; for example, if the remaining feature distribution information is the distribution of educational backgrounds from primary school to doctorate, the initial feature data may be a bachelor's degree.
The coexistence probability refers to a probability that the arbitrary feature data and the initial feature data exist in the sample user at the same time.
The preset probability threshold is set according to actual requirements.
The target feature data is the initial feature data whose coexistence probability with the arbitrary feature data is greater than the preset probability threshold. Selecting the target feature data according to the coexistence probability avoids mutually exclusive information between the arbitrary feature data and the target feature data; for example, if the arbitrary feature data is 2 years old, the target feature data cannot be a doctoral degree.
Selecting the target feature data from the initial feature data by coexistence probability, and then generating the training sample from the arbitrary feature data and the target feature data, improves the plausibility of the training sample and avoids mutually exclusive information between the arbitrary feature data and the target feature data.
Specifically, the calculating, by the electronic device, a coexistence probability of the arbitrary feature data and initial feature data in the remaining feature distribution information includes:
calculating the information correlation degree of the target characteristic distribution information and the rest characteristic distribution information;
calculating a first data probability of the arbitrary feature data in the target feature distribution information, and calculating a second data probability of the initial feature data in the rest feature distribution information;
and calculating the product of the information correlation degree, the first data probability and the second data probability to obtain the coexistence probability.
By analyzing the coexistence probability in combination with the information correlation degree between the data features, the accuracy of the coexistence probability can be improved.
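The coexistence-probability computation and the training-sample assembly of S13 can be sketched as follows; the candidate data, correlation degrees, probabilities, and threshold are illustrative assumptions.

```python
def coexistence_probability(info_correlation, p_first, p_second):
    """Product of the information correlation degree, the first data
    probability, and the second data probability."""
    return info_correlation * p_first * p_second

def build_training_sample(any_feature_data, candidates, threshold):
    """candidates: (initial_feature_data, correlation, p1, p2) tuples.
    Keep each candidate whose coexistence probability with the arbitrary
    feature data exceeds the preset probability threshold."""
    targets = [d for d, corr, p1, p2 in candidates
               if coexistence_probability(corr, p1, p2) > threshold]
    return [any_feature_data] + targets

candidates = [
    ("bachelor's degree", 0.9, 0.6, 0.5),   # probability 0.27 -> kept
    ("doctoral degree",   0.9, 0.6, 0.01),  # probability 0.0054 -> excluded
]
print(build_training_sample("age 28", candidates, threshold=0.05))
```

The excluded candidate illustrates how a low coexistence probability filters out mutually exclusive combinations.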
And S14, determining the feature weight of each data feature according to the discrete result and the user result.
In at least one embodiment of the present invention, the feature weight refers to a weight occupied by the data feature in terms of user stability.
In at least one embodiment of the present invention, the determining, by the electronic device, the feature weight of each data feature according to the discrete result and the user result includes:
discretizing the user result to obtain a labeling result;
based on the labeling result and the discrete result, calculating the feature weight according to the following formula:
y = w_1·x_1 + w_2·x_2 + … + w_k·x_k ;
where y refers to the labeling result, w_1, w_2, …, w_k refer to the feature weights, and x_1, x_2, …, x_k refer to the discrete results corresponding to the respective data features.
In this way, the feature weights can be accurately quantified based on the mathematical relationship between the discrete results and the user results.
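The exact weight formula is not legible in this text; assuming the labeling result is a linear combination of the discrete results, the feature weights can be recovered by ordinary least squares, as in the following sketch with synthetic data.

```python
import numpy as np

rng = np.random.default_rng(7)
true_w = np.array([0.5, 0.2, 0.3])   # hidden feature weights (illustrative)
X = rng.normal(size=(200, 3))        # discrete results, one row per sample user
y = X @ true_w                       # labeling results under the linear model

# Solve y = X w for the feature weights in the least-squares sense.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(w, 3))
```

With noise-free linear data the recovered weights match the hidden ones; real discrete results would yield an approximate fit.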
And S15, generating a sample result of the training sample according to the training sample and the feature weight.
In at least one embodiment of the present invention, the specific manner in which the electronic device generates the sample result of the training sample from the training sample and the feature weights is the inverse of the specific manner in which it determines the feature weight of each data feature from the discrete result and the user result, and is not described in detail here.
And S16, adjusting a preset network based on the training sample and the sample result to obtain a classification model.
In at least one embodiment of the present invention, the classification model refers to a model obtained after the preset network is adjusted.
In at least one embodiment of the present invention, the adjusting, by the electronic device, a preset network based on the training sample and the sample result to obtain a classification model includes:
determining classification scenes corresponding to the plurality of initial samples;
acquiring a network matched with the classification scene from a classification network library as the preset network;
inputting the training sample into the preset network to obtain a prediction result;
calculating a network loss value of the preset network based on the prediction result and the sample result;
and adjusting the network parameters of the preset network according to the network loss value until the network loss value is not reduced any more, and obtaining the classification model.
Wherein the classification network base stores a plurality of networks for data classification.
The preset network can be generated after the unbalanced samples are adjusted by a pre-constructed learner.
The network parameter may include a learning rate of the preset network, and the like.
Since the preset network is matched with the classification scene, it can be adjusted directly without additional parameter tuning, which improves the adjustment efficiency of the classification model; meanwhile, adjusting the network parameters through the network loss value ensures the classification accuracy of the classification model.
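The adjustment loop above can be sketched as follows. A single linear layer with squared loss stands in for the preset network; both, along with the learning rate and stopping tolerance, are illustrative assumptions rather than the embodiment's actual network.

```python
import numpy as np

# Minimal sketch of the adjustment loop in step S16: repeatedly compute the
# prediction result, the network loss value, and a parameter update, stopping
# once the loss no longer decreases. The linear model and squared loss are
# stand-ins for the preset network, chosen only for illustration.
def adjust_network(X, y, lr=0.1, tol=1e-9):
    X, y = np.asarray(X, float), np.asarray(y, float)
    w = np.zeros(X.shape[1])                 # network parameters
    prev = float("inf")
    while True:
        pred = X @ w                         # prediction result
        loss = np.mean((pred - y) ** 2)      # network loss value
        if loss >= prev - tol:               # loss is no longer reduced
            return w                         # the classification model
        prev = loss
        grad = 2 * X.T @ (pred - y) / len(y)
        w -= lr * grad                       # adjust the network parameters
```

For instance, `adjust_network([[1.0], [2.0]], [2.0, 4.0])` converges to a weight near 2.0 before the loop detects that the loss has stopped decreasing.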
S17, when a classification request is received, request characteristics are obtained according to the classification request.
In at least one embodiment of the invention, the classification request may be triggered by any senior-management user in the enterprise.
The classification request carries a user identification code and the like.
The request characteristics refer to the information corresponding to the plurality of data characteristics for the user whose stability is to be classified in the classification request.
In at least one embodiment of the present invention, the obtaining, by the electronic device, the request feature according to the classification request includes:
analyzing the message of the classification request to obtain data information carried by the message;
extracting a user identification code from the data information;
and extracting information corresponding to the plurality of data characteristics from a user information base according to the user identification code to be used as the request characteristics.
The user identification code refers to a tag capable of identifying a user. For example, the user identification code may be a job number.
And S18, processing the request characteristics according to the classification model to obtain the classification result of the classification request.
In at least one embodiment of the present invention, the classification result refers to the stability situation of the user whose stability is to be analyzed in the classification request.
It is emphasized that the classification result may also be stored in a node of a blockchain in order to further ensure the privacy and security of the classification result.
In at least one embodiment of the present invention, the electronic device discretizes the request feature and inputs the discretized result into the classification model to obtain the classification result.
According to the technical scheme, performing kernel density estimation analysis on the discrete results allows the feature distribution information to be distributed uniformly over the data features, which improves the balance of the training sample and thereby the robustness of the classification model. The sample results can be generated based on the training sample and the feature weight without repeatedly labeling the training sample, which improves the generation efficiency of the sample results. Meanwhile, the sample results are derived from the data according to the sample probability and the feature weight, which improves the accuracy of the sample results and hence of the classification model, further improving the determination accuracy of the classification results.
FIG. 2 is a functional block diagram of a preferred embodiment of the data classifying apparatus based on artificial intelligence according to the present invention. The artificial intelligence based data classification device 11 includes an acquisition unit 110, a discretization unit 111, an analysis unit 112, an extraction unit 113, a determination unit 114, a generation unit 115, an adjustment unit 116, and a processing unit 117. The module/unit referred to herein is a series of computer readable instruction segments that can be accessed by the processor 13 and perform a fixed function and that are stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
The obtaining unit 110 obtains a plurality of initial samples, each of which includes a sample value of a sample user in a plurality of data characteristics and a user result corresponding to the sample user.
In at least one embodiment of the invention, the sample user may be any employee of the enterprise.
The plurality of data characteristics may include the number of clients contacted, the number of meetings attended, the frequency of company APP usage, length of service, education level, age, the capacity of the group where the user is located, the capacity ranking, and the like.
Accordingly, the sample value refers to information corresponding to the sample user in the plurality of data features.
The user result refers to the stability situation of the sample user; for example, the user result may be "high stability".
The discretization unit 111 performs discretization processing on the sample values to obtain discrete results.
In at least one embodiment of the present invention, the discretization result includes information obtained by discretizing a non-numerical sample value. The discrete result also includes a sample value of a numerical type.
In at least one embodiment of the present invention, the discretizing unit 111 performs discretization on the sample values, and obtaining a discretization result includes:
detecting a data type of the sample value;
screening sample values with the data types being numerical types from the sample values to serve as first numerical values, and screening sample values with the data types not being the numerical types from the sample values to serve as information to be processed;
acquiring data characteristics to which the information to be processed belongs as target characteristics, and acquiring a plurality of preset ranges of the target characteristics;
dispersing the information to be processed into numerical values corresponding to the preset ranges to obtain a second numerical value;
determining the first numerical value and the second numerical value as the discrete result.
Wherein the data types include: character type, numerical type, etc.
The preset ranges are generated according to a plurality of feature values of the target feature; for example, the preset ranges may include [primary school, junior college], [bachelor's degree, postgraduate], and the like. The plurality of feature values refer to the values corresponding to the target feature; for example, if the target feature is education level, the plurality of feature values may include primary school, bachelor's degree, and the like.
The information to be processed is subjected to discrete processing without processing the first numerical value, so that the discrete efficiency can be improved, and meanwhile, the information to be processed is subjected to discrete processing, so that the information datamation can be realized, and the analysis accuracy is improved.
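The type-splitting and range-scoring procedure above can be sketched as follows. The feature name, the range table, and the score values are illustrative assumptions, not taken from the embodiment.

```python
# Hypothetical sketch of the discretization step: numeric sample values pass
# through unchanged (the first numerical values), while non-numeric values
# are mapped to the score of the preset range they fall into (the second
# numerical values). The range table below is an assumed example.
PRESET_RANGES = {
    "education_level": {           # range label -> (members, numerical value)
        "low":  ({"primary school", "middle school"}, 1),
        "high": ({"bachelor", "postgraduate"}, 2),
    },
}

def discretize(sample_values):
    """sample_values: list of (data_feature, value) pairs -> discrete result."""
    result = []
    for feature, value in sample_values:
        if isinstance(value, (int, float)):      # numerical type: keep as-is
            result.append(value)
        else:                                    # information to be processed
            for members, score in PRESET_RANGES[feature].values():
                if value in members:             # falls into a preset range
                    result.append(score)
                    break
    return result
```

For example, `discretize([("age", 30), ("education_level", "bachelor")])` yields the mixed numeric list `[30, 2]`.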
The analysis unit 112 performs kernel density estimation analysis on the discrete result based on the plurality of data features to obtain feature distribution information of each data feature.
In at least one embodiment of the present invention, the characteristic distribution information may be a distribution curve of the discrete results on the data characteristic, for example, the characteristic distribution information may be a normal distribution curve.
In at least one embodiment of the present invention, the analyzing unit 112 performs a kernel density estimation analysis on the discrete result based on the plurality of data features, and obtaining feature distribution information of each data feature includes:
for each data feature, acquiring an attribute feature of the data feature;
selecting a kernel function corresponding to the attribute characteristics from preset functions;
calculating the feature distribution information of the data feature according to the following formula based on the kernel function and the discrete result:
f(x) = (1/(n*h)) * Σ_{i=1}^{n} K((x - x_i)/h);
wherein f(x) refers to the feature distribution information, n refers to the number of the discrete results, x_1, x_2, ..., x_n refer to the discrete results, x_i refers to the i-th one of the discrete results, h refers to the bandwidth of the kernel function, and K refers to the kernel function.
Wherein the attribute feature refers to a characteristic of the data feature. For example, if the data feature is age, the attribute feature is a distribution characteristic conforming to a normal distribution; if the data feature is a sales performance characteristic, the attribute feature is a fixed-pricing summing characteristic.
The preset function may include a gaussian function, a Triangle function, a trigonometric function, and the like.
The kernel function is the preset function matched with the attribute feature. For example, if the data feature is age and the attribute feature is a distribution characteristic conforming to a normal distribution, the kernel function is a Gaussian function; if the data feature is a sales performance characteristic and the attribute feature is a fixed-pricing summing characteristic, the kernel function is a Triangle function.
And a proper kernel function can be accurately selected for the data characteristics through the attribute characteristics, so that the accuracy of the characteristic distribution information is improved.
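As a hedged illustration of this step, the sketch below selects a kernel from the attribute characteristic and evaluates the standard kernel density estimate over the discrete results. The attribute-to-kernel mapping keys and the default bandwidth are assumptions for illustration only.

```python
import numpy as np

# Sketch of the kernel density estimation step: a kernel is matched to the
# data feature's attribute characteristic, then the standard estimate
# f(x) = (1/(n*h)) * sum_i K((x - x_i)/h) is evaluated over the discrete
# results. The mapping keys and bandwidth h are illustrative assumptions.
def gaussian_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

def triangle_kernel(u):
    return np.maximum(1.0 - np.abs(u), 0.0)

KERNELS = {"normal": gaussian_kernel, "summing": triangle_kernel}

def kde(x, discrete_results, attribute, h=1.0):
    k = KERNELS[attribute]                       # kernel matched to attribute
    xs = np.asarray(discrete_results, dtype=float)
    return float(k((x - xs) / h).sum() / (len(xs) * h))
```

Evaluating `kde` over a grid of points yields the distribution curve that serves as the feature distribution information.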
The extracting unit 113 extracts information in each of the feature distribution information to obtain a training sample.
In at least one embodiment of the present invention, the training sample contains information extracted from each piece of feature distribution information.
Specifically, the training sample includes the arbitrary feature data and each piece of target feature data.
In at least one embodiment of the present invention, the extracting unit 113 extracts information in each of the feature distribution information, and obtaining the training sample includes:
randomly selecting any feature distribution information from the plurality of feature distribution information as target feature distribution information, and determining the plurality of feature distribution information except the target feature distribution information as the rest feature distribution information;
randomly extracting any feature data from the target feature distribution information, and extracting initial feature data from the rest feature distribution information;
calculating the coexistence probability of the arbitrary feature data and the initial feature data;
determining the initial characteristic data with the coexistence probability larger than a preset probability threshold value as target characteristic data;
and determining the combination of the arbitrary characteristic data and each target characteristic data as the training sample.
Wherein the plurality of feature distribution information includes the target feature distribution information and the remaining feature distribution information.
The arbitrary feature data refers to arbitrary information in the target feature distribution information, for example, if the target feature distribution information is data with an age of 0 to 100 years, the arbitrary feature data may be 8 years old.
The coexistence probability refers to a probability that the arbitrary feature data and the initial feature data exist in the sample user at the same time.
The preset probability threshold is set according to actual requirements.
The target feature data is the initial feature data whose coexistence probability with the arbitrary feature data is greater than the preset probability threshold. Selecting the target feature data according to the coexistence probability avoids exclusion information between the arbitrary feature data and the target feature data; for example, if the arbitrary feature data is an age of 2 years, the target feature data cannot be a doctoral student.
Selecting the target feature data from the initial feature data through the coexistence probability and then generating the training sample according to the arbitrary feature data and the target feature data improves the rationality of the training sample and avoids exclusion information between the arbitrary feature data and the target feature data.
Specifically, the calculating, by the extracting unit 113, a coexistence probability of the arbitrary feature data and the initial feature data in the remaining feature distribution information includes:
calculating the information correlation degree of the target characteristic distribution information and the rest characteristic distribution information;
calculating a first data probability of the arbitrary feature data in the target feature distribution information, and calculating a second data probability of the initial feature data in the rest feature distribution information;
and calculating the product of the information correlation degree, the first data probability and the second data probability to obtain the coexistence probability.
By analyzing the coexistence probability in combination with the information correlation degree between the data features, the accuracy of the coexistence probability can be improved.
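The product described above can be sketched as follows. Using the absolute Pearson correlation as the "information correlation degree" is an assumption, since the embodiment does not fix the measure; the candidate values in the usage example are likewise hypothetical.

```python
import numpy as np

# Hedged sketch of the coexistence-probability step: the probability that the
# arbitrary feature data and a piece of initial feature data occur in the same
# sample user is approximated by the product of the information correlation
# degree between the two feature distributions and the two data probabilities.
def coexistence_probability(target_dist, other_dist, p_first, p_second):
    corr = abs(np.corrcoef(target_dist, other_dist)[0, 1])  # assumed measure
    return corr * p_first * p_second

def select_target_data(candidates, threshold):
    """Keep initial feature data whose coexistence probability with the
    arbitrary feature data exceeds the preset probability threshold."""
    return [value for value, prob in candidates if prob > threshold]
```

For two perfectly correlated distributions, `coexistence_probability([1, 2, 3, 4], [2, 4, 6, 8], 0.5, 0.4)` reduces to the product of the two data probabilities, 0.2.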
The determining unit 114 determines a feature weight of each data feature according to the discrete result and the user result.
In at least one embodiment of the present invention, the feature weight refers to a weight occupied by the data feature in terms of user stability.
In at least one embodiment of the present invention, the determining unit 114 determines the feature weight of each data feature according to the discrete result and the user result, including:
discretizing the user result to obtain a labeling result;
based on the labeling result and the discrete result, calculating the feature weight according to the following formula:
y = w_1*x_1 + w_2*x_2 + ... + w_n*x_n;
wherein y refers to the labeling result, w_1, w_2, ..., w_n respectively refer to the feature weights, and x_1, x_2, ..., x_n refer to the discrete result value corresponding to each data feature.
Through this implementation, the feature weight can be accurately quantified based on the mathematical relationship between the discrete result and the user result.
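Assuming the linear relation y = w_1*x_1 + ... + w_n*x_n between the discrete results of the data features and the discretized labeling result, the feature weights can be recovered numerically. Solving that relation by least squares over all sample users is an illustrative choice, not a method mandated by the embodiment.

```python
import numpy as np

# Sketch of determining the feature weights from the discrete results and the
# labeling results, under the assumed linear relation y = sum_i w_i * x_i.
# Least squares is used here only for illustration.
def feature_weights(discrete_results, labels):
    """discrete_results: (n_samples, n_features); labels: (n_samples,)."""
    X = np.asarray(discrete_results, dtype=float)
    y = np.asarray(labels, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)   # solve y = X @ w
    return w
```

The same relation, run forward with known weights, would generate a sample result from a training sample, which matches the inverse relationship described above.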
The generating unit 115 generates a sample result of the training sample according to the training sample and the feature weight.
In at least one embodiment of the present invention, the specific manner in which the generating unit 115 generates the sample result of the training sample according to the training sample and the feature weight is the inverse of the specific manner in which the determining unit 114 determines the feature weight of each data feature according to the discrete result and the user result, and is therefore not described in detail herein.
The adjusting unit 116 adjusts a preset network based on the training samples and the sample results to obtain a classification model.
In at least one embodiment of the present invention, the classification model refers to a model obtained after the preset network is adjusted.
In at least one embodiment of the present invention, the adjusting unit 116 adjusts a preset network based on the training samples and the sample results, and obtaining the classification model includes:
determining classification scenes corresponding to the plurality of initial samples;
acquiring a network matched with the classification scene from a classification network library as the preset network;
inputting the training sample into the preset network to obtain a prediction result;
calculating a network loss value of the preset network based on the prediction result and the sample result;
and adjusting the network parameters of the preset network according to the network loss value until the network loss value is not reduced any more, and obtaining the classification model.
Wherein the classification network base stores a plurality of networks for data classification.
The preset network can be generated after the unbalanced samples are adjusted by a pre-constructed learner.
The network parameter may include a learning rate of the preset network, and the like.
Since the preset network is matched with the classification scene, it can be adjusted directly without additional parameter tuning, which improves the adjustment efficiency of the classification model; meanwhile, adjusting the network parameters through the network loss value ensures the classification accuracy of the classification model.
When a classification request is received, the obtaining unit 110 obtains request characteristics according to the classification request.
In at least one embodiment of the invention, the classification request may be triggered by any senior-management user in the enterprise.
The classification request carries a user identification code and the like.
The request characteristics refer to the information corresponding to the plurality of data characteristics for the user whose stability is to be classified in the classification request.
In at least one embodiment of the present invention, the obtaining unit 110 obtains the request feature according to the classification request, including:
analyzing the message of the classification request to obtain data information carried by the message;
extracting a user identification code from the data information;
and extracting information corresponding to the plurality of data characteristics from a user information base according to the user identification code to be used as the request characteristics.
The user identification code refers to a tag capable of identifying a user. For example, the user identification code may be a job number.
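A hedged sketch of this lookup: parse the message of the classification request, extract the user identification code (e.g. a job number), and read that user's values for the plurality of data features from a user information base. The message layout, the feature names, and the in-memory "user information base" are all illustrative assumptions.

```python
import json

# Hypothetical request-feature extraction: the message is assumed to be a
# JSON payload carrying the user identification code, and the user
# information base is modeled as a dictionary keyed by that code.
USER_INFO_BASE = {
    "E1024": {"age": 30, "length_of_service": 5, "meetings_attended": 12},
}
DATA_FEATURES = ["age", "length_of_service", "meetings_attended"]

def get_request_features(message: str):
    data = json.loads(message)           # analyse the request message
    user_id = data["user_id"]            # extract the user identification code
    record = USER_INFO_BASE[user_id]     # user information base lookup
    return [record[f] for f in DATA_FEATURES]
```

The returned list is then discretized and fed to the classification model, as the following step describes.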
The processing unit 117 processes the request features according to the classification model to obtain a classification result of the classification request.
In at least one embodiment of the present invention, the classification result refers to the stability situation of the user whose stability is to be analyzed in the classification request.
It is emphasized that the classification result may also be stored in a node of a blockchain in order to further ensure the privacy and security of the classification result.
In at least one embodiment of the present invention, the processing unit 117 discretizes the request feature and inputs the discretized result into the classification model to obtain the classification result.
According to the technical scheme, performing kernel density estimation analysis on the discrete results allows the feature distribution information to be distributed uniformly over the data features, which improves the balance of the training sample and thereby the robustness of the classification model. The sample results can be generated based on the training sample and the feature weight without repeatedly labeling the training sample, which improves the generation efficiency of the sample results. Meanwhile, the sample results are derived from the data according to the sample probability and the feature weight, which improves the accuracy of the sample results and hence of the classification model, further improving the determination accuracy of the classification results.
Fig. 3 is a schematic structural diagram of an electronic device according to a preferred embodiment of the present invention for implementing an artificial intelligence-based data classification method.
In one embodiment of the present invention, the electronic device 1 includes, but is not limited to, a memory 12, a processor 13, and computer readable instructions, such as an artificial intelligence based data classification program, stored in the memory 12 and executable on the processor 13.
It will be appreciated by a person skilled in the art that the schematic diagram is only an example of the electronic device 1 and does not constitute a limitation of the electronic device 1; it may comprise more or fewer components than shown, some components may be combined, or different components may be used. For example, the electronic device 1 may further comprise an input/output device, a network access device, a bus, and the like.
The Processor 13 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, etc. The processor 13 is an operation core and a control center of the electronic device 1, and is connected to each part of the whole electronic device 1 by various interfaces and lines, and executes an operating system of the electronic device 1 and various installed application programs, program codes, and the like.
Illustratively, the computer readable instructions may be partitioned into one or more modules/units that are stored in the memory 12 and executed by the processor 13 to implement the present invention. The one or more modules/units may be a series of computer readable instruction segments capable of performing specific functions, which are used for describing the execution process of the computer readable instructions in the electronic device 1. For example, the computer readable instructions may be divided into an acquisition unit 110, a discrete unit 111, an analysis unit 112, an extraction unit 113, a determination unit 114, a generation unit 115, an adjustment unit 116, and a processing unit 117.
The memory 12 may be used for storing the computer readable instructions and/or modules, and the processor 13 implements various functions of the electronic device 1 by executing or executing the computer readable instructions and/or modules stored in the memory 12 and invoking data stored in the memory 12. The memory 12 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to use of the electronic device, and the like. The memory 12 may include non-volatile and volatile memories, such as: a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other storage device.
The memory 12 may be an external memory and/or an internal memory of the electronic device 1. Further, the memory 12 may be a memory having a physical form, such as a memory stick, a TF Card (Trans-flash Card), or the like.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the above embodiments may be implemented by computer readable instructions instructing related hardware; the computer readable instructions may be stored in a computer readable storage medium, and when the computer readable instructions are executed by a processor, the steps of the method embodiments may be implemented.
Wherein the computer readable instructions comprise computer readable instruction code which may be in source code form, object code form, an executable file or some intermediate form, and the like. The computer-readable medium may include: any entity or device capable of carrying said computer readable instruction code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM).
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
With reference to fig. 1, the memory 12 in the electronic device 1 stores computer-readable instructions to implement an artificial intelligence based data classification method, and the processor 13 can execute the computer-readable instructions to implement:
obtaining a plurality of initial samples, wherein each initial sample comprises sample values of a sample user in a plurality of data characteristics and a user result corresponding to the sample user;
performing discrete processing on the sample value to obtain a discrete result;
performing kernel density estimation analysis on the discrete result based on the plurality of data characteristics to obtain characteristic distribution information of each data characteristic;
extracting information in each feature distribution information to obtain a training sample;
determining a feature weight of each data feature according to the discrete result and the user result;
generating a sample result of the training sample according to the training sample and the feature weight;
adjusting a preset network based on the training sample and the sample result to obtain a classification model;
when a classification request is received, acquiring request characteristics according to the classification request;
and processing the request characteristics according to the classification model to obtain a classification result of the classification request.
Specifically, the processor 13 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the computer readable instructions, which is not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The computer readable storage medium has computer readable instructions stored thereon, wherein the computer readable instructions when executed by the processor 13 are configured to implement the steps of:
obtaining a plurality of initial samples, wherein each initial sample comprises sample values of a sample user in a plurality of data characteristics and a user result corresponding to the sample user;
performing discrete processing on the sample value to obtain a discrete result;
performing kernel density estimation analysis on the discrete result based on the plurality of data characteristics to obtain characteristic distribution information of each data characteristic;
extracting information in each feature distribution information to obtain a training sample;
determining a feature weight of each data feature according to the discrete result and the user result;
generating a sample result of the training sample according to the training sample and the feature weight;
adjusting a preset network based on the training sample and the sample result to obtain a classification model;
when a classification request is received, acquiring request characteristics according to the classification request;
and processing the request characteristics according to the classification model to obtain a classification result of the classification request.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. The plurality of units or devices may also be implemented by one unit or device through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. An artificial intelligence based data classification method, characterized in that the artificial intelligence based data classification method comprises:
obtaining a plurality of initial samples, wherein each initial sample comprises sample values of a sample user in a plurality of data characteristics and a user result corresponding to the sample user;
performing discrete processing on the sample value to obtain a discrete result;
performing kernel density estimation analysis on the discrete result based on the plurality of data characteristics to obtain characteristic distribution information of each data characteristic;
extracting information in each feature distribution information to obtain a training sample;
determining a feature weight of each data feature according to the discrete result and the user result;
generating a sample result of the training sample according to the training sample and the feature weight;
adjusting a preset network based on the training sample and the sample result to obtain a classification model;
when a classification request is received, acquiring request characteristics according to the classification request;
and processing the request characteristics according to the classification model to obtain a classification result of the classification request.
2. The artificial intelligence based data classification method of claim 1, wherein the performing discrete processing on the sample value to obtain a discrete result comprises:
detecting a data type of the sample value;
screening sample values with the data types being numerical types from the sample values to serve as first numerical values, and screening sample values with the data types not being the numerical types from the sample values to serve as information to be processed;
acquiring data characteristics to which the information to be processed belongs as target characteristics, and acquiring a plurality of preset ranges of the target characteristics;
discretizing the information to be processed into the scores corresponding to the preset ranges to obtain second numerical values;
and determining the first numerical values and the second numerical values as the discrete result.
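The type-based split of claim 2 can be sketched as follows. The score mapping for non-numeric values is a hypothetical example, since the patent does not publish its actual preset ranges or scores.

```python
# Minimal sketch of the discretization in claim 2: numeric sample values pass
# through as "first numerical values"; non-numeric values are mapped to the
# score of the preset range they belong to ("second numerical values").

def discretize(sample_values, preset_scores):
    numeric = lambda v: isinstance(v, (int, float)) and not isinstance(v, bool)
    first = [v for v in sample_values if numeric(v)]                      # first numerical values
    second = [preset_scores[v] for v in sample_values if not numeric(v)]  # second numerical values
    return first + second                                                 # the discrete result

scores = {"low": 0, "mid": 1, "high": 2}  # hypothetical preset ranges/scores
print(discretize([3.5, "low", 7, "high"], scores))  # [3.5, 7, 0, 2]
```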
3. The artificial intelligence based data classification method of claim 1, wherein the performing a kernel density estimation analysis on the discrete results based on the plurality of data features to obtain feature distribution information of each data feature comprises:
for each data feature, acquiring an attribute feature of the data feature;
selecting a kernel function corresponding to the attribute characteristics from preset functions;
calculating the feature distribution information of the data feature according to the following formula based on the kernel function and the discrete result:
f(x) = (1/(n*h)) * Σ_{i=1..n} K((x − x_i)/h) ;
wherein f(x) refers to the feature distribution information, n refers to the number of the discrete results, x refers to a discrete result, x_i refers to the i-th discrete result, h refers to the bandwidth, and K refers to the kernel function.
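The formula of claim 3 is the standard kernel density estimate. A minimal sketch with a Gaussian kernel follows; the kernel choice and bandwidth value are illustrative assumptions, since the claim leaves the kernel function to be selected per attribute feature.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, a common choice of kernel function K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, discrete_results, h, kernel=gaussian_kernel):
    """Estimate f(x) = 1/(n*h) * sum_i K((x - x_i)/h) over the discrete results."""
    n = len(discrete_results)
    return sum(kernel((x - xi) / h) for xi in discrete_results) / (n * h)

# Evaluate the estimated feature distribution at one point (h = 0.5 is an assumption).
density = kde(1.0, [0.0, 1.0, 2.0], h=0.5)
```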
4. The artificial intelligence based data classification method of claim 1, wherein the extracting information in each feature distribution information to obtain training samples comprises:
randomly selecting any feature distribution information from the plurality of feature distribution information as target feature distribution information, and determining the plurality of feature distribution information except the target feature distribution information as the rest feature distribution information;
randomly extracting any feature data from the target feature distribution information, and extracting initial feature data from the rest feature distribution information;
calculating the coexistence probability of the arbitrary feature data and the initial feature data;
determining the initial characteristic data with the coexistence probability larger than a preset probability threshold value as target characteristic data;
and determining the combination of the arbitrary characteristic data and each target characteristic data as the training sample.
5. The artificial intelligence based data classification method of claim 4, wherein the calculating the coexistence probability of the arbitrary feature data and the initial feature data in the remaining feature distribution information comprises:
calculating the information correlation degree of the target characteristic distribution information and the rest characteristic distribution information;
calculating a first data probability of the arbitrary feature data in the target feature distribution information, and calculating a second data probability of the initial feature data in the rest feature distribution information;
and calculating the product of the information correlation degree, the first data probability and the second data probability to obtain the coexistence probability.
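Claims 4 and 5 together amount to a product-and-threshold selection. The sketch below assumes the information correlation and the per-datum probabilities have already been computed, as the claims do not specify how.

```python
def coexistence_probability(info_correlation, first_prob, second_prob):
    """Claim 5: product of the information correlation and the two data probabilities."""
    return info_correlation * first_prob * second_prob

def select_target_data(info_correlation, first_prob, candidates, threshold):
    """Claim 4: keep the initial feature data whose coexistence probability with
    the arbitrary feature data exceeds the preset probability threshold.
    candidates: list of (feature_data, second_data_probability) pairs."""
    return [data for data, prob in candidates
            if coexistence_probability(info_correlation, first_prob, prob) > threshold]

# 0.8 * 0.5 * 0.9 = 0.36 > 0.2, so only "a" survives the threshold.
targets = select_target_data(0.8, 0.5, [("a", 0.9), ("b", 0.1)], threshold=0.2)  # ["a"]
```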
6. The artificial intelligence based data classification method of claim 1, wherein the determining a feature weight for each data feature according to the discretized results and the user results comprises:
discretizing the user result to obtain a labeling result;
based on the labeling result and the discrete result, calculating the feature weight according to the following formula:
[the formula is published only as embedded images and is not reproducible here];
wherein the symbols of the formula denote the labeling result, the discrete results, and the feature weight value corresponding to each data feature.
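Because the weight formula of claim 6 survives only as images, the sketch below illustrates the general idea (weights derived from the discrete results and the labeling result) with a common stand-in, normalized absolute Pearson correlation. This is an assumption, not the patented formula.

```python
import math

def feature_weights(feature_columns, labels):
    """Weight each data feature by |Pearson correlation| between its discrete
    values and the labeling results, normalized to sum to one (a stand-in)."""
    def corr(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        sy = math.sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy) if sx and sy else 0.0
    raw = [abs(corr(col, labels)) for col in feature_columns]
    total = sum(raw) or 1.0
    return [w / total for w in raw]

# A constant feature carries no information about the labels, so it gets weight 0.
weights = feature_weights([[1, 2, 3], [1, 1, 1]], [1, 2, 3])
```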
7. The artificial intelligence based data classification method of claim 1, wherein the adjusting a predetermined network based on the training samples and the sample results to obtain a classification model comprises:
determining classification scenes corresponding to the plurality of initial samples;
acquiring a network matched with the classification scene from a classification network library as the preset network;
inputting the training sample into the preset network to obtain a prediction result;
calculating a network loss value of the preset network based on the prediction result and the sample result;
and adjusting the network parameters of the preset network according to the network loss value until the network loss value no longer decreases, to obtain the classification model.
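The adjustment loop of claim 7 can be sketched as gradient descent that stops once the loss no longer decreases. The one-parameter "network" and squared-error loss below are illustrative assumptions standing in for the preset network selected from the classification network library.

```python
def train(samples, results, lr=0.1, max_epochs=100):
    """Adjust parameter w until the loss value stops decreasing (claim 7)."""
    w = 0.0                      # the single network parameter of this toy model
    prev_loss = float("inf")
    for _ in range(max_epochs):
        # mean squared error between predictions w*x and sample results y
        loss = sum((w * x - y) ** 2 for x, y in zip(samples, results)) / len(samples)
        if loss >= prev_loss:    # network loss value no longer decreasing: stop
            break
        prev_loss = loss
        grad = sum(2 * (w * x - y) * x for x, y in zip(samples, results)) / len(samples)
        w -= lr * grad           # adjust the network parameter
    return w

# With results = 2 * samples, training converges toward w = 2.
model_w = train([1.0, 2.0], [2.0, 4.0])
```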
8. An artificial intelligence based data classification apparatus, characterized in that the artificial intelligence based data classification apparatus comprises:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a plurality of initial samples, and each initial sample comprises a sample value of a sample user in a plurality of data characteristics and a user result corresponding to the sample user;
the discrete unit is used for carrying out discrete processing on the sample value to obtain a discrete result;
the analysis unit is used for carrying out kernel density estimation analysis on the discrete result based on the plurality of data characteristics to obtain characteristic distribution information of each data characteristic;
the extraction unit is used for extracting information in each feature distribution information to obtain a training sample;
a determining unit, configured to determine a feature weight of each data feature according to the discrete result and the user result;
the generating unit is used for generating a sample result of the training sample according to the training sample and the characteristic weight;
the adjusting unit is used for adjusting a preset network based on the training sample and the sample result to obtain a classification model;
the obtaining unit is further used for obtaining request characteristics according to the classification request when the classification request is received;
and the processing unit is used for processing the request characteristics according to the classification model to obtain a classification result of the classification request.
9. An electronic device, characterized in that the electronic device comprises:
a memory storing computer readable instructions; and
a processor executing computer readable instructions stored in the memory to implement the artificial intelligence based data classification method of any of claims 1 to 7.
10. A computer-readable storage medium characterized by: the computer-readable storage medium has stored therein computer-readable instructions that are executed by a processor in an electronic device to implement the artificial intelligence based data classification method of any of claims 1 to 7.
CN202111029679.3A 2021-09-03 2021-09-03 Employee stability classification method based on artificial intelligence and related equipment Active CN113516205B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111029679.3A CN113516205B (en) 2021-09-03 2021-09-03 Employee stability classification method based on artificial intelligence and related equipment


Publications (2)

Publication Number Publication Date
CN113516205A true CN113516205A (en) 2021-10-19
CN113516205B CN113516205B (en) 2021-12-14

Family

ID=78063201

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111029679.3A Active CN113516205B (en) 2021-09-03 2021-09-03 Employee stability classification method based on artificial intelligence and related equipment

Country Status (1)

Country Link
CN (1) CN113516205B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113850632A (en) * 2021-11-29 2021-12-28 平安科技(深圳)有限公司 User category determination method, device, equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103645249A (en) * 2013-11-27 2014-03-19 国网黑龙江省电力有限公司 Online fault detection method for reduced set-based downsampling unbalance SVM (Support Vector Machine) transformer
US20160366565A1 (en) * 2015-06-12 2016-12-15 Telefonaktiebolaget L M Ericsson (Publ) Grouping wireless devices in a communications network
CN108509982A (en) * 2018-03-12 2018-09-07 昆明理工大学 A method of the uneven medical data of two classification of processing
CN109087022A (en) * 2018-08-22 2018-12-25 北京三快在线科技有限公司 Analysis method, device, medium and the electronic equipment of user's stability
US20190347277A1 (en) * 2018-05-09 2019-11-14 Kabushiki Kaisha Toshiba Clustering device, clustering method, and computer program product
CN111598168A (en) * 2020-05-18 2020-08-28 腾讯科技(深圳)有限公司 Image classification method, device, computer equipment and medium
CN112801718A (en) * 2021-02-22 2021-05-14 平安科技(深圳)有限公司 User behavior prediction method, device, equipment and medium
CN112966768A (en) * 2021-03-19 2021-06-15 廊坊银行股份有限公司 User data classification method, device, equipment and medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FIRUZ KAMALOV: "Kernel density estimation based sampling for imbalanced class distribution", Information Sciences *
LI JUNLIN et al.: "Improved data classification algorithm based on kernel density estimation", Control and Decision (《控制与决策》) *


Also Published As

Publication number Publication date
CN113516205B (en) 2021-12-14

Similar Documents

Publication Publication Date Title
CN112417096B (en) Question-answer pair matching method, device, electronic equipment and storage medium
CN113656547B (en) Text matching method, device, equipment and storage medium
CN113689436B (en) Image semantic segmentation method, device, equipment and storage medium
CN114037545A (en) Client recommendation method, device, equipment and storage medium
CN114090794A (en) Event map construction method based on artificial intelligence and related equipment
CN115222443A (en) Client group division method, device, equipment and storage medium
CN113570391B (en) Community division method, device, equipment and storage medium based on artificial intelligence
CN114510487A (en) Data table merging method, device, equipment and storage medium
CN113516205B (en) Employee stability classification method based on artificial intelligence and related equipment
CN113705468A (en) Digital image identification method based on artificial intelligence and related equipment
CN113268597A (en) Text classification method, device, equipment and storage medium
CN113850632B (en) User category determination method, device, equipment and storage medium
CN116629423A (en) User behavior prediction method, device, equipment and storage medium
CN113420545B (en) Abstract generation method, device, equipment and storage medium
CN113627186B (en) Entity relation detection method based on artificial intelligence and related equipment
CN113343700B (en) Data processing method, device, equipment and storage medium
CN112949305B (en) Negative feedback information acquisition method, device, equipment and storage medium
CN114529209A (en) User allocation method, device, equipment and storage medium
CN113902302A (en) Data analysis method, device, equipment and storage medium based on artificial intelligence
CN114942749A (en) Development method, device and equipment of approval system and storage medium
CN113065947A (en) Data processing method, device, equipment and storage medium
CN113742455B (en) Resume searching method, device, equipment and storage medium based on artificial intelligence
CN114020687B (en) User retention analysis method, device, equipment and storage medium
CN114581177B (en) Product recommendation method, device, equipment and storage medium
CN113688119B (en) Medical database construction method based on artificial intelligence and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant