CN109409672A - Method and device for modeling the classification and grading of auto repair technicians - Google Patents

Method and device for modeling the classification and grading of auto repair technicians

Info

Publication number
CN109409672A
Authority
CN
China
Prior art keywords
cluster
data
classification
diagnostic
multiple cluster
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811114429.8A
Other languages
Chinese (zh)
Inventor
刘均
潘洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Launch Technology Co Ltd
Original Assignee
Shenzhen Launch Technology Co Ltd
Application filed by Shenzhen Launch Technology Co Ltd
Priority to CN201811114429.8A
Publication of CN109409672A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0639: Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06398: Performance of employee with respect to a job function
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/20: Administration of product repair or maintenance

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Educational Administration (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a method and device for modeling the classification and grading of auto repair technicians. The method includes: obtaining a plurality of diagnostic data items, where the diagnostic data are generated by different automotive technicians while performing repairs; clustering the cleaned diagnostic data to obtain a plurality of cluster sets and their corresponding cluster categories; and training a classification model with a plurality of cluster data items and their corresponding cluster categories to obtain a classification and grading model, where the cluster data items belong to the cluster sets. The process obtains training data through an unsupervised clustering algorithm and then uses that training data to train the classification model, yielding the classification and grading model and thereby enabling a comprehensive, objective, scientific, reasonable, and low-cost evaluation and classification of auto repair technicians.

Description

Method and device for modeling the classification and grading of auto repair technicians
Technical field
This application relates to the field of artificial intelligence and machine learning, and in particular to a method and device for modeling the classification and grading of auto repair technicians.
Background
In the many auto repair shops across the country, auto repair technicians of different skill levels provide fault diagnosis and repair services to customers. Because of differences in region, years in the trade, accumulated experience, and other factors, each technician's service level differs, as do the kinds of repair work each one is good at.
At present, senior technicians who can provide comprehensive, reliable repairs account for only 5% of all auto repair technicians nationwide; the remaining 95% can provide professional inspection and repair services in only one particular area. Classifying and grading auto repair technicians is therefore necessary, on the one hand to offer customers professional, high-quality vehicle inspection and repair services, and on the other hand to facilitate the operational management of technicians and raise their repair skills.
Currently, the classification and grading of auto repair technicians relies mainly on rules drawn up from human experience, which are used to filter out the technician group for each grade. This approach has the following problems:
First, the evaluation of technicians is one-sided. With preset key indicators, a technician's skill level can be measured from only one aspect, the result is influenced by subjective settings, and the technician's ability cannot be assessed fully.
Second, the evaluation is costly. Periodic assessment consumes manpower and material resources, and the evaluation process is complex, expensive, and error-prone.
Third, the distribution of grading results is unreasonable. Rules drawn up from statistical indicators produce technician groups that fit a predetermined distribution, which may be inconsistent with the actual distribution of skill levels, so that technicians are graded too high or too low.
Summary of the invention
The embodiments of the present application provide a method and device for modeling the classification and grading of auto repair technicians, so as to achieve a comprehensive, objective, scientific, reasonable, and low-cost evaluation and classification of auto repair technicians.
In a first aspect, a method for modeling the classification and grading of auto repair technicians is provided, comprising:
obtaining a plurality of diagnostic data items, wherein the diagnostic data are generated by different automotive technicians while performing repairs;
clustering the cleaned diagnostic data to obtain a plurality of cluster sets and the cluster category corresponding to each cluster set, wherein each of the cluster sets includes at least one diagnostic data item;
training a classification model with a plurality of cluster data items and the cluster category corresponding to each cluster data item to obtain a classification and grading model, wherein the cluster data items belong to the cluster sets.
More specifically, after obtaining the plurality of diagnostic data items, the method further includes: preprocessing the diagnostic data and cleaning out abnormal diagnostic data to obtain the cleaned diagnostic data.
More specifically, clustering the cleaned diagnostic data comprises: clustering the cleaned diagnostic data with an unsupervised clustering algorithm;
after clustering the cleaned diagnostic data, the method further includes: performing equal-size random sampling on each of the cluster sets according to a preset sampling proportion and sample size to obtain the plurality of cluster data items.
More specifically, the method further includes: inputting the cleaned diagnostic data into the classification and grading model to obtain a plurality of classification sets;
comparing the classification sets with the cluster sets to obtain the difference rate between the classification sets and the cluster sets;
determining whether the difference rate exceeds a specified threshold; if the difference rate does not exceed the specified threshold, determining that the cluster sets are valid training data.
More specifically, the method further includes: if the difference rate exceeds the specified threshold, adjusting the preset sampling proportion and sample size, and performing equal-size random sampling on each of the cluster sets again to obtain the plurality of cluster data items anew;
alternatively, if the difference rate exceeds the specified threshold, adjusting the preset parameters of the unsupervised clustering algorithm; clustering the cleaned diagnostic data with the unsupervised clustering algorithm to obtain the cluster sets; and performing equal-size random sampling on each of the cluster sets according to the preset sampling proportion and sample size to obtain the plurality of cluster data items anew.
In a second aspect, a device for modeling the classification and grading of auto repair technicians is provided, comprising an acquiring unit, a cluster unit, and a processing unit:
the acquiring unit is configured to obtain a plurality of diagnostic data items, wherein the diagnostic data are generated by different automotive technicians while performing repairs;
the cluster unit is configured to cluster the cleaned diagnostic data to obtain a plurality of cluster sets and the cluster category corresponding to each cluster set, wherein each of the cluster sets includes at least one diagnostic data item;
the processing unit is configured to train a classification model with a plurality of cluster data items and the cluster category corresponding to each cluster data item to obtain a classification and grading model, wherein the cluster data items belong to the cluster sets.
More specifically, the cluster unit is specifically configured to cluster the cleaned diagnostic data with an unsupervised clustering algorithm;
the device further includes a sampling unit configured to, after the cleaned diagnostic data have been clustered, perform equal-size random sampling on each of the cluster sets according to a preset sampling proportion and sample size to obtain the plurality of cluster data items.
More specifically, the device further includes a classification unit, a comparison unit, and a judging unit:
the classification unit is configured to input the cleaned diagnostic data into the classification and grading model to obtain a plurality of classification sets;
the comparison unit is configured to compare the classification sets with the cluster sets to obtain the difference rate between the classification sets and the cluster sets;
the judging unit is configured to determine whether the difference rate exceeds a specified threshold and, if the difference rate does not exceed the specified threshold, to determine that the cluster sets are valid training data.
In a third aspect, a server is provided, including a processor, an input device, an output device, and a memory, wherein the memory is configured to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the method of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided. The storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to execute the method of the first aspect.
Implementing the embodiments of the present application has the following beneficial effects:
In the above scheme, the server obtains a plurality of diagnostic data items, which are generated by different automotive technicians while performing repairs, and then clusters the cleaned diagnostic data to obtain a plurality of cluster sets and the cluster category corresponding to each cluster set. It then trains a classification model with a plurality of cluster data items and their corresponding cluster categories to obtain a classification and grading model, where the cluster data items belong to the cluster sets. The process obtains training data through an unsupervised clustering algorithm and then uses that training data to train the classification model, thereby achieving a comprehensive, objective, scientific, reasonable, and low-cost evaluation and classification of auto repair technicians.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a method for modeling the classification and grading of auto repair technicians provided by the present application;
Fig. 2 is a schematic structural diagram of a device for modeling the classification and grading of auto repair technicians provided by the present application;
Fig. 3 is a schematic block diagram of a device provided by the present application;
Fig. 4 is a schematic diagram of a clustering result provided by the present application.
Detailed description of the embodiments
The embodiments of the present application provide a method and device for modeling the classification and grading of auto repair technicians, which enable a comprehensive, objective, scientific, reasonable, and low-cost evaluation and classification of auto repair technicians.
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of this application.
It should be understood that, when used in this specification and the appended claims, the terms "includes" and "comprising" indicate the presence of the described features, wholes, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or sets thereof.
It should be noted that the terms used in the embodiments of the present application serve only to describe specific embodiments and are not intended to limit the application. The singular forms "a", "said", and "the" used in the embodiments and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
One embodiment of the present application is a method for modeling the classification and grading of auto repair technicians. The method includes: obtaining a plurality of diagnostic data items, wherein the diagnostic data are generated by different automotive technicians while performing repairs; clustering the cleaned diagnostic data to obtain a plurality of cluster sets and the cluster category corresponding to each cluster set, wherein each of the cluster sets includes at least one diagnostic data item; and training a classification model with a plurality of cluster data items and their corresponding cluster categories to obtain a classification and grading model, wherein the cluster data items belong to the cluster sets.
Referring first to Fig. 1, Fig. 1 is a schematic flowchart of a method for modeling the classification and grading of auto repair technicians provided by one embodiment of the present application. As shown in Fig. 1, the method provided by this embodiment may include:
101. Obtain a plurality of diagnostic data items.
In a specific implementation, before the diagnostic data are obtained, the method includes: receiving diagnosis reports from auto repair technicians, which the technicians upload through terminal devices. The content of a diagnosis report includes: technician number, diagnosis time, geographical location (longitude and latitude), fault code, and repair status. Feature mining is then performed on the diagnosis reports to extract key information with the auto repair technician as the subject. The key information includes one or more of: number of repairs, vehicle models repaired, repair position, repair time, fault resolution, the technician's years of service, and the technician's activity level. The technician's behavior is quantified on this basis to build a profile of the auto repair technician. The diagnosis report content and the key information may include more or fewer data items; those listed here are examples only and do not constitute a specific limitation.
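The feature-mining step can be pictured as a per-technician aggregation of the uploaded reports. The following is a minimal sketch only: the field names (`technician_id`, `vehicle_model`, `fault_code`, `resolved`) and the four derived features are hypothetical stand-ins chosen for illustration, not a schema given in the application.

```python
from collections import defaultdict


def build_profiles(reports):
    """Aggregate raw diagnosis reports into one feature record per technician.

    All field names are hypothetical; the application lists the kinds of
    information a report contains but does not define a schema.
    """
    grouped = defaultdict(list)
    for report in reports:
        grouped[report["technician_id"]].append(report)

    profiles = {}
    for tech_id, items in grouped.items():
        profiles[tech_id] = {
            # Number of repairs handled by this technician.
            "repair_count": len(items),
            # Breadth of experience: distinct vehicle models and fault codes.
            "distinct_models": len({r["vehicle_model"] for r in items}),
            "distinct_fault_codes": len({r["fault_code"] for r in items}),
            # Share of reports in which the fault was marked as resolved.
            "resolved_rate": sum(1 for r in items if r["resolved"]) / len(items),
        }
    return profiles


reports = [
    {"technician_id": "T1", "vehicle_model": "A", "fault_code": "P0300", "resolved": True},
    {"technician_id": "T1", "vehicle_model": "B", "fault_code": "P0171", "resolved": False},
    {"technician_id": "T2", "vehicle_model": "A", "fault_code": "P0300", "resolved": True},
]
profiles = build_profiles(reports)
```

Each resulting record is a numeric description of one technician, which is the form the later clustering step needs.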
102. Cluster the cleaned diagnostic data to obtain a plurality of cluster sets and the cluster category corresponding to each cluster set.
In a specific implementation, after the diagnostic data are obtained, the method further includes: preprocessing the diagnostic data and cleaning out abnormal diagnostic data to obtain the cleaned diagnostic data. Data cleaning is the process of re-examining and verifying data; its purpose is to delete duplicate information, correct existing errors, and ensure data consistency. The data cleaning steps include: removing or completing records with missing values, removing or correcting records with format or content errors, removing or correcting records with logical errors, removing redundant records, and so on. The data cleaning may include more or fewer steps; those listed here are examples only and do not constitute a specific limitation.
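Two of the cleaning steps named above, dropping records with missing values and dropping duplicates, might look as follows. This is a sketch under assumptions: the required field names are hypothetical, and real cleaning rules (format, content, and logic checks) would depend on the actual report schema.

```python
def clean_records(records, required=("technician_id", "diagnosis_time", "fault_code")):
    """Drop records with missing required fields and exact duplicates.

    The field names in `required` are hypothetical examples.
    """
    seen = set()
    cleaned = []
    for rec in records:
        # Remove records that lack a value for any required field.
        if any(rec.get(field) in (None, "") for field in required):
            continue
        # Normalise string formatting so trivially different copies compare equal.
        normalised = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        # Remove redundant records: keep only the first copy of each record.
        key = tuple(sorted(normalised.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(normalised)
    return cleaned


raw = [
    {"technician_id": "T1", "diagnosis_time": "2018-09-01T10:00", "fault_code": "P0300"},
    {"technician_id": "T1 ", "diagnosis_time": "2018-09-01T10:00", "fault_code": "P0300"},
    {"technician_id": "T2", "diagnosis_time": "", "fault_code": "P0171"},
]
cleaned = clean_records(raw)
```

Here the second record is a duplicate once whitespace is stripped, and the third is dropped because its diagnosis time is missing.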
In a specific implementation, clustering the cleaned diagnostic data includes: clustering the cleaned diagnostic data with an unsupervised clustering algorithm. Unsupervised clustering is one of the important techniques in data mining. It is used to discover unknown categories among data objects: a large data set without labels is divided into multiple categories according to the inherent similarity of the data, so that data within a category are highly similar and data in different categories are less similar. Unsupervised clustering algorithms fall mainly into five classes: partition-based, grid-based, density-based, hierarchical, and model-based clustering algorithms. Specific examples include the K-means algorithm, Gaussian mixture models, and hierarchical clustering. Taking K-means as an example, the idea is to split the data into multiple clusters, each cluster being one category. Each cluster has a cluster center, which is the mean of all data in that cluster, and the distance from every point in a cluster to its own cluster center is smaller than its distance to the center of any other cluster. The clustering result of the K-means algorithm is shown in Fig. 4. The unsupervised clustering algorithms mentioned here are examples only and do not constitute a specific limitation.
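The K-means idea described above (assign each point to its nearest center, then move each center to the mean of its members) can be sketched in plain Python. This is an illustrative implementation of the standard Lloyd iteration, not code from the application; the function name, the two-dimensional toy points, and the choice of `k = 2` are assumptions.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Cluster `points` (tuples of floats) into `k` groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    # Initialise the centers with k distinct data points chosen at random.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest center.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center becomes the mean of its current members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return labels, centers


# Two well-separated toy groups standing in for technician feature vectors.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(points, k=2)
```

The returned `labels` play the role of the cluster categories, and the points sharing a label form one cluster set.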
In a specific implementation, after the cleaned diagnostic data have been clustered, the method further includes: performing equal-size random sampling on each of the cluster sets according to a preset sampling proportion and sample size to obtain the plurality of cluster data items. For example, suppose the diagnostic data are divided into 5 cluster sets, namely class I, class II, class III, class IV, and class V, where class I includes 20000 cluster data items, class II includes 10000, class III includes 10000, class IV includes 5000, and class V includes 5000. Random sampling of each cluster set at a sampling proportion of 2% yields 400 samples for class I, 200 for class II, 200 for class III, 100 for class IV, and 100 for class V. As a further optimization, to prevent the trained model from being biased toward a particular category, every category should have the same number of samples. The smallest sample count is therefore taken as the standard, and the surplus samples of any category exceeding that count are removed, so that classes I, II, III, IV, and V each end up with 100 samples. The random sampling described here is an example only and does not constitute a specific limitation.
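The worked example above can be reproduced with a short sketch. The function name and the use of integers as placeholder data items are assumptions; the cluster sizes and the 2% proportion come from the example in the text.

```python
import random


def balanced_sample(clusters, ratio, seed=0):
    """Sample each cluster at `ratio`, then trim every sample to the smallest size."""
    rng = random.Random(seed)
    # Step 1: proportional random sampling inside each cluster set.
    drawn = {
        name: rng.sample(items, round(len(items) * ratio))
        for name, items in clusters.items()
    }
    # Step 2: take the smallest sample count as the standard and randomly
    # discard the surplus in larger samples so that all classes are equal.
    n_min = min(len(sample) for sample in drawn.values())
    return {name: rng.sample(sample, n_min) for name, sample in drawn.items()}


sizes = {"I": 20000, "II": 10000, "III": 10000, "IV": 5000, "V": 5000}
clusters = {name: list(range(n)) for name, n in sizes.items()}  # placeholder items
samples = balanced_sample(clusters, ratio=0.02)
```

With these inputs the first step draws 400, 200, 200, 100, and 100 items, and the second step trims every class to 100.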
103. Train a classification model with a plurality of cluster data items and the cluster category corresponding to each cluster data item to obtain a classification and grading model.
In a specific implementation, the classification model may be extremely randomized trees (ExtraTrees). An ExtraTrees model consists of many decision trees, and every decision tree uses the same full set of training samples. When a new sample arrives, each decision tree judges separately which category the new sample belongs to, and the trees then vote; the category with the most votes is the final classification result. The classification model may also be a random forest (Random Forest), logistic regression (Logistic Regression), or another classification algorithm. The classification models mentioned above are examples only and do not constitute a specific limitation.
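The voting scheme described above can be illustrated with a deliberately simplified ensemble. In this sketch each "tree" is only a depth-1 stump that picks a random feature and a random cut point and is fitted on the full training set; it is a stand-in for real extremely randomized trees, which grow deeper and choose among several random candidate splits. All names and the toy data are assumptions.

```python
import random
from collections import Counter


def fit_random_stump(X, y, rng):
    """Fit a depth-1 tree: random feature, random threshold, majority label per side."""
    feature = rng.randrange(len(X[0]))
    values = [row[feature] for row in X]
    threshold = rng.uniform(min(values), max(values))
    left = [label for row, label in zip(X, y) if row[feature] <= threshold]
    right = [label for row, label in zip(X, y) if row[feature] > threshold]
    fallback = Counter(y).most_common(1)[0][0]  # used when one side is empty
    left_label = Counter(left).most_common(1)[0][0] if left else fallback
    right_label = Counter(right).most_common(1)[0][0] if right else fallback
    return feature, threshold, left_label, right_label


def fit_ensemble(X, y, n_trees=25, seed=0):
    """Every stump sees the same full training set; only the splits are random."""
    rng = random.Random(seed)
    return [fit_random_stump(X, y, rng) for _ in range(n_trees)]


def predict(ensemble, row):
    """Let every stump vote and return the category with the most votes."""
    votes = Counter(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in ensemble
    )
    return votes.most_common(1)[0][0]


# Toy training data: cluster data items with their cluster categories as labels.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 0.8), (0.8, 0.9)]
y = ["I", "I", "I", "II", "II", "II"]
ensemble = fit_ensemble(X, y)
```

In practice a library implementation of extremely randomized trees would normally be used instead; the sketch only shows how identical training data, randomized splits, and majority voting fit together.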
In a specific implementation, the method further includes: inputting the cleaned diagnostic data into the classification and grading model to obtain a plurality of classification sets; comparing the classification sets with the cluster sets to obtain the difference rate between the classification sets and the cluster sets; and determining whether the difference rate exceeds a specified threshold. If the difference rate does not exceed the specified threshold, the cluster sets are determined to be valid training data. Specifically, if 2% of the clustered data are extracted as training samples for the classification model, the remaining 98% of the clustered data serve as the test set. The test set is input into the classification and grading model; if the difference rate between the resulting classification sets and the cluster sets is less than the specified threshold, the cluster sets are valid training data. The difference rate is the number of samples whose classification result differs from their clustering result, divided by the total number of samples. The above is an example only and does not constitute a specific limitation.
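The validity check reduces to counting how often the classifier and the clustering disagree. The sketch below computes that difference rate (disagreeing samples divided by total samples) and compares it with a threshold; the function names, the toy labels, and the threshold value of 0.25 are assumptions for illustration.

```python
def difference_rate(cluster_labels, predicted_labels):
    """Share of samples whose predicted class differs from their cluster category."""
    if len(cluster_labels) != len(predicted_labels):
        raise ValueError("both label sequences must have the same length")
    if not cluster_labels:
        raise ValueError("at least one sample is required")
    mismatches = sum(1 for c, p in zip(cluster_labels, predicted_labels) if c != p)
    return mismatches / len(cluster_labels)


def is_valid_training_data(rate, threshold):
    """The cluster sets count as valid when the rate does not exceed the threshold."""
    return rate <= threshold


cluster_labels = ["I", "I", "II", "III", "III"]    # from the clustering step
predicted_labels = ["I", "II", "II", "III", "III"]  # from the trained model
rate = difference_rate(cluster_labels, predicted_labels)
valid = is_valid_training_data(rate, threshold=0.25)  # hypothetical threshold
```

One of the five samples disagrees, so the rate is 1/5 and the toy data pass the hypothetical threshold.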
In a specific implementation, the method further includes: if the difference rate exceeds the specified threshold, adjusting the preset sampling proportion and sample size, and performing equal-size random sampling on each of the cluster sets again to obtain the plurality of cluster data items anew. Alternatively, if the difference rate exceeds the specified threshold, the preset parameters of the unsupervised clustering algorithm are adjusted; the cleaned diagnostic data are clustered with the unsupervised clustering algorithm to obtain the cluster sets; and equal-size random sampling is performed on each of the cluster sets according to the preset sampling proportion and sample size to obtain the plurality of cluster data items anew.
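The two remedies described above suggest a retry loop around the whole pipeline. The sketch below is one possible arrangement, not the application's procedure: alternating between the two remedies, doubling the sampling proportion, incrementing the cluster count `k`, and capping the number of rounds are all assumptions, and `run_pipeline` is a hypothetical callable standing in for clustering, sampling, training, and evaluation.

```python
def build_model(data, run_pipeline, threshold, initial_ratio, initial_k, max_rounds=5):
    """Re-run the pipeline with adjusted settings until the difference rate is acceptable.

    `run_pipeline(data, k=..., ratio=...)` is a hypothetical callable that
    clusters, samples, trains, and returns (model, difference_rate).
    """
    ratio, k = initial_ratio, initial_k
    for round_no in range(max_rounds):
        model, rate = run_pipeline(data, k=k, ratio=ratio)
        if rate <= threshold:
            return model, {"k": k, "ratio": ratio, "rate": rate}
        if round_no % 2 == 0:
            # Remedy 1: adjust the sampling proportion and re-sample.
            ratio = min(1.0, ratio * 2)
        else:
            # Remedy 2: adjust a clustering parameter and re-cluster.
            k += 1
    raise RuntimeError("difference rate stayed above the threshold")


def fake_pipeline(data, k, ratio):
    """Stand-in pipeline: pretends larger samples give a lower difference rate."""
    return "model", (0.3 if ratio < 0.05 else 0.05)


model, settings = build_model(
    data=None, run_pipeline=fake_pipeline, threshold=0.1, initial_ratio=0.02, initial_k=5
)
```

A real system would choose its own adjustment policy; the point is only that training is repeated until the classifier reproduces the clustering closely enough.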
In the embodiments of the present application, the server obtains a plurality of diagnostic data items, which are generated by different automotive technicians while performing repairs, and then clusters the cleaned diagnostic data to obtain a plurality of cluster sets and their corresponding cluster categories. It then trains a classification model with a plurality of cluster data items and their corresponding cluster categories to obtain a classification and grading model, where the cluster data items belong to the cluster sets. The process obtains training data through an unsupervised clustering algorithm and then uses that training data to train the classification model, thereby achieving a comprehensive, objective, scientific, reasonable, and low-cost evaluation and classification of auto repair technicians.
The embodiment of the present application also provides a kind of auto repair technician classification grading model building device, referring to Fig. 2, Fig. 2 is this Apply for a kind of structural schematic diagram for auto repair technician classification grading model building device that embodiment provides, described device 200 includes: Acquiring unit 201, cluster cell 202, processing unit 203.
The acquiring unit 201, for obtaining multiple diagnostic datas, wherein the multiple diagnostic data is different vapour Driving skills teacher repair the data of generation.
In a concrete implementation mode, before the multiple diagnostic datas of acquisition, the receiving unit is also used to connect The diagnosis report of auto repair technician is received, the diagnosis report is that auto repair technician is uploaded by terminal device, wherein institute Stating diagnosis report content includes: technician's number, Diagnostic Time, geographical location (longitude and latitude), error code, maintenance situation.Again to institute It states diagnosis report and carries out feature mining, be the theme with auto repair technician and excavate key message, wherein the key message packet Include: maintenance number, maintenance vehicle, maintenance position, repair time, trouble shooting, technician's length of service and technician's liveness feature its One or more of, data quantization is realized to the behavior of auto repair technician accordingly, constructs the auto repair technician Portrait.Here, the diagnosis report content and the key message can also include more or less data item, herein It is used only for illustrating, specific restriction should not be constituted.
The cluster cell 202, for being clustered to the diagnostic data after multiple cleanings, to obtain multiple cluster sets It closes and the multiple cluster gathers corresponding cluster classification, wherein each cluster set in the multiple cluster set Including at least one diagnostic data.
In a concrete implementation mode, described device further includes pretreatment unit, and the pretreatment unit is used for, in institute It states after obtaining multiple diagnostic datas, data prediction is carried out to the multiple diagnostic data, cleans abnormity diagnosed data, thus Diagnostic data after obtaining multiple cleanings, wherein data cleansing is the process that data are examined and verified again, and purpose exists The mistake existing for deletion duplicate message, correction, and data consistency is provided.The data cleansing step includes: removal or benefit There are the data of missing, the data of removal or modification format and content mistake, removal or the data of modification logic error, removal superfluous entirely Remainder evidence etc., it should be appreciated that the data cleansing step may include more or less steps, be used only for lifting here Example, should not constitute specific restriction.
In a concrete implementation mode, described device also cluster cell, the cluster cell is specifically used for: using no prison It superintends and directs clustering algorithm to cluster the diagnostic data after the multiple cleaning, wherein the Unsupervised clustering is data mining neck Important one of technology in domain, for finding classification unknown in data object, to the data set of a large amount of unknown marks, by data Inherent similitude data set is divided into multiple classifications, make that the data similarity in classification is larger and the data between classification are similar It spends smaller.The Unsupervised clustering algorithm mainly divides five classes: clustering algorithm, network-based clustering algorithm, base based on division Clustering algorithm in density, the clustering algorithm based on level and the clustering algorithm based on model.Specifically, described unsupervised poly- Class algorithm includes: K-means algorithm, gauss hybrid models, hierarchical clustering etc., here with K-means algorithm as an example into One step introduction, K-means algorithm idea are to split data into multiple heaps, and each heap is one kind, and each heap has in a cluster The heart, and this cluster centre is the mean value of all data in this class, in this heap all the points to the heap cluster centre away from With a distance from the cluster centre for both less than arriving other heaps, the Clustering Effect of K-means algorithm is as shown in Figure 4.It should be understood that the nothing Supervision clustering algorithm is only served in citing, should not constitute specific restriction.
In a concrete implementation mode, described device further includes sampling unit 204, and the sampling unit is used for described After being clustered to the diagnostic data after multiple cleanings, according to preset sampling proportion and sample size to the multiple cluster Set carries out equivalent random sampling respectively, to obtain the multiple cluster data.For example, if the diagnostic data is drawn It is divided into 5 cluster set, is I class, II class, Group III, IV class and V class respectively, the I class includes 20000 cluster datas, The II class includes 10000 cluster datas, and the Group III includes 10000 cluster datas, and the IV class includes 5000 poly- Class data, the V class include 5000 cluster datas, gather according to 2% sampling proportion each cluster and carry out random sampling, Obtain 100 400 samples of I class, 200 samples of II class, 200 samples of Group III, 100 samples of VI class and V class samples. It advanced optimizing, trains the model come in order to prevent and tend to a certain classification, the sample size of every one kind should keep identical, Therefore it needs to be subject to minimum sample size, the sample for the classification for being more than the minimum sample size is deleted again, Final I class, II class, Group III, VI class, V class are all 100 sample sizes.It should be understood that the example of random sampling here is only used In citing, specific restriction should not be constituted.
The processing unit 203 is configured to train a classification model using multiple cluster data items and the cluster categories respectively corresponding to the multiple cluster data items, thereby obtaining a classification and grading model, where the multiple cluster data items belong to the multiple cluster sets.
In a specific implementation, the classification model may be extremely randomized trees (Extremely Randomized Trees, ExtraTrees). Extremely randomized trees consist of many decision trees, and every decision tree is trained on the same complete set of training samples. When a new sample arrives, each decision tree judges separately which category the new sample belongs to, the trees then vote, and the category with the most votes is the final classification result. The classification model may also be the random forest algorithm (Random Forest), the logistic regression algorithm (Logistic Regression) or another classification algorithm. The above classification models are only examples and should not constitute a specific limitation.
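The voting step described above may be sketched as follows. This is not an implementation of extremely randomized trees: the three "trees" are hand-written stand-ins, and the feature names `first_fix_rate` and `avg_repair_minutes` are hypothetical. The sketch only shows how the individual judgments are combined into one result.

```python
from collections import Counter


def majority_vote(trees, sample):
    """Let every tree classify `sample` and return the category with the most votes."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained decision trees: each maps a sample to a category.
trees = [
    lambda s: "I" if s["first_fix_rate"] > 0.8 else "II",
    lambda s: "I" if s["avg_repair_minutes"] < 60 else "II",
    lambda s: "I" if s["first_fix_rate"] > 0.9 else "II",
]
new_sample = {"first_fix_rate": 0.85, "avg_repair_minutes": 45}
result = majority_vote(trees, new_sample)
print(result)  # I  (two votes for class I, one vote for class II)
```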
In a specific implementation, the device further includes a classification unit 205, a comparison unit 206 and a judging unit 207. The classification unit is configured to input the cleaned diagnostic data items into the classification and grading model to obtain multiple classification sets. The comparison unit is configured to compare the multiple classification sets with the multiple cluster sets to obtain the difference rate between them. The judging unit is configured to determine whether the difference rate exceeds a specified threshold and, if the difference rate does not exceed the specified threshold, to determine that the multiple cluster sets are valid training data. Specifically, if 2% of the clustered data is extracted as training samples for the classification model, the remaining 98% of the clustered data serves as the test set of the classification model. The test set is input into the classification and grading model; if the difference rate between the resulting classification sets and the cluster sets is less than the specified threshold, the cluster sets are valid training data. Here, the difference rate is the number of samples whose classification result differs from their clustering result, divided by the total number of samples. It should be understood that the above example is only illustrative and should not constitute a specific limitation.
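The difference rate defined above, and the threshold check applied to it, may be sketched as follows. The function names, the sample labels and the threshold value 0.25 are assumptions made for illustration; the application does not specify a threshold value.

```python
def difference_rate(cluster_labels, predicted_labels):
    """Share of samples whose predicted category differs from their cluster category."""
    if len(cluster_labels) != len(predicted_labels):
        raise ValueError("both label lists must describe the same samples")
    if not cluster_labels:
        raise ValueError("at least one sample is required")
    differing = sum(1 for c, p in zip(cluster_labels, predicted_labels) if c != p)
    return differing / len(cluster_labels)


def is_valid_training_data(cluster_labels, predicted_labels, threshold):
    """The cluster sets count as valid when the difference rate does not exceed the threshold."""
    return difference_rate(cluster_labels, predicted_labels) <= threshold


cluster_labels = ["I", "I", "II", "III", "II"]      # categories assigned by clustering
predicted_labels = ["I", "II", "II", "III", "II"]   # categories output by the model
rate = difference_rate(cluster_labels, predicted_labels)
print(rate)  # 0.2  (1 differing sample out of 5)
print(is_valid_training_data(cluster_labels, predicted_labels, threshold=0.25))  # True
```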
In a specific implementation, if the difference rate exceeds the specified threshold, the preset sampling ratio and sample size are adjusted, and equal-size random sampling is performed again on each of the multiple cluster sets to obtain the multiple cluster data items anew. Alternatively, if the difference rate exceeds the specified threshold, the preset parameters of the unsupervised clustering algorithm are adjusted; the cleaned diagnostic data items are clustered again using the unsupervised clustering algorithm to obtain the multiple cluster sets; and equal-size random sampling is performed on each of the multiple cluster sets according to the preset sampling ratio and sample size to obtain the multiple cluster data items anew.
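One possible way to organize the two adjustments described above is a simple search loop, sketched below. The search order (all sampling ratios are tried before the clustering parameter is changed), the use of the number of clusters as the adjusted clustering parameter, the callable `evaluate` and the table of difference rates are all assumptions made for this sketch; the application only states that either adjustment may be made when the threshold is exceeded.

```python
def tune_until_valid(evaluate, sampling_ratios, cluster_counts, threshold):
    """Return the first (ratio, k, rate) whose difference rate does not exceed the threshold.

    `evaluate(ratio, k)` is a hypothetical callable that re-clusters into k clusters,
    re-samples at `ratio`, retrains the classifier and returns the difference rate.
    """
    for k in cluster_counts:           # adjust the clustering parameter
        for ratio in sampling_ratios:  # adjust the sampling ratio
            rate = evaluate(ratio, k)
            if rate <= threshold:
                return ratio, k, rate
    return None  # no tried combination produced valid training data


# Made-up difference rates standing in for real clustering and training runs.
fake_rates = {(0.02, 5): 0.30, (0.05, 5): 0.22, (0.02, 4): 0.08, (0.05, 4): 0.05}
chosen = tune_until_valid(
    evaluate=lambda ratio, k: fake_rates[(ratio, k)],
    sampling_ratios=[0.02, 0.05],
    cluster_counts=[5, 4],
    threshold=0.1,
)
print(chosen)  # (0.02, 4, 0.08)
```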
In the embodiments of the present application, the server obtains multiple diagnostic data items, which are data generated when different automobile technicians perform repairs; clusters the cleaned diagnostic data items to obtain multiple cluster sets and the corresponding cluster categories; and then trains a classification model using multiple cluster data items and the corresponding cluster categories to obtain a classification and grading model, where the multiple cluster data items belong to the multiple cluster sets. This process obtains training data through an unsupervised clustering algorithm and then uses that training data to train the classification model into a classification and grading model, thereby achieving a comprehensive, objective, scientific, reasonable and low-cost evaluation and grading of auto repair technicians.
Referring to Fig. 3, Fig. 3 shows a device provided by an embodiment of the present application. The device may be a server. As shown in Fig. 3, the device includes one or more processors 301, one or more input devices 302, one or more output devices 303 and a memory 304. The processor 301, the input device 302, the output device 303 and the memory 304 are connected by a bus 305. The memory 304 is used to store instructions, and the processor 301 is used to execute the instructions stored in the memory 304.
When the device is used as a server, the processor 301 is configured to: obtain multiple diagnostic data items, where the multiple diagnostic data items are data generated when different automobile technicians perform repairs; cluster the cleaned diagnostic data items to obtain multiple cluster sets and the cluster categories respectively corresponding to the multiple cluster sets, where each of the multiple cluster sets includes at least one diagnostic data item; and train a classification model using multiple cluster data items and the cluster categories respectively corresponding to the multiple cluster data items to obtain a classification and grading model, where the multiple cluster data items belong to the multiple cluster sets.
It should be understood that, in the embodiments of the present application, the processor 301 may be a central processing unit (Central Processing Unit, CPU). The processor may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The input device 302 may include a touchpad, a fingerprint sensor (for collecting the user's fingerprint information and fingerprint direction information), a microphone and the like, and the output device 303 may include a display (LCD or the like), a speaker and the like.
The memory 304 may include a read-only memory and a random access memory, and provides instructions and data to the processor 301. A part of the memory 304 may also include a non-volatile random access memory. For example, the memory 304 may also store information about the device type.
In a specific implementation, the processor 301, the input device 302 and the output device 303 described in the embodiments of the present application can execute the implementations described in the first embodiment and the second embodiment of the auto repair technician classification and grading modeling method and device provided by the embodiments of the present application, and can also execute the implementation of the terminal described in the embodiments of the present application, which is not repeated here.
Another embodiment of the present application provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, implement: obtaining multiple diagnostic data items; clustering the cleaned diagnostic data items to obtain multiple cluster sets and the corresponding cluster categories; and training a classification model using multiple cluster data items and the corresponding cluster categories to obtain a classification and grading model, where the multiple cluster data items belong to the multiple cluster sets.
The computer-readable storage medium may be an internal storage unit of the terminal described in any of the foregoing embodiments, such as a hard disk or memory of the terminal. The computer-readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card or a flash card (Flash Card) equipped on the terminal. Further, the computer-readable storage medium may include both an internal storage unit of the terminal and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the terminal. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
Those of ordinary skill in the art may be aware that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms according to function. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the server, device and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed server, device and method may be implemented in other ways. For example, the server embodiments described above are only illustrative. For example, the division of the units is only a division by logical function, and there may be other ways of dividing them in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electrical, mechanical or other form of connection.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present application.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and these modifications or replacements shall all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

  1. An auto repair technician classification and grading modeling method, characterized by comprising:
    obtaining multiple diagnostic data items, wherein the multiple diagnostic data items are data generated when different automobile technicians perform repairs;
    clustering multiple cleaned diagnostic data items to obtain multiple cluster sets and the cluster categories respectively corresponding to the multiple cluster sets, wherein each of the multiple cluster sets includes at least one diagnostic data item;
    training a classification model using multiple cluster data items and the cluster categories respectively corresponding to the multiple cluster data items to obtain a classification and grading model, wherein the multiple cluster data items belong to the multiple cluster sets.
  2. The method according to claim 1, characterized in that, after the obtaining of multiple diagnostic data items, the method further comprises:
    performing data preprocessing on the multiple diagnostic data items and cleaning out abnormal diagnostic data to obtain the multiple cleaned diagnostic data items.
  3. The method according to claim 1, characterized in that the clustering of multiple cleaned diagnostic data items comprises:
    clustering the multiple cleaned diagnostic data items using an unsupervised clustering algorithm;
    and in that, after the clustering of multiple cleaned diagnostic data items, the method further comprises:
    performing equal-size random sampling on each of the multiple cluster sets according to a preset sampling ratio and sample size to obtain the multiple cluster data items.
  4. The method according to claim 3, characterized in that the method further comprises:
    inputting the multiple cleaned diagnostic data items into the classification and grading model to obtain multiple classification sets;
    comparing the multiple classification sets with the multiple cluster sets to obtain a difference rate between the multiple classification sets and the multiple cluster sets;
    determining whether the difference rate exceeds a specified threshold;
    if the difference rate does not exceed the specified threshold, determining that the multiple cluster sets are valid training data.
  5. The method according to claim 4, characterized in that the method further comprises:
    if the difference rate exceeds the specified threshold, adjusting the preset sampling ratio and sample size, and performing equal-size random sampling on each of the multiple cluster sets to obtain the multiple cluster data items anew; or,
    if the difference rate exceeds the specified threshold, adjusting preset parameters of the unsupervised clustering algorithm;
    clustering the multiple cleaned diagnostic data items using the unsupervised clustering algorithm to obtain the multiple cluster sets;
    performing equal-size random sampling on each of the multiple cluster sets according to the preset sampling ratio and sample size to obtain the multiple cluster data items anew.
  6. An auto repair technician classification and grading modeling device, characterized by comprising an acquiring unit, a clustering unit and a processing unit, wherein:
    the acquiring unit is configured to obtain multiple diagnostic data items, wherein the multiple diagnostic data items are data generated when different automobile technicians perform repairs;
    the clustering unit is configured to cluster multiple cleaned diagnostic data items to obtain multiple cluster sets and the cluster categories respectively corresponding to the multiple cluster sets, wherein each of the multiple cluster sets includes at least one diagnostic data item;
    the processing unit is configured to train a classification model using multiple cluster data items and the cluster categories respectively corresponding to the multiple cluster data items to obtain a classification and grading model, wherein the multiple cluster data items belong to the multiple cluster sets.
  7. The device according to claim 6, characterized in that the clustering unit is specifically configured to:
    cluster the multiple cleaned diagnostic data items using an unsupervised clustering algorithm;
    the device further comprises a sampling unit, and the sampling unit is configured to, after the multiple cleaned diagnostic data items are clustered, perform equal-size random sampling on each of the multiple cluster sets according to a preset sampling ratio and sample size to obtain the multiple cluster data items.
  8. The device according to claim 7, characterized in that the device further comprises a classification unit, a comparison unit and a judging unit, wherein:
    the classification unit is configured to input the multiple cleaned diagnostic data items into the classification and grading model to obtain multiple classification sets;
    the comparison unit is configured to compare the multiple classification sets with the multiple cluster sets to obtain a difference rate between the multiple classification sets and the multiple cluster sets;
    the judging unit is configured to determine whether the difference rate exceeds a specified threshold and, if the difference rate does not exceed the specified threshold, to determine that the multiple cluster sets are valid training data.
  9. A server, characterized by comprising a processor, an input device, an output device and a memory, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the method according to any one of claims 1 to 5.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to execute the method according to any one of claims 1 to 5.
CN201811114429.8A 2018-09-25 2018-09-25 A kind of auto repair technician classifies grading modeling method and device Pending CN109409672A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811114429.8A CN109409672A (en) 2018-09-25 2018-09-25 A kind of auto repair technician classifies grading modeling method and device

Publications (1)

Publication Number Publication Date
CN109409672A true CN109409672A (en) 2019-03-01

Family

ID=65465932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811114429.8A Pending CN109409672A (en) 2018-09-25 2018-09-25 A kind of auto repair technician classifies grading modeling method and device

Country Status (1)

Country Link
CN (1) CN109409672A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107169518A (en) * 2017-05-18 2017-09-15 北京京东金融科技控股有限公司 Data classification method, device, electronic installation and computer-readable medium
CN107480696A (en) * 2017-07-12 2017-12-15 深圳信息职业技术学院 A kind of disaggregated model construction method, device and terminal device
CN107633265A (en) * 2017-09-04 2018-01-26 深圳市华傲数据技术有限公司 For optimizing the data processing method and device of credit evaluation model
CN108304427A (en) * 2017-04-28 2018-07-20 腾讯科技(深圳)有限公司 A kind of user visitor's heap sort method and apparatus

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111669353A (en) * 2019-03-08 2020-09-15 顺丰科技有限公司 Phishing website detection method and system
CN110175195A (en) * 2019-04-23 2019-08-27 哈尔滨工业大学 Mixed gas detection model construction method based on extreme random tree
CN110175195B (en) * 2019-04-23 2022-11-29 哈尔滨工业大学 Mixed gas detection model construction method based on extreme random tree
CN110363307A (en) * 2019-06-03 2019-10-22 阿里巴巴集团控股有限公司 Auto repair maintenance establishment ranking method and device
CN110276013A (en) * 2019-06-27 2019-09-24 深圳市元征科技股份有限公司 A kind of recommended method of maintenance technician, device and storage medium
CN110414866A (en) * 2019-08-07 2019-11-05 云南电网有限责任公司信息中心 Attend a banquet capability assessment method and device based on decision Tree algorithms
CN113112160A (en) * 2021-04-16 2021-07-13 深圳市轱辘车联数据技术有限公司 Diagnostic data processing method, diagnostic data processing device and electronic equipment
CN113112160B (en) * 2021-04-16 2024-04-30 深圳市轱辘车联数据技术有限公司 Diagnostic data processing method, diagnostic data processing device and electronic equipment
CN113076697A (en) * 2021-04-20 2021-07-06 潍柴动力股份有限公司 Typical driving condition construction method, related device and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190301