WO2020015140A1 - Passenger rating model generation method and apparatus, and computer device and storage medium - Google Patents

Info

Publication number
WO2020015140A1
WO2020015140A1 PCT/CN2018/106083 CN2018106083W WO2020015140A1 WO 2020015140 A1 WO2020015140 A1 WO 2020015140A1 CN 2018106083 W CN2018106083 W CN 2018106083W WO 2020015140 A1 WO2020015140 A1 WO 2020015140A1
Authority
WO
WIPO (PCT)
Prior art keywords
rating model
passenger
rating
feature
place
Prior art date
Application number
PCT/CN2018/106083
Other languages
French (fr)
Chinese (zh)
Inventor
孙闳绅
金戈
徐亮
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020015140A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services

Definitions

  • the present application relates to a method, a device, a computer device, and a storage medium for generating a passenger rating model.
  • a method, an apparatus, a computer device, and a storage medium for generating a passenger rating model are provided.
  • a method for generating a passenger rating model includes:
  • the obtained place rating models are combined to obtain a passenger rating model.
  • a passenger rating model generating device includes:
  • a historical clearance record acquisition module configured to obtain a historical clearance record, which carries a risk passenger label or an ordinary passenger label
  • a grouping module configured to group the historical clearance records according to the clearance locations
  • Place rating model generation module which is used to train the historical clearance records after grouping to obtain corresponding place rating models
  • the passenger rating model generating module is configured to combine the obtained place rating models to obtain a passenger rating model.
  • a computer device includes a memory and one or more processors.
  • the memory stores computer-readable instructions.
  • when the computer-readable instructions are executed by the one or more processors, the one or more processors perform the following steps:
  • the obtained place rating models are combined to obtain a passenger rating model.
  • one or more non-volatile computer-readable storage media storing computer-readable instructions are provided.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
  • the obtained place rating models are combined to obtain a passenger rating model.
  • FIG. 1 is an application scenario diagram of a method for generating a passenger rating model according to one or more embodiments.
  • FIG. 2 is a schematic flowchart of a method for generating a passenger rating model according to one or more embodiments.
  • FIG. 3 is a block diagram of a passenger rating model generating device according to one or more embodiments.
  • FIG. 4 is a block diagram of a computer device in accordance with one or more embodiments.
  • the method for generating a passenger rating model provided in this application can be applied to the application environment shown in FIG. 1.
  • the server 102 communicates with the database 104 through a network.
  • the server 102 can read the historical clearance records from the database 104.
  • the historical clearance records carry risk passenger labels or ordinary passenger labels.
  • the server first groups the historical clearance records according to the clearance locations, and then trains each group of historical clearance records separately to obtain the corresponding place rating model.
  • finally, the obtained place rating models are combined to obtain the passenger rating model, so that the passenger rating model covers a wide range of clearance locations, thereby improving the accuracy of the passenger rating.
  • the server 102 may be implemented by an independent server or a server cluster composed of multiple servers.
  • a method for generating a passenger rating model is provided.
  • taking the application of the method to the server in FIG. 1 as an example, the method includes the following steps:
  • the historical clearance record carries a risk passenger label or an ordinary passenger label.
  • the historical customs clearance record is a record generated by previous passengers during customs clearance.
  • the clearance record contains several basic fields, including name, age, gender, and customs clearance time.
  • the clearance records may include the clearance records of uninspected passengers as well as those of inspected passengers.
  • the clearance record of an inspected passenger is associated with the inspection result, which indicates whether the inspected passenger is a risk passenger.
  • for convenience, the clearance records of uninspected passengers, and of inspected passengers whose inspection result is ordinary passenger, carry ordinary passenger labels.
  • the clearance records of inspected passengers whose inspection result is risk passenger carry risk passenger labels.
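
The labeling rule described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation; the function name `assign_label` and the label strings `"risk"` and `"ordinary"` are assumptions introduced here.

```python
def assign_label(inspected: bool, found_risky: bool) -> str:
    """Return the passenger label carried by a clearance record.

    Only a passenger who was inspected AND whose inspection result is
    'risk passenger' gets the risk label; uninspected passengers and
    inspected passengers with an ordinary result get the ordinary label.
    """
    if inspected and found_risky:
        return "risk"
    return "ordinary"


# Hypothetical examples covering the three cases in the text.
label_uninspected = assign_label(inspected=False, found_risky=False)
label_inspected_ok = assign_label(inspected=True, found_risky=False)
label_inspected_risk = assign_label(inspected=True, found_risky=True)
```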
  • the server can obtain all historical clearance records from the database, or for convenience, the server can also cache some historical clearance records locally to facilitate model training.
  • S204 Group the historical clearance records according to the clearance locations.
  • the historical clearance records may be grouped according to the clearance locations for convenience.
  • the server may first detect the clearance location field of the historical clearance record, and then group the obtained historical clearance records through the clearance location field.
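
A minimal sketch of the grouping step, assuming each historical clearance record is a dictionary and that the clearance location is stored under a field named `clearance_location` (both are assumptions made for illustration; the source does not specify a record format):

```python
from collections import defaultdict


def group_by_location(records, location_field="clearance_location"):
    """Group clearance records by the value of their clearance location field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[location_field]].append(record)
    return dict(groups)


# Hypothetical records; names and locations are placeholders.
records = [
    {"name": "P1", "clearance_location": "A", "label": "ordinary"},
    {"name": "P2", "clearance_location": "B", "label": "risk"},
    {"name": "P3", "clearance_location": "A", "label": "risk"},
]
groups = group_by_location(records)
```

Each group can then be passed to a separate training run, one per clearance location.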
  • the grouped historical clearance records may be separately trained to obtain a place rating model corresponding to the clearance locations.
  • each place rating model is obtained by using a decision tree model, and can predict the risk level of passengers at the corresponding clearance location.
  • the server combines the trained place rating models to obtain a passenger rating model.
  • different place rating models can be assigned different weights, and the passenger rating model is then obtained from the weights and the place rating models.
  • the assigned weights can be configured according to the proportion of historical clearance records carrying risk passenger labels among the historical clearance records corresponding to each place rating model; for example, the larger this proportion is for a given clearance location, the larger the weight of the corresponding place rating model.
  • the server may obtain the proportion corresponding to each clearance location, and then normalize the proportions to obtain the final weights.
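
The proportion-based weighting above can be sketched as follows. The record layout, the `label` field, and the fallback to equal weights when no location has any risk record are all assumptions of this sketch; the source only states that the proportions are normalized.

```python
def risk_proportion_weights(groups, label_field="label", risk_label="risk"):
    """Compute one weight per clearance location from its share of risk-labelled records."""
    proportions = {}
    for location, records in groups.items():
        risky = sum(1 for r in records if r[label_field] == risk_label)
        proportions[location] = risky / len(records) if records else 0.0

    total = sum(proportions.values())
    if total == 0:
        # Assumption: with no risk records anywhere, fall back to equal weights.
        return {loc: 1.0 / len(proportions) for loc in proportions}
    # Normalize so that the weights sum to 1.
    return {loc: p / total for loc, p in proportions.items()}


# Hypothetical grouped records: location A has 1 of 4 risk records, B has 1 of 2.
groups = {
    "A": [{"label": "risk"}, {"label": "ordinary"},
          {"label": "ordinary"}, {"label": "ordinary"}],
    "B": [{"label": "risk"}, {"label": "ordinary"}],
}
weights = risk_proportion_weights(groups)
```

Here location B, with the larger risk proportion, receives the larger weight.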
  • combining the obtained place rating models to obtain a passenger rating model may include: receiving an input weight allocation instruction for the place rating models; obtaining the weights of the place rating models according to the weight allocation instruction; and calculating the passenger rating model from the place rating models and the corresponding weights according to a weighted average method.
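
One way to read the weighted average combination is sketched below. It assumes, purely for illustration, that each place rating model is a callable returning a numeric risk score; the source describes the place models as decision trees predicting a risk level and does not specify how levels are averaged.

```python
def combine_place_models(place_models, weights):
    """Return a passenger rating model that is the weighted average of the place models."""
    total = sum(weights[loc] for loc in place_models)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    def passenger_rating_model(features):
        weighted = sum(weights[loc] * model(features)
                       for loc, model in place_models.items())
        return weighted / total

    return passenger_rating_model


# Toy place models that ignore their input and return fixed scores (hypothetical).
place_models = {"A": lambda features: 0.8, "B": lambda features: 0.2}
weights = {"A": 3.0, "B": 1.0}  # e.g. taken from a user's weight allocation instruction

passenger_model = combine_place_models(place_models, weights)
score = passenger_model({"age": 30})
```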
  • training the grouped historical clearance records to obtain a corresponding place rating model may include: obtaining a current business rule, and querying the supplementary feature parameters corresponding to the current business rule; calculating the supplementary feature parameters from the original feature parameters in the grouped clearance records; and training on the supplementary feature parameters and the original feature parameters to obtain the corresponding place rating model.
  • the combination of the obtained place rating models to obtain a passenger rating model may be intervened by the user.
  • the server may output each place rating model and display the clearance points corresponding to each place rating model.
  • the user may assign corresponding weights to the place rating models as needed. For example, if the clearance place is a first place, the user can set the weight of the place rating model corresponding to the first place to be relatively large, set the weight of the place rating model of a second place associated with the first place to be relatively large as well, and set the weights of the place rating models of other places to be relatively small.
  • the second place associated with the first place may be a second place in which there is a business connection with the first place.
  • the current business rule can be obtained and the supplementary feature parameters corresponding to the current business rule can be queried, so that the supplementary feature parameters can be calculated according to the original feature parameters in the grouped first clearance record.
  • the current business rules include peer-type features, inspection-type features, clearance information-type features, frequency static features, and frequency dynamic features; thus, the generated supplementary feature parameters may include the following.
  • 30 peer-type supplementary feature parameters, such as "15 days ago, the number of peers in 7 days", "the average number of peers per person in 30 days before 15 days", and so on.
  • 20 inspection-type supplementary feature parameters, such as "hits in 30 days, hits in 30 days", "in 30 days, the number of days between the last customs clearance and inspection time", and so on.
  • 23 clearance information-type supplementary feature parameters, such as "15 days ago, the latest customs clearance time", "30 days ago, the number of customs clearance within 30 days", and so on.
  • 14 frequency static supplementary feature parameters, such as "the number of customs clearances within 15 days and within 30 days", "the number of customs clearance days within 30 days and within 7 days", and so on.
  • for example, when the current business rule obtained by the server is a peer rule, the number of peers of each passenger within a preset number of days is obtained from the clearance records, and this peer-count feature is used as a supplementary feature parameter.
  • the clearance information-type features, frequency static features, and frequency dynamic features are generated in the same way, with the corresponding logic derived from the business rules.
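
As an illustration of deriving a supplementary feature from original fields, the sketch below computes a frequency-type feature (the number of clearances within a preset number of days) from a passenger's clearance dates. The function name, the window semantics (the `days` days strictly before the reference date), and the sample dates are assumptions of this sketch.

```python
from datetime import date, timedelta


def clearance_count_within(clearance_dates, reference_date, days):
    """Count clearances in the `days` days before `reference_date` (reference day excluded)."""
    start = reference_date - timedelta(days=days)
    return sum(1 for d in clearance_dates if start <= d < reference_date)


# Hypothetical clearance history of one passenger (original feature: clearance time).
history = [date(2018, 7, 1), date(2018, 9, 1), date(2018, 9, 10), date(2018, 9, 14)]
today = date(2018, 9, 15)

supplementary = {
    "clearances_in_7_days": clearance_count_within(history, today, 7),
    "clearances_in_30_days": clearance_count_within(history, today, 30),
}
```

A peer-count feature would follow the same pattern once a rule for identifying peers is defined, which the text above does not spell out.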
  • the server trains the supplementary feature parameters and the original feature parameters to obtain a place rating model.
  • the server may combine the supplementary feature parameters and the original feature parameters to train several candidate place rating models, and verify each of them with verification historical clearance records.
  • the success rate of a place rating model is the number of verification historical clearance records for which the risk level obtained by the model is the same as the passenger's risk level in the record, divided by the total number of verification historical clearance records.
  • the server selects the place rating model with the highest success rate as the place rating model for the clearance location, and uses the features corresponding to that model as the training features.
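
The success-rate comparison can be sketched as follows, assuming each candidate model is a callable that maps a feature dictionary to a risk level and that verification records are `(features, recorded_level)` pairs. These shapes and the toy threshold models are hypothetical.

```python
def success_rate(model, verification_records):
    """Fraction of verification records whose predicted level equals the recorded level."""
    hits = sum(1 for features, level in verification_records
               if model(features) == level)
    return hits / len(verification_records)


def select_best_model(candidates, verification_records):
    """Return the name of the candidate model with the highest success rate."""
    return max(candidates,
               key=lambda name: success_rate(candidates[name], verification_records))


# Two toy candidates built on different thresholds of a hypothetical feature "n".
candidates = {
    "model_a": lambda f: "high" if f["n"] >= 2 else "low",
    "model_b": lambda f: "high" if f["n"] >= 4 else "low",
}
verification = [({"n": 5}, "high"), ({"n": 1}, "low"),
                ({"n": 3}, "high"), ({"n": 0}, "low")]

rate_a = success_rate(candidates["model_a"], verification)
rate_b = success_rate(candidates["model_b"], verification)
best = select_best_model(candidates, verification)
```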
  • a plurality of supplementary feature parameters are first generated according to the historical clearance records, so that the generated place rating model is more accurate.
  • users can configure the weights of rating models for various locations as required, so that the generated passenger rating model covers a wide range, which can improve the accuracy of passenger ratings.
  • training the grouped historical clearance records to obtain a corresponding place rating model may include: dividing the grouped historical clearance records into training set data and test set data; extracting a first feature parameter from the training set data, performing feature gain evaluation according to the first feature parameter, and extracting target features from the first feature parameter according to the result of the feature gain evaluation; classifying the training set data according to the extracted target features and the passenger labels corresponding to the training set data to obtain an initial rating model, and calculating the rating level of each classification node in the initial rating model; extracting a second feature parameter corresponding to the selected target features from the test set data; verifying the rating level of each node in the initial rating model by means of the second feature parameter and the passenger labels corresponding to the test set data to obtain a first verification result; and adjusting the initial rating model according to the first verification result to obtain a place rating model.
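
The first step of this procedure, dividing one location's records into training set data and test set data, might look like the sketch below. The 70/30 ratio, the shuffling, and the fixed seed are assumptions; the source does not state how the division is made.

```python
import random


def split_records(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and split it into (training set, test set)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder records: integers stand in for clearance records of one location.
records = list(range(10))
train_set, test_set = split_records(records)
```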
  • the method for generating a passenger rating model may further include: loading updated historical clearance records when the update time of the historical clearance records is reached; extracting a third feature parameter corresponding to the place rating model from the updated historical clearance records; verifying the rating level of each node in the place rating model based on the third feature parameter and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result; and optimizing the place rating model based on the second verification result.
  • clearance point A is taken as an example for description.
  • the historical clearance records corresponding to clearance point A are divided into training set data and test set data, and the first feature parameter and a first target category (corresponding to the passenger rating level) are extracted from the training set data.
  • feature information gain evaluation is performed according to the first feature parameter, and feature selection is performed according to the evaluation result, that is, the target features are selected.
  • the training set data is then classified according to the extracted target features and the passenger labels corresponding to the training set data to obtain an initial rating model, and the rating level of each node in the initial decision tree rating model is calculated.
  • the second feature parameter and a second target category are extracted from the test set data, and the rating level of each node in the initial decision tree rating model is verified according to the second feature parameter and the second target category.
  • according to the first verification result, the decision tree structure in the initial rating model is optimized and adjusted to generate the place rating model.
  • the first feature parameter and the second feature parameter in this embodiment are the same as the supplementary feature parameters and the
  • a decision tree is a tree structure composed of nodes and directed edges, used for classifying instances.
  • there are two types of nodes: internal nodes and leaf nodes.
  • the internal nodes represent test conditions on features or attributes.
  • the leaf nodes represent classifications.
  • the specific method of using the decision tree model for classification is: starting from the root node, test a certain feature of the instance, and assign the instance to one of the child nodes according to the test result.
  • the test is then applied recursively at each child node with its own test condition until a leaf node is reached.
  • the class of that leaf node is the final classification result.
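
The classification walk described above can be sketched with a nested-dictionary tree. The tree layout (`feature`, `children`, `label` keys) and the feature names are hypothetical choices for this illustration.

```python
def classify(node, instance):
    """Walk from the root to a leaf, following the branch that matches the instance."""
    while "label" not in node:              # internal node: apply its test condition
        value = instance[node["feature"]]
        node = node["children"][value]      # descend to the matching child node
    return node["label"]                    # leaf node: its class is the result


# Toy tree: internal nodes test a feature, leaf nodes carry a rating level.
tree = {
    "feature": "inspected_before",
    "children": {
        0: {"label": "low"},
        1: {
            "feature": "frequent_traveller",
            "children": {0: {"label": "high"}, 1: {"label": "medium"}},
        },
    },
}

result = classify(tree, {"inspected_before": 1, "frequent_traveller": 0})
```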
  • the decision tree model uses the ID3 algorithm. Based on the principle that a smaller decision tree is better than a larger one, features are evaluated and selected according to information gain: each time, the feature with the largest information gain is selected as the judgment condition to build a child node.
  • the information gain indicates the degree to which the uncertainty of the information of class Y is reduced by knowing the information of feature X.
  • the information gain g(D, A) of feature A on the training data set D is defined as the difference between the empirical entropy H(D) of set D and the empirical conditional entropy H(D|A) of D given feature A: $g(D, A) = H(D) - H(D \mid A)$.
  • g(D, A) is the information gain of feature A on training data set D.
  • H(D) is the empirical entropy of training data set D.
  • H(D|A) is the empirical conditional entropy of feature A on data set D.
  • the feature selection method is to calculate the information gain of each feature of the training data set (or subset) and select the feature with the largest information gain.
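  • the algorithm for calculating the information gain is as follows: its input is the training data set D and feature A, and its output is the information gain g(D, A) of feature A on training data set D.
  • first, the empirical entropy of D is calculated as $H(D) = -\sum_{k=1}^{K} \frac{C_k}{|D|} \log \frac{C_k}{|D|}$, where |D| is the number of samples in D, C_k is the number of samples belonging to the k-th category of the first target category, and K is the number of categories of the first target category.
  • the first target category is divided into the rating levels of the passenger.
  • next, the empirical conditional entropy is calculated as $H(D \mid A) = \sum_{i \in \mathrm{value}(A)} \frac{|D_i|}{|D|} H(D_i)$, where value(A) is the set of all values of feature A, i is one value of feature A, and D_i is the subset of samples in training data set D for which feature A takes the value i; the information gain is then $g(D, A) = H(D) - H(D \mid A)$.
  • for example, for the gender feature parameter, the values of feature A are male and female, which can be represented by 0 and 1, so value(A) is (0, 1).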
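
A minimal sketch of the information gain calculation, following the standard definition g(D, A) = H(D) - H(D|A). The base-2 logarithm, the dictionary-based rows, and the toy features `gender` and `hit` are assumptions of this sketch.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Empirical entropy H(D) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """g(D, A) = H(D) - H(D|A) for feature A = `feature`."""
    total = len(rows)
    conditional = 0.0
    for value in {row[feature] for row in rows}:            # value(A)
        subset = [lab for row, lab in zip(rows, labels)     # labels of D_i
                  if row[feature] == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional


# Toy data: "hit" separates the classes perfectly, "gender" not at all.
rows = [{"gender": 0, "hit": 1}, {"gender": 1, "hit": 1},
        {"gender": 0, "hit": 0}, {"gender": 1, "hit": 0}]
labels = ["risk", "risk", "ordinary", "ordinary"]

gain_gender = information_gain(rows, labels, "gender")
gain_hit = information_gain(rows, labels, "hit")
```

On this data the feature `hit` would be selected, since it has the larger information gain.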
  • the server extracts the second feature parameter and the second target category one by one from each sample of the test set data.
  • the second feature parameter is the same as the category of the first feature parameter.
  • the second feature parameter may be the feature with the largest information gain selected, that is, the target feature, which is not described herein again.
  • the second target category is the category of the security inspection results, and the second target category is the passenger rating level.
  • the selected feature parameters can be adjusted, for example by adjusting statistics, to reconstruct the decision tree of the preset rule model, and verification is performed again until the first verification result is within the error range.
  • obtaining the most discriminative feature (field) according to the result of the feature information evaluation may include: calculating the information gain of each feature in the first feature parameter; selecting the feature with the largest information gain as the judgment condition to build a child node; and dividing the training set data into subsets according to the values of the selected feature, and recursively branching on the subsets until the data corresponding to every branch node belongs to the same target category.
  • in this way, a decision tree is built recursively.
  • the recursive definition of Hunt's algorithm is as follows, where Dt denotes the set of training records associated with node t: if all records in Dt belong to the same class yt, then t is a leaf node labeled yt. If Dt contains records belonging to multiple classes, then an attribute test condition is selected to divide the records into smaller subsets. For each outcome of the test condition, a child node is created and the records in Dt are distributed to the child nodes based on the test results. The algorithm is then called recursively for each child node.
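
The recursive construction can be sketched as below, combining Hunt's recursion with ID3-style selection by information gain. The nested-dictionary tree format, the majority-vote fallback when no features remain, and the toy data are assumptions of this sketch rather than details from the source.

```python
from collections import Counter
from math import log2


def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())


def gain(rows, labels, feature):
    """Information gain of `feature` on the records (rows, labels)."""
    n = len(rows)
    cond = 0.0
    for v in {r[feature] for r in rows}:
        sub = [lab for r, lab in zip(rows, labels) if r[feature] == v]
        cond += len(sub) / n * entropy(sub)
    return entropy(labels) - cond


def build_tree(rows, labels, features):
    """Hunt-style recursion: stop on pure nodes, otherwise split on the best feature."""
    if len(set(labels)) == 1:                 # all records in Dt share one class
        return {"label": labels[0]}
    if not features:                          # assumption: majority vote when stuck
        return {"label": Counter(labels).most_common(1)[0][0]}
    best = max(features, key=lambda f: gain(rows, labels, f))
    remaining = [f for f in features if f != best]
    children = {}
    for v in {r[best] for r in rows}:         # one child per outcome of the test
        idx = [i for i, r in enumerate(rows) if r[best] == v]
        children[v] = build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], remaining)
    return {"feature": best, "children": children}


rows = [{"gender": 0, "hit": 1}, {"gender": 1, "hit": 1},
        {"gender": 0, "hit": 0}, {"gender": 1, "hit": 0}]
labels = ["risk", "risk", "ordinary", "ordinary"]
tree = build_tree(rows, labels, ["gender", "hit"])
```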
  • the server counts, in the test data set, the negative sample data matching the feature parameter combination corresponding to each classification node in the preset rule model, calculates the proportion of the counted negative sample data in the total negative sample data of the test data set, and verifies the decision tree based on the calculated proportion.
  • the server may set a preset tolerance error. When the calculated absolute difference value is less than the preset tolerance error, the verification passes, and when the calculated absolute difference value is greater than the preset tolerance error, the verification fails.
  • the server can add the sample data in the test data set to the training data set, expand the sample capacity to train the preset rule model, and adjust the preset rule model.
  • after generating the place rating models according to the decision tree model, the server generates the passenger rating model from the place rating models, so that the passenger rating model can be put into use; optionally, to ensure the correctness of the passenger rating model, the server can set the historical clearance record update time in advance, and the update time can be the time at which the security inspection place updates its clearance records.
  • the server loads the updated historical clearance record.
  • the historical clearance record contains several basic fields, including name, age, gender, and customs clearance time.
  • the security terminal can actively or passively send the updated historical clearance records to the server.
  • the server extracts the third characteristic parameter and passenger risk level from the historical clearance records.
  • the third feature parameter corresponds to the feature set of the place rating models from which the passenger rating model is generated, that is, it corresponds to the target features, and the passenger risk level is the mark of the security inspection result.
  • the server counts, according to the third feature parameter and passenger risk level in the historical clearance records, the negative sample data matching the feature parameter combination corresponding to each classification node in each place rating model, calculates the proportion of the counted negative sample data in the total negative sample data of the historical clearance records, and verifies the passenger level of each classification node in each place rating model according to the calculated proportion.
  • the server can set a preset deviation. When the absolute difference between the calculated proportion and the negative sample proportion of the corresponding passenger level is less than the preset deviation, the verification passes; when this absolute difference is greater than the preset deviation, the verification fails.
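
The pass/fail check for one classification node can be sketched as follows. The function and parameter names are hypothetical, and treating an absolute difference exactly equal to the preset deviation as a failure is an assumption, since the text only covers the "less than" and "greater than" cases.

```python
def verify_node(matched_negative, total_negative, expected_proportion, preset_deviation):
    """Check one classification node against the negative sample proportion of its level.

    matched_negative:    negative samples matching the node's feature combination
    total_negative:      all negative samples in the updated records
    expected_proportion: negative sample proportion associated with the node's level
    """
    proportion = matched_negative / total_negative
    # Assumption: a difference exactly equal to the deviation counts as a failure.
    return abs(proportion - expected_proportion) < preset_deviation


# Hypothetical numbers: 12 of 100 negatives match a node whose level expects 10%.
passes = verify_node(12, 100, expected_proportion=0.10, preset_deviation=0.05)
fails = verify_node(30, 100, expected_proportion=0.10, preset_deviation=0.05)
```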
  • the server can continue to train and adjust the passenger rating model with historical clearance records, so as to continuously optimize the place rating models based on the historical clearance records and thereby further optimize the passenger rating model, so that, through training on big data, the passenger ratings obtained from the passenger rating model become more and more accurate.
  • a decision tree model is used to generate a place rating model, which improves accuracy.
  • although the steps in the flowchart of FIG. 2 are displayed sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated in this document, the execution of these steps is not strictly limited in order, and these steps can be performed in other orders. Moreover, at least a part of the steps in FIG. 2 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time, but may be performed at different times. The execution order of these sub-steps or stages is not necessarily sequential either; they may be performed in turn or alternately with other steps, or with at least a part of the sub-steps or stages of other steps.
  • a passenger rating model generating device is provided, which includes a historical clearance record acquisition module 100, a grouping module 200, a place rating model generating module 300, and a passenger rating model generating module 400, wherein:
  • the historical clearance record acquisition module 100 is configured to obtain a historical clearance record.
  • the historical clearance record carries a risk passenger label or an ordinary passenger label.
  • a grouping module 200 is configured to group the historical clearance records according to the clearance locations.
  • the location rating model generating module 300 is configured to train the historical clearance records after grouping to obtain corresponding location rating models.
  • a passenger rating model generating module 400 configured to combine the obtained place rating models to obtain a passenger rating model.
  • the passenger rating model generation module 400 includes:
  • the receiving unit is configured to receive an input weight allocation instruction for a place rating model.
  • the weight obtaining unit is configured to obtain the weight of the place rating model according to the weight allocation instruction.
  • a first calculating unit configured to calculate the passenger rating model from the place rating models and the corresponding weights according to the weighted average method.
  • the place rating model generation module 300 includes:
  • a business rule obtaining unit is configured to obtain a current business rule and query a supplementary characteristic parameter corresponding to the current business rule.
  • a second calculation unit is configured to calculate supplementary feature parameters according to the original feature parameters in the first pass record after grouping.
  • a place rating model generating unit configured to train the supplementary feature parameters and the original feature parameters to obtain a corresponding place rating model.
  • the place rating model generation module 300 includes:
  • a dividing unit configured to divide the grouped historical clearance records into training set data and test set data.
  • a target feature extraction unit is configured to extract a first feature parameter from the training set data, perform a feature gain evaluation according to the first feature parameter, and extract a target feature from the first feature parameter according to a result of the feature gain evaluation.
  • An initial rating model generating unit is configured to classify the training set data according to the extracted target features and the passenger tags corresponding to the corresponding training set data to obtain an initial rating model, and calculate a rating level of each classification node in the initial rating model.
  • a feature extraction unit is configured to extract a second feature parameter corresponding to the selected target feature from the test set data.
  • the verification unit is configured to verify the rating level of each node in the initial rating model by using the second feature parameter and the passenger label corresponding to the test set data to obtain a first verification result.
  • an adjusting unit configured to adjust the initial rating model according to the first verification result to obtain a location rating model.
  • the apparatus further includes:
  • a loading module is configured to load an updated historical clearance record when the update time of the historical clearance record is reached.
  • a feature extraction module is configured to extract a third feature parameter corresponding to the place rating model from the updated historical clearance record.
  • the verification module is configured to verify the rating level of each node in the place rating model according to the third characteristic parameter and the passenger tag corresponding to the updated historical clearance record to obtain a second verification result.
  • Each module in the above-mentioned passenger rating model generating device may be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in, or independent of, the processor in the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call them and perform the operations corresponding to the above modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 4.
  • the computer device includes a processor, a memory, a network interface, and a database connected through a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer-readable instructions, and a database.
  • the internal memory provides an environment for operating the operating system and computer-readable instructions in a non-volatile storage medium.
  • the computer equipment database is used to store historical clearance records.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by a processor to implement a passenger rating model generation method.
  • FIG. 4 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied.
  • the specific computer equipment may include more or fewer parts than shown in the figure, or combine certain parts, or have a different arrangement of parts.
  • a computer device includes a memory and one or more processors.
  • computer-readable instructions are stored in the memory, and when the computer-readable instructions are executed by the one or more processors, the one or more processors perform the following steps: obtaining historical clearance records, where the historical clearance records carry risk passenger labels or ordinary passenger labels; grouping the historical clearance records according to clearance locations; training the grouped historical clearance records separately to obtain corresponding place rating models; and combining the obtained place rating models to obtain a passenger rating model.
  • the combining of the obtained place rating models to obtain the passenger rating model, implemented when the processor executes the computer-readable instructions, may include: receiving an input weight allocation instruction for the place rating models; obtaining the weights of the place rating models according to the weight allocation instruction; and calculating the passenger rating model from the place rating models and the corresponding weights according to the weighted average method.
  • the training of the grouped historical clearance records to obtain the corresponding place rating model, implemented when the processor executes the computer-readable instructions, may include: obtaining a current business rule, and querying the supplementary feature parameters corresponding to the current business rule; calculating the supplementary feature parameters according to the original feature parameters in the grouped clearance records; and training on the supplementary feature parameters and the original feature parameters to obtain the corresponding place rating model.
  • Training on the grouped historical clearance records to obtain the corresponding location rating model, as implemented when the processor executes the computer-readable instructions, may include: dividing the grouped historical clearance records into training set data and test set data; extracting first feature parameters from the training set data, performing feature gain evaluation based on the first feature parameters, and extracting target features from the first feature parameters according to the feature gain evaluation result;
  • classifying the training set data according to the extracted target features and the passenger labels corresponding to the training set data to obtain an initial rating model, and calculating the rating level of each classification node in the initial rating model; extracting second feature parameters corresponding to the selected target features from the test set data;
  • verifying the rating level of each node in the initial rating model using the second feature parameters and the passenger labels corresponding to the test set data to obtain a first verification result; and adjusting the initial rating model according to the first verification result to obtain the location rating model.
  • When the processor executes the computer-readable instructions, the following steps are further implemented: when the update time of the historical clearance records is reached, loading the updated historical clearance records; extracting third feature parameters corresponding to the location rating model from the updated historical clearance records; verifying the rating level of each node in the location rating model according to the third feature parameters and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result; and optimizing the location rating model according to the second verification result.
  • One or more non-volatile computer-readable storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps: obtaining historical clearance records, each carrying a risky passenger label or an ordinary passenger label; grouping the historical clearance records by clearance location; training on each group of historical clearance records separately to obtain a corresponding location rating model; and combining the obtained location rating models to obtain a passenger rating model.
  • Combining the obtained location rating models to obtain the passenger rating model, as implemented when the computer-readable instructions are executed by the processor, may include: receiving an input weight allocation instruction for the location rating models; obtaining the weight of each location rating model according to the weight allocation instruction; and calculating the passenger rating model as a weighted average of the location rating models and their corresponding weights.
  • Training on the grouped historical clearance records to obtain the corresponding location rating model, as implemented when the computer-readable instructions are executed by the processor, may include: obtaining a current business rule and querying the supplementary feature parameters corresponding to the current business rule; calculating the supplementary feature parameters from the original feature parameters in the grouped first clearance records; and training on the supplementary feature parameters and the original feature parameters to obtain the corresponding location rating model.
  • Training on the grouped historical clearance records to obtain the corresponding location rating model, as implemented when the computer-readable instructions are executed by the processor, may include: dividing the grouped historical clearance records into training set data and test set data; extracting first feature parameters from the training set data, performing feature gain evaluation based on the first feature parameters, and extracting target features from the first feature parameters according to the feature gain evaluation result;
  • classifying the training set data according to the extracted target features and the passenger labels corresponding to the training set data to obtain an initial rating model, and calculating the rating level of each classification node in the initial rating model; extracting second feature parameters corresponding to the selected target features from the test set data;
  • verifying the rating level of each node in the initial rating model using the second feature parameters and the passenger labels corresponding to the test set data to obtain a first verification result; and adjusting the initial rating model according to the first verification result to obtain the location rating model.
  • When the computer-readable instructions are executed by the processor, the following steps are further implemented: when the update time of the historical clearance records is reached, loading the updated historical clearance records; extracting third feature parameters corresponding to the location rating model from the updated historical clearance records;
  • verifying the rating level of each node in the location rating model according to the third feature parameters and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result; and optimizing the location rating model according to the second verification result.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Disclosed is a passenger rating model generation method, comprising: acquiring historical records of going through customs, wherein the historical records of going through customs carry a risky passenger tag or an ordinary passenger tag (S202); dividing the historical records of going through customs into groups according to locations where the behaviors of going through customs occur (S204); respectively training the grouped historical records of going through customs to obtain corresponding location rating models (S206); and combining the obtained location rating models to obtain a passenger rating model (S208).

Description

Passenger rating model generation method, apparatus, computer device, and storage medium
Cross-reference to related applications
This application claims priority to Chinese patent application No. 2018107884177, filed with the Chinese Patent Office on July 18, 2018 and entitled "Passenger Rating Model Generation Method, Apparatus, Computer Device, and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to a passenger rating model generation method, apparatus, computer device, and storage medium.
Background
Large numbers of passengers pass through airports, ports, and other entry and exit points every day, and some of them are engaged in illegal activity such as smuggling or illegal border crossing. Security scenarios currently generate a large volume of passage records, and a small fraction of the people passing through are high-risk.
However, the inventors realized that security personnel can currently only spot-check clearing passengers based on experience, or apply simple rules for a preliminary screening of large amounts of data. Neither approach can rate clearing passengers accurately.
Summary of the invention
According to various embodiments disclosed in the present application, a passenger rating model generation method, apparatus, computer device, and storage medium are provided.
A passenger rating model generation method includes:
obtaining historical clearance records, the historical clearance records carrying a risky passenger label or an ordinary passenger label;
grouping the historical clearance records by clearance location;
training on the grouped historical clearance records separately to obtain corresponding location rating models; and
combining the obtained location rating models to obtain a passenger rating model.
A passenger rating model generation apparatus includes:
a historical clearance record acquisition module, configured to obtain historical clearance records, the historical clearance records carrying a risky passenger label or an ordinary passenger label;
a grouping module, configured to group the historical clearance records by clearance location;
a location rating model generation module, configured to train on the grouped historical clearance records separately to obtain corresponding location rating models; and
a passenger rating model generation module, configured to combine the obtained location rating models to obtain a passenger rating model.
A computer device includes a memory and one or more processors. The memory stores computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
obtaining historical clearance records, the historical clearance records carrying a risky passenger label or an ordinary passenger label;
grouping the historical clearance records by clearance location;
training on the grouped historical clearance records separately to obtain corresponding location rating models; and
combining the obtained location rating models to obtain a passenger rating model.
One or more non-volatile computer-readable storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
obtaining historical clearance records, the historical clearance records carrying a risky passenger label or an ordinary passenger label;
grouping the historical clearance records by clearance location;
training on the grouped historical clearance records separately to obtain corresponding location rating models; and
combining the obtained location rating models to obtain a passenger rating model.
Details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the application will become apparent from the description, the drawings, and the claims.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is an application scenario diagram of a passenger rating model generation method according to one or more embodiments.
FIG. 2 is a schematic flowchart of a passenger rating model generation method according to one or more embodiments.
FIG. 3 is a block diagram of a passenger rating model generation apparatus according to one or more embodiments.
FIG. 4 is a block diagram of a computer device according to one or more embodiments.
Detailed description
To make the technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the application and not to limit it.
The passenger rating model generation method provided in this application can be applied in the application environment shown in FIG. 1. The server 102 communicates with the database 104 through a network. The server 102 can read historical clearance records from the database 104, each carrying a risky passenger label or an ordinary passenger label. The server first groups the historical clearance records by clearance location, then trains on each group separately to obtain the corresponding location rating model, and finally combines the obtained location rating models into a passenger rating model. Because the passenger rating model draws on many locations, it covers a wide range, which can improve the accuracy of passenger rating. The server 102 may be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in FIG. 2, a passenger rating model generation method is provided. The method is described using its application to the server in FIG. 1 as an example, and includes the following steps:
S202: Obtain historical clearance records, the historical clearance records carrying a risky passenger label or an ordinary passenger label.
Specifically, a historical clearance record is a record generated when a passenger cleared customs in the past. The record contains several basic fields, including name, age, gender, and clearance time. Historical clearance records may include records of passengers who were not inspected and records of passengers who were inspected; for an inspected passenger, the record indicates whether the passenger was found to be a risky passenger. For convenience, records of passengers who were not inspected, and of passengers who were inspected and found to be ordinary, carry the ordinary passenger label, while records of passengers who were inspected and found to be risky carry the risky passenger label.
The server may obtain all historical clearance records from the database, or, for convenience, the server may cache some historical clearance records locally to facilitate model training.
S204: Group the historical clearance records by clearance location.
Specifically, because the database stores a large number of historical clearance records, the records may be grouped by clearance location for convenience. For example, the server may first read the clearance location field of each historical clearance record, and then group the obtained records by that field.
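The grouping in S204 can be sketched as follows. This is a minimal illustration, not the applicant's implementation; the record layout (a list of dicts) and the field names `clearance_location` and `label` are assumptions made for the example.

```python
from collections import defaultdict


def group_by_location(records, location_field="clearance_location"):
    """Group clearance records (dicts) by the value of their location field.

    The field name is a hypothetical placeholder; the source only says that
    each record has a clearance location field.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[location_field]].append(record)
    return dict(groups)


# Hypothetical sample records for illustration only.
records = [
    {"passenger": "p1", "clearance_location": "A", "label": "ordinary"},
    {"passenger": "p2", "clearance_location": "B", "label": "risky"},
    {"passenger": "p3", "clearance_location": "A", "label": "risky"},
]
grouped = group_by_location(records)
```

Each value in `grouped` is then the training input for one location rating model.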
S206: Train on the grouped historical clearance records separately to obtain corresponding location rating models.
Specifically, after the historical clearance records are grouped by clearance location, each group may be trained on separately to obtain the location rating model for that clearance location. Each location rating model is built with a decision tree model and can predict a passenger's risk level at that clearance location.
S208: Combine the obtained location rating models to obtain a passenger rating model.
Specifically, after training the location rating models, the server combines them to obtain the passenger rating model. For example, different weights may be assigned to different location rating models, and the passenger rating model is then obtained from the weights and the location rating models. The weights may be configured according to the proportion of records carrying the risky passenger label among the historical clearance records of each location rating model; for example, when this proportion is larger for a given clearance location, the weight of that location is larger. Optionally, the server may obtain the proportion for each clearance location and normalize these proportions to produce the final weights.
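One way to derive weights from the share of risky records per location is sketched below. The normalization by the sum of the proportions and the fallback to equal weights when no risky records exist are assumptions; the source only states that the proportions are normalized. The field names are hypothetical.

```python
def risk_proportion_weights(grouped_records, label_field="label", risky_value="risky"):
    """Return one weight per location, proportional to its share of risky records.

    Normalizing by the sum of proportions is an assumed choice; the source
    does not specify the normalization.
    """
    proportions = {}
    for location, records in grouped_records.items():
        risky = sum(1 for r in records if r[label_field] == risky_value)
        proportions[location] = risky / len(records) if records else 0.0
    total = sum(proportions.values())
    if total == 0:
        # Assumed fallback: no risky records anywhere, so use equal weights.
        return {loc: 1.0 / len(proportions) for loc in proportions}
    return {loc: p / total for loc, p in proportions.items()}


# Hypothetical grouped records: 1 of 4 risky at A, 3 of 4 risky at B.
grouped_records = {
    "A": [{"label": "risky"}] + [{"label": "ordinary"}] * 3,
    "B": [{"label": "risky"}] * 3 + [{"label": "ordinary"}],
}
weights = risk_proportion_weights(grouped_records)
```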
In the above passenger rating model generation method, historical clearance records are grouped by clearance location, a location rating model is generated for each clearance location, and the location rating models are then combined into a passenger rating model. The resulting passenger rating model covers a wide range, which can improve the accuracy of passenger rating.
In one embodiment, combining the obtained location rating models to obtain the passenger rating model may include: receiving an input weight allocation instruction for the location rating models; obtaining the weight of each location rating model according to the weight allocation instruction; and calculating the passenger rating model as a weighted average of the location rating models and their corresponding weights.
In one embodiment, training on the grouped historical clearance records to obtain the corresponding location rating model may include: obtaining a current business rule and querying the supplementary feature parameters corresponding to the current business rule; calculating the supplementary feature parameters from the original feature parameters in the grouped first clearance records; and training on the supplementary feature parameters and the original feature parameters to obtain the corresponding location rating model.
Specifically, combining the obtained location rating models into the passenger rating model may involve user intervention. For example, the server may output each location rating model and display the clearance location corresponding to each model, and the user may assign a weight to each location rating model as needed. For example, if the clearance location is a first location, the user may set a larger weight for the location rating model of the first location and for the location rating model of a second location associated with the first location, and set smaller weights for the other location rating models. A second location associated with the first location may be one that has a business connection with the first location.
After receiving the weights that the user entered for the location rating models, the server calculates the passenger rating model as a weighted average of the location rating models and their corresponding weights, for example A (passenger rating model) = (A1*a1 + A2*a2 + A3*a3 + ... + AN*an) / N, where A1, A2, ..., AN are the location rating models and a1, a2, ..., an are the weights.
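The combination formula above can be sketched in code as follows. The sketch assumes that each location rating model is a callable returning a numeric risk score for a passenger's features; the source does not fix the form of a model's output, so that interface and the sample models are hypothetical.

```python
def combine_location_models(models, weights):
    """Build a passenger rating function A = (A1*a1 + ... + AN*an) / N.

    `models` are assumed to be callables mapping features to a numeric score.
    """
    if not models or len(models) != len(weights):
        raise ValueError("models and weights must be non-empty and the same length")
    n = len(models)

    def passenger_rating(features):
        # Weighted sum of the per-location scores, divided by the model count N.
        return sum(w * m(features) for m, w in zip(models, weights)) / n

    return passenger_rating


# Hypothetical constant-score models for illustration only.
def model_a(features):
    return 0.2


def model_b(features):
    return 0.8


rating = combine_location_models([model_a, model_b], [1.0, 3.0])
score = rating({})
```

The sketch divides by N as written in the text. If the weights are already normalized to sum to 1, dividing by the sum of the weights instead would keep the combined score on the same scale as the individual scores; that variant is an alternative not stated in the source.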
Specifically, before the location rating model is generated, the current business rule may be obtained and the supplementary feature parameters corresponding to it may be queried, so that the supplementary feature parameters can be calculated from the original feature parameters in the grouped first clearance records. The current business rules include passage features, inspection features, clearance information features, static frequency features, and dynamic frequency features. The generated supplementary feature parameters may accordingly include: 30 companion-type supplementary feature parameters, such as "as of 15 days ago, the number of companion trips within 7 days" and "as of 15 days ago, the average number of companion trips per person within 30 days"; 20 inspection-type supplementary feature parameters, such as "as of 30 days ago, the number of hits within 30 days" and "as of 30 days ago, the number of days between the most recent clearance and the most recent inspection"; 23 clearance-information-type supplementary feature parameters, such as "as of 15 days ago, the time of the most recent clearance" and "as of 30 days ago, the number of checkpoints used within 30 days"; 14 static frequency-type supplementary feature parameters, such as "as of 15 days ago, the number of clearances within 30 days" and "as of 30 days ago, the number of days with a clearance within 7 days"; and 34 dynamic frequency-type supplementary feature parameters, such as "as of 15 days ago, the minimum interval between the first and fifth clearances within 90 days" and "as of 30 days ago, the average interval between the first and fifth clearances within 30 days".
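A windowed count such as "as of 15 days ago, the number of clearances within 30 days" can be sketched as below. The interpretation of the window (the 30 days ending 15 days before a reference date, with the end date excluded) is an assumption, as are the function and parameter names.

```python
from datetime import date, timedelta


def count_clearances_in_window(clearance_dates, reference_date, offset_days, window_days):
    """Count clearances in the `window_days` ending `offset_days` before the reference date.

    Assumed window: [reference - offset - window, reference - offset).
    """
    window_end = reference_date - timedelta(days=offset_days)
    window_start = window_end - timedelta(days=window_days)
    return sum(1 for d in clearance_dates if window_start <= d < window_end)


# Hypothetical clearance dates for one passenger.
reference = date(2018, 7, 18)
dates = [date(2018, 6, 5), date(2018, 6, 20), date(2018, 7, 1), date(2018, 7, 10)]
count = count_clearances_in_window(dates, reference, offset_days=15, window_days=30)
```

Here the window runs from 2018-06-03 up to, but not including, 2018-07-03, so the clearance on 2018-07-10 is not counted.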
Optionally, when the current business rule obtained by the server is a companion rule, the number of companion trips of each passenger within a preset number of days is obtained from the clearance records, and this companion-trip count is used as a supplementary feature parameter. Clearance information features, static frequency features, and dynamic frequency features are handled in the same way, with the corresponding logic generated according to the business rules.
After generating the supplementary feature parameters, the server trains on the supplementary feature parameters and the original feature parameters to obtain the location rating model. Optionally, the server may train on different combinations of the supplementary and original feature parameters to obtain several candidate location rating models, input verification historical clearance records into each candidate to obtain the passengers' risk levels, and compare these with the risk levels in the historical clearance records to obtain the success rate of each candidate. The success rate is the number of verification records for which the risk level produced by the location rating model matches the risk level in the historical clearance record, divided by the total number of verification historical clearance records. The server selects the candidate with the highest success rate as the location rating model for that clearance location, and uses the features of that candidate as the training features.
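The selection by success rate can be sketched as follows. Candidate models are assumed to be callables that map a feature dict to a risk level; the candidate names, the feature `trips`, and the verification records are hypothetical.

```python
def success_rate(model, verification_records):
    """Fraction of verification records whose predicted level matches the recorded level."""
    if not verification_records:
        return 0.0
    hits = sum(1 for features, level in verification_records if model(features) == level)
    return hits / len(verification_records)


def select_best_model(candidates, verification_records):
    """Return the (name, model) pair with the highest success rate."""
    return max(candidates.items(), key=lambda item: success_rate(item[1], verification_records))


# Hypothetical verification records: (features, recorded risk level).
verification = [
    ({"trips": 1}, "low"),
    ({"trips": 9}, "high"),
    ({"trips": 5}, "high"),
]
# Hypothetical candidates standing in for models trained on different feature combinations.
candidates = {
    "threshold_3": lambda f: "high" if f["trips"] > 3 else "low",
    "threshold_7": lambda f: "high" if f["trips"] > 7 else "low",
}
best_name, best_model = select_best_model(candidates, verification)
```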
In the above embodiment, before the location rating model is trained, multiple supplementary feature parameters are first generated from the historical clearance records, which makes the generated location rating model more accurate. In addition, the user can configure the weight of each location rating model as needed, so that the generated passenger rating model covers a wide range, which can improve the accuracy of passenger rating.
In one embodiment, training on the grouped historical clearance records to obtain the corresponding location rating model may include: dividing the grouped historical clearance records into training set data and test set data; extracting first feature parameters from the training set data, performing feature gain evaluation based on the first feature parameters, and extracting target features from the first feature parameters according to the feature gain evaluation result; classifying the training set data according to the extracted target features and the passenger labels corresponding to the training set data to obtain an initial rating model, and calculating the rating level of each classification node in the initial rating model; extracting second feature parameters corresponding to the selected target features from the test set data; verifying the rating level of each node in the initial rating model using the second feature parameters and the passenger labels corresponding to the test set data to obtain a first verification result; and adjusting the initial rating model according to the first verification result to obtain the location rating model.
In one embodiment, the passenger rating model generation method may further include: when the update time of the historical clearance records is reached, loading the updated historical clearance records; extracting third feature parameters corresponding to the location rating model from the updated historical clearance records; verifying the rating level of each node in the location rating model according to the third feature parameters and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result; and optimizing the location rating model according to the second verification result.
Specifically, this embodiment is described using clearance location A as an example. The historical clearance records for clearance location A are divided into training set data and test set data, and first feature parameters and first target categories (corresponding to passenger rating levels) are extracted from the training set data. Feature information gain evaluation is performed on the first feature parameters, and feature selection, that is, selection of the target features, is performed according to the evaluation result. The training set data is then classified according to the extracted target features and the passenger labels corresponding to the training set data to obtain an initial rating model, and the rating level of each node in the initial decision tree model is calculated. Second feature parameters and second target categories are extracted from the test set data, and the rating level of each node in the initial decision tree model is verified against them. According to the first verification result, the decision tree structure in the initial rating model is optimized and adjusted to generate the location rating model. The first and second feature parameters in this embodiment are the supplementary feature parameters and original feature parameters mentioned in the above embodiment. The target categories are divided into multiple classes, namely the passenger rating levels, and the rating levels of the sample data are known in advance.
A decision tree is a tree structure composed of nodes and directed edges that is used to classify instances. There are two types of nodes: internal nodes and leaf nodes. An internal node represents a test condition on a feature or attribute, and a leaf node represents a class. Classification with a decision tree model proceeds as follows: starting from the root node, one feature of the instance is tested, and the instance is assigned to a child node according to the test result. Following that branch leads either to a leaf node or to another internal node; at an internal node, the new test condition is applied recursively until a leaf node is reached. The leaf node gives the final classification result.
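The traversal described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Node` class, the feature names `gender` and `age_band`, and the grade labels are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """A decision-tree node: internal nodes test a feature, leaf nodes hold a class."""
    feature: Optional[str] = None   # feature tested at an internal node
    children: Dict[object, "Node"] = field(default_factory=dict)  # test outcome -> child
    label: Optional[str] = None     # class label, set only on leaf nodes

    def is_leaf(self) -> bool:
        return self.label is not None


def classify(node: Node, instance: dict) -> str:
    """Apply one test per internal node, recursing until a leaf node is reached."""
    if node.is_leaf():
        return node.label
    outcome = instance[node.feature]            # test one feature of the instance
    return classify(node.children[outcome], instance)


# Hypothetical two-level tree, for illustration only.
tree = Node(feature="gender", children={
    0: Node(label="grade_1"),
    1: Node(feature="age_band", children={
        "young": Node(label="grade_2"),
        "old": Node(label="grade_3"),
    }),
})

print(classify(tree, {"gender": 1, "age_band": "old"}))
```

An instance whose feature value has no matching branch raises `KeyError` here; a real system would need a policy for unseen values, which the text does not specify.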
In this embodiment, the decision tree model uses the ID3 algorithm. Following the principle that a smaller decision tree is preferable to a larger one, features are evaluated and selected by information gain: each time, the feature with the largest information gain is selected as the test condition to create child nodes. Information gain measures the degree to which knowing feature X reduces the uncertainty about class Y. The information gain g(D,A) of feature A on the training data set D is defined as the difference between the empirical entropy H(D) of D and the empirical conditional entropy H(D|A) of D given feature A, that is,
g(D,A) = H(D) - H(D|A)    (1)
where g(D,A) is the information gain of feature A on the training data set D, H(D) is the empirical entropy of the training data set D, and H(D|A) is the empirical conditional entropy of D given feature A.
Feature selection by the information gain criterion works as follows: for the training data set (or a subset of it), calculate the information gain of each feature and select the feature with the largest information gain. The algorithm for calculating the information gain takes the training data set D and feature A as input and outputs the information gain g(D,A) of feature A on the training data set D.
First, calculate the empirical entropy H(D) of the data set D:
Figure PCTCN2018106083-appb-000001
where C_k is the number of samples corresponding to the k-th first target category and K is the number of first target categories. In this embodiment, the first target categories are the passenger rating levels.
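The formula image referenced above is not reproduced in this text. A reconstruction consistent with the symbols defined here, assuming the standard empirical entropy used with ID3, with |D| denoting the total number of samples and a base-2 logarithm (the base is an assumption), would be:

```latex
H(D) = -\sum_{k=1}^{K} \frac{C_k}{|D|} \log_2 \frac{C_k}{|D|}
```

For instance, with K = 2 rating levels and counts C_1 = C_2 = 2 out of |D| = 4 samples, this gives H(D) = 1 bit.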
Second, calculate the empirical conditional entropy H(D|A) of the data set D given feature A:
Figure PCTCN2018106083-appb-000002
where value(A) is the set of all values of feature A, i is one value of feature A, D_i is the subset of the training data set D in which feature A takes the value i, |D_i| is the number of samples in that subset, and |D| is the total number of samples before the split. For example, for the gender feature parameter, the values of feature A are male and female; if male is represented by 0 and female by 1, then value(A) is {0, 1}.
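The formula image for this step is likewise not reproduced here. Assuming the standard definition of empirical conditional entropy, and using only the symbols defined above, it can be written as:

```latex
H(D \mid A) = \sum_{i \in \mathrm{value}(A)} \frac{|D_i|}{|D|} \, H(D_i)
```

Here H(D_i) is the empirical entropy of the subset D_i, computed over the first target categories in the same way as H(D).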
Third, calculate the information gain:
g(D,A) = H(D) - H(D|A)    (1)
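The three steps above can be sketched in Python as follows. This is a minimal sketch that assumes base-2 logarithms and list-based inputs; the function names and the toy `gender`/`grade` data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Step 1: empirical entropy H(D), in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def conditional_entropy(values, labels):
    """Step 2: empirical conditional entropy H(D|A) for one feature column."""
    total = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)   # D_i for each value i of A
    return sum(len(s) / total * entropy(s) for s in subsets.values())


def information_gain(values, labels):
    """Step 3: g(D, A) = H(D) - H(D|A)."""
    return entropy(labels) - conditional_entropy(values, labels)


# Toy data: gender encoded as 0/1, two rating levels.
gender = [0, 0, 1, 1]
grade = ["low", "low", "high", "high"]
print(information_gain(gender, grade))  # 1.0: the feature separates the classes perfectly
```

A feature that splits every rating level evenly across its values, such as `[0, 1, 0, 1]` against the same labels, has an information gain of 0 and would not be selected.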
The server extracts the second feature parameter and the second target category from each sample of the test set data, one by one. The second feature parameter is of the same kind as the first feature parameter described above; optionally, it may be the selected feature with the largest information gain, that is, the target feature described above, which is not repeated here. The second target category is the category of the security inspection result, that is, the passenger rating level.
When the preset rule model is verified with the test set data and the deviation of the verification result is too large, the selected feature parameters may be adjusted (for example, by adjusting the statistics), and the decision tree of the preset rule model is rebuilt and verified again until the first verification result is within the error range. Alternatively, the feature selection of the branch nodes may be adjusted starting from the root node to optimize the decision tree. During adjustment, measures such as increasing the amount of training set data may be used until the first verification result of the optimized decision tree is within the error range.
Obtaining the most discriminative feature (field) according to the feature information evaluation result may include: calculating the information gain of each feature parameter corresponding to the first feature parameter; selecting the feature with the largest information gain as the test condition to create child nodes; and dividing the training set data corresponding to each child node into subset data and branching on the subset data recursively until the data at every branch node belongs to the same target category. The decision tree is built recursively by successively partitioning the training records into purer subsets. Let D_t be the set of training records associated with node t, and let y = {y_1, y_2, …, y_c} be the class labels. Hunt's algorithm is defined recursively as follows. If all records in D_t belong to the same class y_t, then t is a leaf node labeled y_t. If D_t contains records belonging to more than one class, an attribute test condition is selected to partition the records into smaller subsets. A child node is created for each outcome of the test condition, and the records in D_t are distributed to the child nodes according to the test results. The algorithm is then called recursively on each child node.
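The recursive construction described above can be sketched as follows, using information gain as the attribute test selection rule. The dictionary-based tree representation, the feature names, and the majority-class fallback when no features remain are assumptions made for this sketch; the text does not specify them.

```python
import math
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    total = len(rows)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    conditional = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - conditional


def build_tree(rows, labels, features):
    """Return a class label (leaf) or {"feature": name, "children": {value: subtree}}."""
    if len(set(labels)) == 1:        # all records in D_t share one class: leaf node
        return labels[0]
    if not features:                 # assumption: fall back to the majority class
        return Counter(labels).most_common(1)[0][0]
    # Select the feature with the largest information gain as the test condition.
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    remaining = [f for f in features if f != best]
    children = {}
    for value in sorted({row[best] for row in rows}, key=str):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        # One child per test outcome, built recursively on the matching subset.
        children[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx], remaining)
    return {"feature": best, "children": children}


# Hypothetical training records and rating levels.
rows = [
    {"gender": 0, "frequent": "yes"},
    {"gender": 0, "frequent": "no"},
    {"gender": 1, "frequent": "yes"},
    {"gender": 1, "frequent": "no"},
]
labels = ["high", "low", "high", "low"]
tree = build_tree(rows, labels, ["gender", "frequent"])
print(tree)
```

In this toy data `gender` carries no information about the label while `frequent` separates the classes completely, so the root test is on `frequent` and both children are leaves.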
Based on the second feature parameter and the second target category of each sample in the test data set, the server counts the negative sample data in the test data set that matches the feature parameter combination corresponding to each classification node in the preset rule model, calculates the proportion of the counted negative sample data in the total negative sample data of the test data set, and verifies the decision tree according to the calculated proportion. During verification, the server may set a preset tolerance: when the calculated absolute difference is less than the preset tolerance, the verification passes; when the calculated absolute difference is greater than the preset tolerance, the verification fails. When the verification fails, the server may add the sample data of the test data set to the training data set, enlarging the sample size used to train the preset rule model, and adjust the preset rule model.
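A possible reading of this check is sketched below. It assumes that each classification node stores the share of negative samples observed for it during training, and that verification compares that stored share with the share observed in the test set; the text does not state exactly which two quantities are compared, so `expected_share`, `node_of`, and the sample layout are hypothetical.

```python
def verify_nodes(expected_share, test_samples, node_of, tolerance):
    """Check each node's share of negative test samples against its expected share.

    expected_share: node id -> share of negative samples recorded at training time
    test_samples:   list of (features, is_negative) pairs
    node_of:        function mapping a feature dict to the node id it falls into
    tolerance:      preset tolerance on the absolute difference
    """
    negatives = [features for features, is_negative in test_samples if is_negative]
    results = {}
    for node_id, expected in expected_share.items():
        matched = sum(1 for features in negatives if node_of(features) == node_id)
        observed = matched / len(negatives) if negatives else 0.0
        results[node_id] = abs(observed - expected) < tolerance   # pass / fail
    return results


# Hypothetical test set: four negative samples (three in node A, one in B), one positive.
samples = [
    ({"node": "A"}, True),
    ({"node": "A"}, True),
    ({"node": "A"}, True),
    ({"node": "B"}, True),
    ({"node": "B"}, False),
]
result = verify_nodes({"A": 0.75, "B": 0.5}, samples, lambda f: f["node"], tolerance=0.1)
print(result)
```

Node A passes because its observed share (0.75) matches the expected share, while node B fails because the observed 0.25 differs from the expected 0.5 by more than the tolerance; a failure would trigger the retraining described above.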
In one embodiment, after the place rating models are generated from the decision tree model as described above, the server generates the passenger rating model from the place rating models, so that the passenger rating model can be put to use. Optionally, to ensure the correctness of the passenger rating model, the server may preset a historical clearance record update time, which may be the time at which the clearance records of the security site are updated. When the historical clearance record update time is reached, the server loads the updated historical clearance records. A historical clearance record contains several basic fields, including name, age, gender, and clearance time. The security terminal may send the updated historical clearance records to the server actively or passively.
The server extracts the third feature parameter and the passenger risk level from the historical clearance records. The third feature parameter corresponds to the features set in the place rating models from which the passenger rating model is generated, that is, to the target features; the passenger risk level is the security inspection result label.
Based on the third feature parameter and the passenger risk level in the historical clearance records, the server counts the negative sample data in the historical clearance records that matches the feature parameter combination corresponding to each classification node in each place rating model, calculates the proportion of the counted negative sample data in the total negative sample data of the historical clearance records, and verifies the passenger level of each classification node in each place rating model according to the calculated proportion. During verification, the server may set a preset deviation: when the absolute difference between the calculated proportion and the proportion of negative samples for that passenger level is less than the preset deviation, the verification passes; when the absolute difference is greater than the preset deviation, the verification fails. When the verification fails, the server may continue to train and adjust the passenger rating model with the historical clearance records, so that the place rating models are continuously optimized based on the historical clearance records and the passenger rating model is optimized in turn. Through training on large amounts of data, the passenger levels obtained from the passenger rating model become increasingly accurate.
In the above embodiment, the place rating model is generated by a decision tree model, which improves accuracy.
It should be understood that although the steps in the flowchart of FIG. 2 are shown in sequence as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution of these steps is not strictly ordered, and the steps may be performed in other orders. Moreover, at least some of the steps in FIG. 2 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be performed at different times, and they are not necessarily performed sequentially but may be performed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 3, a passenger rating model generation apparatus is provided, including a historical clearance record acquisition module 100, a grouping module 200, a place rating model generation module 300, and a passenger rating model generation module 400, wherein:
The historical clearance record acquisition module 100 is configured to acquire historical clearance records, each of which carries a risk passenger label or an ordinary passenger label.
The grouping module 200 is configured to group the historical clearance records by clearance location.
The place rating model generation module 300 is configured to train on each group of historical clearance records separately to obtain a corresponding place rating model.
The passenger rating model generation module 400 is configured to combine the obtained place rating models to obtain the passenger rating model.
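As an illustration of the grouping step performed by the grouping module 200, the following sketch groups records by clearance location before any per-location training. The record layout and the field names `place` and `label` are hypothetical; the application does not define a concrete record format.

```python
from collections import defaultdict


def group_by_place(records):
    """Group historical clearance records by their clearance location."""
    groups = defaultdict(list)
    for record in records:
        groups[record["place"]].append(record)
    return dict(groups)


# Hypothetical records carrying a risk or ordinary passenger label.
records = [
    {"place": "A", "age": 31, "label": "ordinary"},
    {"place": "B", "age": 45, "label": "risk"},
    {"place": "A", "age": 52, "label": "risk"},
]
groups = group_by_place(records)
print({place: len(items) for place, items in groups.items()})
```

Each value in `groups` would then be passed to the training routine for that location, yielding one place rating model per clearance location.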
In one embodiment, the passenger rating model generation module 400 includes:
A receiving unit, configured to receive an input weight allocation instruction for the place rating models.
A weight acquisition unit, configured to obtain the weights of the place rating models according to the weight allocation instruction.
A first calculation unit, configured to calculate the passenger rating model as a weighted average of the place rating models using the corresponding weights.
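The weighted-average combination can be illustrated as follows. The sketch assumes that each place rating model yields a numeric score for a passenger and that the weights come from the weight allocation instruction; the function name, the score scale, and the normalization by the weight sum are assumptions, not details given in the application.

```python
def combine_place_ratings(place_scores, weights):
    """Weighted average of per-location scores for one passenger."""
    total_weight = sum(weights[place] for place in place_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(place_scores[place] * weights[place] for place in place_scores)
    return weighted_sum / total_weight


# Hypothetical scores from two place rating models and their assigned weights.
scores = {"A": 80, "B": 60}
weights = {"A": 3, "B": 1}
print(combine_place_ratings(scores, weights))  # 75.0
```

Dividing by the sum of the weights means the weights need not already add up to 1, which keeps the sketch usable with whatever scale the weight allocation instruction supplies.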
In one embodiment, the place rating model generation module 300 includes:
A business rule acquisition unit, configured to acquire the current business rule and query the supplementary feature parameters corresponding to the current business rule.
A second calculation unit, configured to calculate the supplementary feature parameters from the original feature parameters in each group of first clearance records.
A place rating model generation unit, configured to train on the supplementary feature parameters and the original feature parameters to obtain the corresponding place rating model.
In one embodiment, the place rating model generation module 300 includes:
A dividing unit, configured to divide the grouped historical clearance records into training set data and test set data.
A target feature extraction unit, configured to extract a first feature parameter from the training set data, perform feature gain evaluation according to the first feature parameter, and extract a target feature from the first feature parameter according to the feature gain evaluation result.
An initial rating model generation unit, configured to classify the training set data according to the extracted target feature and the passenger labels corresponding to the training set data to obtain an initial rating model, and to calculate the rating level of each classification node in the initial rating model.
A feature extraction unit, configured to extract, from the test set data, a second feature parameter corresponding to the selected target feature.
A verification unit, configured to verify the rating level of each node in the initial rating model using the second feature parameter and the passenger labels corresponding to the test set data to obtain a first verification result.
An adjustment unit, configured to adjust the initial rating model according to the first verification result to obtain the place rating model.
In one embodiment, the apparatus further includes:
A loading module, configured to load the updated historical clearance records when the historical clearance record update time is reached.
A feature extraction module, configured to extract, from the updated historical clearance records, a third feature parameter corresponding to the place rating model.
A verification module, configured to verify the rating level of each node in the place rating model according to the third feature parameter and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result.
An optimization module, configured to optimize the place rating model according to the second verification result.
For specific limitations on the passenger rating model generation apparatus, refer to the limitations on the passenger rating model generation method above, which are not repeated here. Each module in the passenger rating model generation apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 4. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device stores the historical clearance records. The network interface of the computer device communicates with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement a passenger rating model generation method.
Those skilled in the art will understand that the structure shown in FIG. 4 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps: acquiring historical clearance records, each of which carries a risk passenger label or an ordinary passenger label; grouping the historical clearance records by clearance location; training on each group of historical clearance records separately to obtain a corresponding place rating model; and combining the obtained place rating models to obtain a passenger rating model.
In one embodiment, the combining of the obtained place rating models to obtain the passenger rating model, as implemented when the processor executes the computer-readable instructions, may include: receiving an input weight allocation instruction for the place rating models; obtaining the weights of the place rating models according to the weight allocation instruction; and calculating the passenger rating model as a weighted average of the place rating models using the corresponding weights.
In one embodiment, the training on the grouped historical clearance records to obtain the corresponding place rating model, as implemented when the processor executes the computer-readable instructions, may include: acquiring the current business rule and querying the supplementary feature parameters corresponding to the current business rule; calculating the supplementary feature parameters from the original feature parameters in each group of first clearance records; and training on the supplementary feature parameters and the original feature parameters to obtain the corresponding place rating model.
In one embodiment, the training on the grouped historical clearance records to obtain the corresponding place rating model, as implemented when the processor executes the computer-readable instructions, may include: dividing the grouped historical clearance records into training set data and test set data; extracting a first feature parameter from the training set data, performing feature gain evaluation according to the first feature parameter, and extracting a target feature from the first feature parameter according to the feature gain evaluation result; classifying the training set data according to the extracted target feature and the passenger labels corresponding to the training set data to obtain an initial rating model, and calculating the rating level of each classification node in the initial rating model; extracting, from the test set data, a second feature parameter corresponding to the selected target feature; verifying the rating level of each node in the initial rating model using the second feature parameter and the passenger labels corresponding to the test set data to obtain a first verification result; and adjusting the initial rating model according to the first verification result to obtain the place rating model.
In one embodiment, the processor further implements the following steps when executing the computer-readable instructions: when the historical clearance record update time is reached, loading the updated historical clearance records; extracting, from the updated historical clearance records, a third feature parameter corresponding to the place rating model; verifying the rating level of each node in the place rating model according to the third feature parameter and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result; and optimizing the place rating model according to the second verification result.
One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: acquiring historical clearance records, each of which carries a risk passenger label or an ordinary passenger label; grouping the historical clearance records by clearance location; training on each group of historical clearance records separately to obtain a corresponding place rating model; and combining the obtained place rating models to obtain a passenger rating model.
In one embodiment, the combining of the obtained place rating models to obtain the passenger rating model, as implemented when the computer-readable instructions are executed by the processor, may include: receiving an input weight allocation instruction for the place rating models; obtaining the weights of the place rating models according to the weight allocation instruction; and calculating the passenger rating model as a weighted average of the place rating models using the corresponding weights.
In one embodiment, the training on the grouped historical clearance records to obtain the corresponding place rating model, as implemented when the computer-readable instructions are executed by the processor, may include: acquiring the current business rule and querying the supplementary feature parameters corresponding to the current business rule; calculating the supplementary feature parameters from the original feature parameters in each group of first clearance records; and training on the supplementary feature parameters and the original feature parameters to obtain the corresponding place rating model.
In one embodiment, the training on the grouped historical clearance records to obtain the corresponding place rating model, as implemented when the computer-readable instructions are executed by the processor, may include: dividing the grouped historical clearance records into training set data and test set data; extracting a first feature parameter from the training set data, performing feature gain evaluation according to the first feature parameter, and extracting a target feature from the first feature parameter according to the feature gain evaluation result; classifying the training set data according to the extracted target feature and the passenger labels corresponding to the training set data to obtain an initial rating model, and calculating the rating level of each classification node in the initial rating model; extracting, from the test set data, a second feature parameter corresponding to the selected target feature; verifying the rating level of each node in the initial rating model using the second feature parameter and the passenger labels corresponding to the test set data to obtain a first verification result; and adjusting the initial rating model according to the first verification result to obtain the place rating model.
In one embodiment, the following steps are further implemented when the computer-readable instructions are executed by the processor: when the historical clearance record update time is reached, loading the updated historical clearance records; extracting, from the updated historical clearance records, a third feature parameter corresponding to the place rating model; verifying the rating level of each node in the place rating model according to the third feature parameter and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result; and optimizing the place rating model according to the second verification result.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by computer-readable instructions that instruct the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
以上实施例的各技术特征可以进行任意的组合,为使描述简洁,未对上述实施例中的各个技术特征所有可能的组合都进行描述,然而,只要这些技术特征的组合不存在矛盾,都应当认为是本说明书记载的范围。The technical features of the above embodiments can be arbitrarily combined. In order to make the description concise, all possible combinations of the technical features in the above embodiments have not been described. However, as long as there is no contradiction in the combination of these technical features, they should be It is considered to be the range described in this specification.
The above embodiments express only several implementations of the present application. Their descriptions are relatively specific and detailed, but they should not be understood as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art can make a number of variations and improvements without departing from the concept of the present application, and all of these fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims (20)

  1. A method for generating a passenger rating model, comprising:
    obtaining historical clearance records, each historical clearance record carrying a risk passenger label or an ordinary passenger label;
    grouping the historical clearance records by clearance location;
    training on each group of historical clearance records separately to obtain a corresponding place rating model; and
    combining the obtained place rating models to obtain a passenger rating model.
  2. The method according to claim 1, wherein combining the obtained place rating models to obtain the passenger rating model comprises:
    receiving an input weight allocation instruction for the place rating models;
    obtaining the weight of each place rating model according to the weight allocation instruction; and
    calculating the passenger rating model from the place rating models and their corresponding weights by weighted averaging.
  3. The method according to claim 1, wherein training on the grouped historical clearance records to obtain the corresponding place rating model comprises:
    obtaining a current business rule, and querying the supplementary feature parameters corresponding to the current business rule;
    calculating the supplementary feature parameters separately from the original feature parameters in each group of the first clearance records; and
    training on the supplementary feature parameters and the original feature parameters to obtain the corresponding place rating model.
  4. The method according to any one of claims 1 to 3, wherein training on the grouped historical clearance records to obtain the corresponding place rating model comprises:
    dividing the grouped historical clearance records into training set data and test set data;
    extracting first feature parameters from the training set data, performing a feature gain evaluation on the first feature parameters, and extracting target features from the first feature parameters according to the result of the feature gain evaluation;
    classifying the training set data according to the extracted target features and the passenger labels corresponding to the training set data to obtain an initial rating model, and calculating a rating level for each classification node in the initial rating model;
    extracting, from the test set data, second feature parameters corresponding to the selected target features;
    verifying the rating level of each node in the initial rating model using the second feature parameters and the passenger labels corresponding to the test set data to obtain a first verification result; and
    adjusting the initial rating model according to the first verification result to obtain the place rating model.
  5. The method according to claim 4, further comprising:
    loading updated historical clearance records when a historical clearance record update time is reached;
    extracting, from the updated historical clearance records, third feature parameters corresponding to the place rating model;
    verifying the rating level of each node in the place rating model according to the third feature parameters and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result; and
    optimizing the place rating model according to the second verification result.
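The grouping, per-place training, and weighted-average combination recited in claims 1 and 2 can be illustrated with a minimal sketch. It is not the applicant's implementation: the record fields (`place`, `purpose`, `label`), the toy per-place "model" (an observed risk rate per feature value), and the weights are all assumptions made for illustration only.

```python
from collections import defaultdict

def group_by_place(records):
    """Group labelled clearance records by their clearance location."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["place"]].append(rec)
    return dict(groups)

def train_place_model(records, feature="purpose"):
    """Toy per-place model (assumption): risk rate for each value of one feature."""
    counts = defaultdict(lambda: [0, 0])  # value -> [risk count, total count]
    for rec in records:
        counts[rec[feature]][0] += rec["label"] == "risk"
        counts[rec[feature]][1] += 1
    rates = {value: risk / total for value, (risk, total) in counts.items()}
    overall = sum(r["label"] == "risk" for r in records) / len(records)
    # Unseen feature values fall back to the place-wide risk rate.
    return lambda passenger: rates.get(passenger[feature], overall)

def combine_models(place_models, weights):
    """Combine place models into one passenger model by weighted average."""
    total = sum(weights[place] for place in place_models)
    def passenger_model(passenger):
        return sum(weights[p] * m(passenger) for p, m in place_models.items()) / total
    return passenger_model

# Hypothetical labelled history for two clearance locations, "A" and "B".
records = [
    {"place": "A", "purpose": "tourism", "label": "ordinary"},
    {"place": "A", "purpose": "tourism", "label": "risk"},
    {"place": "A", "purpose": "business", "label": "ordinary"},
    {"place": "B", "purpose": "tourism", "label": "ordinary"},
    {"place": "B", "purpose": "business", "label": "risk"},
]
place_models = {p: train_place_model(rs) for p, rs in group_by_place(records).items()}
# The weights stand in for the result of the weight allocation instruction.
passenger_model = combine_models(place_models, weights={"A": 3.0, "B": 1.0})
score = passenger_model({"purpose": "tourism"})
```

Here the weights are normalised by their sum, so they do not need to add up to 1; any real per-place model exposing a scoring function could replace the toy one.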
  6. An apparatus for generating a passenger rating model, comprising:
    a historical clearance record acquisition module, configured to obtain historical clearance records, each historical clearance record carrying a risk passenger label or an ordinary passenger label;
    a grouping module, configured to group the historical clearance records by clearance location;
    a place rating model generation module, configured to train on each group of historical clearance records separately to obtain a corresponding place rating model; and
    a passenger rating model generation module, configured to combine the obtained place rating models to obtain a passenger rating model.
  7. The apparatus according to claim 6, wherein the passenger rating model generation module comprises:
    a receiving unit, configured to receive an input weight allocation instruction for the place rating models;
    a weight acquisition unit, configured to obtain the weight of each place rating model according to the weight allocation instruction; and
    a first calculation unit, configured to calculate the passenger rating model from the place rating models and their corresponding weights by weighted averaging.
  8. The apparatus according to claim 6, wherein the place rating model generation module comprises:
    a business rule acquisition unit, configured to obtain a current business rule and query the supplementary feature parameters corresponding to the current business rule;
    a second calculation unit, configured to calculate the supplementary feature parameters separately from the original feature parameters in each group of the first clearance records; and
    a place rating model generation unit, configured to train on the supplementary feature parameters and the original feature parameters to obtain the corresponding place rating model.
  9. The apparatus according to any one of claims 1 to 3, wherein the place rating model generation module comprises:
    a dividing unit, configured to divide the grouped historical clearance records into training set data and test set data;
    a target feature extraction unit, configured to extract first feature parameters from the training set data, perform a feature gain evaluation on the first feature parameters, and extract target features from the first feature parameters according to the result of the feature gain evaluation;
    an initial rating model generation unit, configured to classify the training set data according to the extracted target features and the passenger labels corresponding to the training set data to obtain an initial rating model, and to calculate a rating level for each classification node in the initial rating model;
    a feature extraction unit, configured to extract, from the test set data, second feature parameters corresponding to the selected target features;
    a verification unit, configured to verify the rating level of each node in the initial rating model using the second feature parameters and the passenger labels corresponding to the test set data to obtain a first verification result; and
    an adjustment unit, configured to adjust the initial rating model according to the first verification result to obtain the place rating model.
  10. The apparatus according to claim 9, further comprising:
    a loading module, configured to load updated historical clearance records when a historical clearance record update time is reached;
    a feature extraction module, configured to extract, from the updated historical clearance records, third feature parameters corresponding to the place rating model; and
    an optimization module, configured to verify the rating level of each node in the place rating model according to the third feature parameters and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result, and
    to optimize the place rating model according to the second verification result.
  11. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    obtaining historical clearance records, each historical clearance record carrying a risk passenger label or an ordinary passenger label;
    grouping the historical clearance records by clearance location;
    training on each group of historical clearance records separately to obtain a corresponding place rating model; and
    combining the obtained place rating models to obtain a passenger rating model.
  12. The computer device according to claim 11, wherein the combining of the obtained place rating models to obtain the passenger rating model, as performed when the processor executes the computer-readable instructions, comprises:
    receiving an input weight allocation instruction for the place rating models;
    obtaining the weight of each place rating model according to the weight allocation instruction; and
    calculating the passenger rating model from the place rating models and their corresponding weights by weighted averaging.
  13. The computer device according to claim 11, wherein the training on the grouped historical clearance records to obtain the corresponding place rating model, as performed when the processor executes the computer-readable instructions, comprises:
    obtaining a current business rule, and querying the supplementary feature parameters corresponding to the current business rule;
    calculating the supplementary feature parameters separately from the original feature parameters in each group of the first clearance records; and
    training on the supplementary feature parameters and the original feature parameters to obtain the corresponding place rating model.
  14. The computer device according to any one of claims 11 to 13, wherein the training on the grouped historical clearance records to obtain the corresponding place rating model, as performed when the processor executes the computer-readable instructions, comprises:
    dividing the grouped historical clearance records into training set data and test set data;
    extracting first feature parameters from the training set data, performing a feature gain evaluation on the first feature parameters, and extracting target features from the first feature parameters according to the result of the feature gain evaluation;
    classifying the training set data according to the extracted target features and the passenger labels corresponding to the training set data to obtain an initial rating model, and calculating a rating level for each classification node in the initial rating model;
    extracting, from the test set data, second feature parameters corresponding to the selected target features;
    verifying the rating level of each node in the initial rating model using the second feature parameters and the passenger labels corresponding to the test set data to obtain a first verification result; and
    adjusting the initial rating model according to the first verification result to obtain the place rating model.
  15. The computer device according to claim 14, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    loading updated historical clearance records when a historical clearance record update time is reached;
    extracting, from the updated historical clearance records, third feature parameters corresponding to the place rating model;
    verifying the rating level of each node in the place rating model according to the third feature parameters and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result; and
    optimizing the place rating model according to the second verification result.
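The supplementary-feature step recited in claim 13 (look up the features tied to the current business rule, then derive them from the original feature parameters) might look roughly like the sketch below. The rule names, field names, and derived features are hypothetical; the claims do not name any specific rule or feature.

```python
from datetime import date

# Hypothetical registry: business rule -> supplementary features, each derived
# from original feature parameters already present in a clearance record.
SUPPLEMENTARY_FEATURES = {
    "frequent_crossing": {
        "crossings_per_month": lambda r: r["crossings_last_90_days"] / 3,
    },
    "short_stay": {
        "stay_days": lambda r: (r["exit_date"] - r["entry_date"]).days,
        "is_same_day_return": lambda r: r["exit_date"] == r["entry_date"],
    },
}

def add_supplementary_features(record, current_rule):
    """Return a copy of the record extended with the rule's derived features."""
    derived = SUPPLEMENTARY_FEATURES.get(current_rule, {})
    extended = dict(record)  # leave the original record untouched
    for name, compute in derived.items():
        extended[name] = compute(record)
    return extended

# Hypothetical original feature parameters of one clearance record.
record = {
    "crossings_last_90_days": 12,
    "entry_date": date(2018, 7, 1),
    "exit_date": date(2018, 7, 4),
}
extended = add_supplementary_features(record, "short_stay")
```

Both the original and the derived fields in `extended` would then be passed to training, matching the last step of the claim.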
  16. One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
    obtaining historical clearance records, each historical clearance record carrying a risk passenger label or an ordinary passenger label;
    grouping the historical clearance records by clearance location;
    training on each group of historical clearance records separately to obtain a corresponding place rating model; and
    combining the obtained place rating models to obtain a passenger rating model.
  17. The storage medium according to claim 16, wherein the combining of the obtained place rating models to obtain the passenger rating model, as performed when the computer-readable instructions are executed by the processor, comprises:
    receiving an input weight allocation instruction for the place rating models;
    obtaining the weight of each place rating model according to the weight allocation instruction; and
    calculating the passenger rating model from the place rating models and their corresponding weights by weighted averaging.
  18. The storage medium according to claim 16, wherein the training on the grouped historical clearance records to obtain the corresponding place rating model, as performed when the computer-readable instructions are executed by the processor, comprises:
    obtaining a current business rule, and querying the supplementary feature parameters corresponding to the current business rule;
    calculating the supplementary feature parameters separately from the original feature parameters in each group of the first clearance records; and
    training on the supplementary feature parameters and the original feature parameters to obtain the corresponding place rating model.
  19. The storage medium according to any one of claims 16 to 18, wherein the training on the grouped historical clearance records to obtain the corresponding place rating model, as performed when the computer-readable instructions are executed by the processor, comprises:
    dividing the grouped historical clearance records into training set data and test set data;
    extracting first feature parameters from the training set data, performing a feature gain evaluation on the first feature parameters, and extracting target features from the first feature parameters according to the result of the feature gain evaluation;
    classifying the training set data according to the extracted target features and the passenger labels corresponding to the training set data to obtain an initial rating model, and calculating a rating level for each classification node in the initial rating model;
    extracting, from the test set data, second feature parameters corresponding to the selected target features;
    verifying the rating level of each node in the initial rating model using the second feature parameters and the passenger labels corresponding to the test set data to obtain a first verification result; and
    adjusting the initial rating model according to the first verification result to obtain the place rating model.
  20. The storage medium according to claim 19, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed:
    loading updated historical clearance records when a historical clearance record update time is reached;
    extracting, from the updated historical clearance records, third feature parameters corresponding to the place rating model;
    verifying the rating level of each node in the place rating model according to the third feature parameters and the passenger labels corresponding to the updated historical clearance records to obtain a second verification result; and
    optimizing the place rating model according to the second verification result.
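
The training steps recited in claim 19 (feature gain evaluation, classification into nodes, per-node rating, and verification against held-out labels) can be sketched as follows. This is a simplified illustration under stated assumptions: "feature gain" is taken to be information gain, the classifier is a single-level split on the best feature, a node's rating is its observed risk rate, and the field names and sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, feature):
    """Entropy reduction obtained by splitting rows on one categorical feature."""
    by_value = defaultdict(list)
    for row in rows:
        by_value[row[feature]].append(row["label"])
    remainder = sum(len(ls) / len(rows) * entropy(ls) for ls in by_value.values())
    return entropy([row["label"] for row in rows]) - remainder

def node_ratings(rows, feature):
    """Risk rate per node, where each node is one value of the split feature."""
    stats = defaultdict(lambda: [0, 0])  # value -> [risk count, total count]
    for row in rows:
        stats[row[feature]][0] += row["label"] == "risk"
        stats[row[feature]][1] += 1
    return {value: risk / total for value, (risk, total) in stats.items()}

# Hypothetical training and test splits for one clearance location.
train = [
    {"purpose": "tourism", "luggage": "light", "label": "ordinary"},
    {"purpose": "tourism", "luggage": "heavy", "label": "ordinary"},
    {"purpose": "business", "luggage": "light", "label": "risk"},
    {"purpose": "business", "luggage": "heavy", "label": "risk"},
]
test = [
    {"purpose": "tourism", "luggage": "light", "label": "ordinary"},
    {"purpose": "business", "luggage": "heavy", "label": "risk"},
    {"purpose": "business", "luggage": "light", "label": "ordinary"},
]

# Feature gain evaluation on the training set, then pick the target feature.
gains = {f: information_gain(train, f) for f in ("purpose", "luggage")}
target = max(gains, key=gains.get)

# Initial rating model: one rating per node, estimated from training data.
initial = node_ratings(train, target)

# Verification: compare each node's rating with the rate seen on the test set.
observed = node_ratings(test, target)
deviation = {node: abs(initial[node] - observed.get(node, initial[node]))
             for node in initial}
```

How a large deviation should change the model is left open in this sketch; one plausible policy, again an assumption, is to re-estimate a node's rating from the pooled training and test records whenever its deviation exceeds a tolerance. The same comparison could be rerun on newly loaded records for the periodic check in claim 20.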
PCT/CN2018/106083 2018-07-18 2018-09-18 Passenger rating model generation method and apparatus, and computer device and storage medium WO2020015140A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810788417.7 2018-07-18
CN201810788417.7A CN109102159B (en) 2018-07-18 2018-07-18 Passenger rating model generation method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2020015140A1 true WO2020015140A1 (en) 2020-01-23

Family

ID=64846639

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/106083 WO2020015140A1 (en) 2018-07-18 2018-09-18 Passenger rating model generation method and apparatus, and computer device and storage medium

Country Status (2)

Country Link
CN (1) CN109102159B (en)
WO (1) WO2020015140A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111831904A (en) * 2020-06-18 2020-10-27 天讯瑞达通信技术有限公司 Passenger behavior data analysis method and system
CN115001771A (en) * 2022-05-25 2022-09-02 武汉极意网络科技有限公司 Verification code defense method, system, equipment and storage medium based on automatic updating

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598995B (en) * 2019-08-15 2023-08-25 中国平安人寿保险股份有限公司 Smart client rating method, smart client rating device and computer readable storage medium
CN111352171B (en) * 2020-03-30 2023-01-24 重庆特斯联智慧科技股份有限公司 Method and system for realizing artificial intelligence regional shielding security inspection
CN113052689B (en) * 2021-04-30 2024-03-26 中国银行股份有限公司 Product recommendation method and device based on decision tree

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106570631A (en) * 2016-10-28 2017-04-19 南京邮电大学 Method and system of facing P2P platform operation risk estimation
CN106874951A (en) * 2017-02-14 2017-06-20 Tcl集团股份有限公司 A kind of passenger's attention rate ranking method and device
CN107590569A (en) * 2017-09-25 2018-01-16 山东浪潮云服务信息科技有限公司 A kind of data predication method and device
CN108269012A (en) * 2018-01-12 2018-07-10 中国平安人寿保险股份有限公司 Construction method, device, storage medium and the terminal of risk score model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2014363194C1 (en) * 2013-12-11 2021-01-07 Skyscanner Limited Method and server for providing fare availabilities, such as air fare availabilities
CN108076018A (en) * 2016-11-16 2018-05-25 阿里巴巴集团控股有限公司 Identity authorization system, method, apparatus and account authentication method
CN107194412A (en) * 2017-04-20 2017-09-22 百度在线网络技术(北京)有限公司 A kind of method of processing data, device, equipment and computer-readable storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111831904A (en) * 2020-06-18 2020-10-27 天讯瑞达通信技术有限公司 Passenger behavior data analysis method and system
CN115001771A (en) * 2022-05-25 2022-09-02 武汉极意网络科技有限公司 Verification code defense method, system, equipment and storage medium based on automatic updating
CN115001771B (en) * 2022-05-25 2024-01-26 武汉极意网络科技有限公司 Verification code defending method, system, equipment and storage medium based on automatic updating

Also Published As

Publication number Publication date
CN109102159B (en) 2023-06-20
CN109102159A (en) 2018-12-28

Similar Documents

Publication Publication Date Title
WO2020015140A1 (en) Passenger rating model generation method and apparatus, and computer device and storage medium
WO2020015089A1 (en) Identity information risk assessment method and apparatus, and computer device and storage medium
CN109598095B (en) Method and device for establishing scoring card model, computer equipment and storage medium
CN111783840A (en) Visualization method and device for random forest model and storage medium
CN109858737B (en) Grading model adjustment method and device based on model deployment and computer equipment
CN109087079B (en) Digital currency transaction information analysis method
CN109002988B (en) Risk passenger flow prediction method, apparatus, computer device and storage medium
CN112330685B (en) Image segmentation model training method, image segmentation device and electronic equipment
CN110009225B (en) Risk assessment system construction method, risk assessment system construction device, computer equipment and storage medium
CN109063984B (en) Method, apparatus, computer device and storage medium for risky travelers
CN112270686B (en) Image segmentation model training method, image segmentation device and electronic equipment
CN112465043B (en) Model training method, device and equipment
CN109325118B (en) Unbalanced sample data preprocessing method and device and computer equipment
WO2020034801A1 (en) Medical feature screening method and apparatus, computer device, and storage medium
CN110135943B (en) Product recommendation method, device, computer equipment and storage medium
CN112580902B (en) Object data processing method and device, computer equipment and storage medium
CN113822315A (en) Attribute graph processing method and device, electronic equipment and readable storage medium
CN111797320A (en) Data processing method, device, equipment and storage medium
CN111210158A (en) Target address determination method and device, computer equipment and storage medium
CN111177500A (en) Data object classification method and device, computer equipment and storage medium
CN111079175B (en) Data processing method, data processing device, computer readable storage medium and computer equipment
CN113283973A (en) Account checking difference data processing method and device, computer equipment and storage medium
CN112926616B (en) Image matching method and device, electronic equipment and computer readable storage medium
WO2023049280A1 (en) Systems and methods to screen a predictive model for risks of the predictive model
CN115705511A (en) Method, device and equipment for determining pickup area and storage medium

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18926908; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 18926908; Country of ref document: EP; Kind code of ref document: A1)