WO2020015139A1 - Risk passenger method, device, computer equipment, and storage medium - Google Patents

Risk passenger method, device, computer equipment, and storage medium

Info

Publication number
WO2020015139A1
WO2020015139A1 PCT/CN2018/106009 CN2018106009W WO2020015139A1 WO 2020015139 A1 WO2020015139 A1 WO 2020015139A1 CN 2018106009 W CN2018106009 W CN 2018106009W WO 2020015139 A1 WO2020015139 A1 WO 2020015139A1
Authority
WO
WIPO (PCT)
Prior art keywords
passenger
parameter
rule model
record
preset
Prior art date
Application number
PCT/CN2018/106009
Other languages
English (en)
French (fr)
Inventor
孙闳绅
金戈
徐亮
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020015139A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G06Q50/265 Personal security, identity or safety

Definitions

  • the present application relates to a risk passenger identification method, device, computer equipment, and storage medium.
  • a risk passenger identification method is provided.
  • a risk passenger identification method includes:
  • a risk passenger identification device includes:
  • a first receiving module configured to receive inputted identity information of a passenger to be identified, and query a first historical clearance record corresponding to the identity information;
  • a clearance frequency calculation module configured to calculate a clearance frequency of the passenger to be identified according to the first historical clearance record;
  • a preset rule model acquisition module configured to determine, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if yes, obtain a preset rule model;
  • a first statistical parameter calculation module configured to calculate a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, where the first statistical parameter is a statistic of the time interval used for a preset number of consecutive clearances within a preset time;
  • an output module configured to determine whether the first statistical parameter exceeds a threshold range in the preset rule model, and if yes, output the passenger to be identified as a risk passenger.
  • a computer device includes a memory and one or more processors.
  • the memory stores computer-readable instructions.
  • when the computer-readable instructions are executed by the one or more processors, the one or more processors perform the following steps: receiving the inputted identity information of the passenger to be identified, and querying the first historical clearance record corresponding to the identity information; calculating the clearance frequency of the passenger to be identified based on the first historical clearance record; determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtaining a preset rule model; calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record,
  • where the first statistical parameter is a statistic of the time interval used for a preset number of consecutive clearances within a preset time; and determining whether the first statistical parameter exceeds a threshold range in the preset rule model, and if yes, outputting the passenger to be identified as a risk passenger.
  • one or more non-volatile computer-readable storage media store computer-readable instructions.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps: receiving the inputted identity information of the passenger to be identified, and querying the first historical clearance record corresponding to the identity information; calculating the clearance frequency of the passenger to be identified according to the first historical clearance record; determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if yes, obtaining a preset rule model; calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, where the first statistical parameter is a statistic of the time interval used for a preset number of consecutive clearances within a preset time; and determining whether the first statistical parameter exceeds a threshold range in the preset rule model, and if so, outputting the passenger to be identified as a risk passenger.
  • FIG. 1 is an application scenario diagram of a risk passenger identification method according to one or more embodiments.
  • FIG. 2 is a schematic flowchart of a risk passenger identification method according to one or more embodiments.
  • FIG. 3 is a block diagram of a risk passenger identification device according to one or more embodiments.
  • FIG. 4 is a block diagram of a computer device in accordance with one or more embodiments.
  • the risk passenger identification method provided in this application can be applied to the application environment shown in FIG. 1.
  • the terminal 102 and the server 104 communicate through a network.
  • the terminal 102 may obtain an initial rule model from the server 104, and generate a preset rule model after parameter configuration.
  • the terminal 102 can be put into use after obtaining the preset rule model.
  • the terminal 102 can be placed in a public security place such as a customs house. The security personnel enter the identity information of the passenger to be identified, so that the terminal can query the first historical clearance record corresponding to the identity information.
  • the terminal calculates the clearance frequency of the passenger to be identified based on the first historical clearance record, and determines from the clearance frequency whether the passenger to be identified is a high-frequency clearance passenger. If so, it obtains a preset rule model and calculates the first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record.
  • the first statistical parameter is a statistic of the time interval used for a preset number of consecutive clearances within a preset time. The terminal determines whether the first statistical parameter exceeds the threshold range in the preset rule model, and if so, outputs the passenger to be identified as a risk passenger.
  • the terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server 104 may be implemented by an independent server or a server cluster composed of multiple servers.
  • a risk passenger identification method is provided.
  • the method is applied to the terminal in FIG. 1 as an example for description, and includes the following steps:
  • S202 Receive the inputted identity information of the passenger to be identified, and query the first historical clearance record corresponding to the identity information.
  • the inputted identity information of the passenger to be identified may be any information that uniquely identifies the passenger, such as an ID number or a mobile phone number.
  • for example, the passenger to be identified hands an ID card to the customs security personnel, who place it on an ID card reading device so that the terminal can read the identity information through the device.
  • the first historical clearance record may be the clearance record for a preset time period corresponding to the identity information of the passenger to be identified, such as the clearance records within one year.
  • the time period can be set according to the preset time in the preset rule model. For example, when the preset time in the preset rule model is 6 months, the clearance records within one year can be obtained.
  • the first historical clearance record may be stored in a server in the background, or stored centrally in a cloud platform, so that it can be easily obtained.
  • S204 Calculate the clearance frequency of the passenger to be identified according to the first historical clearance record.
  • the clearance frequency of the passenger to be identified can be calculated from the preset time period and the number of first historical clearance records; for example, the clearance frequency of the passenger to be identified = the number of first historical clearance records / the preset time period.
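  • the frequency calculation in S204 can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the function name `clearance_frequency`, the representation of records as a list of timestamps, and the 30-day month used as the frequency unit are all hypothetical.

```python
from datetime import datetime, timedelta

def clearance_frequency(clearance_times, now, period_days=365, unit_days=30):
    """Return clearances per unit (default: per 30-day month) over the trailing period."""
    window_start = now - timedelta(days=period_days)
    # Keep only the records that fall inside the preset time period.
    in_window = [t for t in clearance_times if window_start <= t <= now]
    # Frequency = number of records / length of the period (expressed in units).
    return len(in_window) / (period_days / unit_days)

now = datetime(2018, 9, 1)
# Hypothetical passenger: one clearance every 3 days over the last 90 days.
times = [now - timedelta(days=3 * i) for i in range(30)]
freq = clearance_frequency(times, now, period_days=90)
```

  • expressing the frequency per month keeps it directly comparable with a preset value that is itself stated per month.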
  • S206 Determine whether the passenger to be identified is a high-frequency clearance passenger according to the clearance frequency, and if yes, obtain a preset rule model.
  • a high-frequency clearance passenger is a passenger whose number of clearances within a preset time period exceeds a preset value; the preset value may be set based on experience, for example, 5 times per month. A passenger determined to be a high-frequency clearance passenger is more likely to be a risk passenger, so a further judgment is needed. The terminal therefore obtains the preset rule model and uses it to make a further judgment on the high-frequency clearance passenger.
  • the preset rule model is generated after the terminal configures parameters according to the initial rule model obtained from the server.
  • the initial rule model can be: within [K] time, the statistic [Z] of the time interval [t] used for [n] consecutive clearances is compared with the threshold range [Tinf, Tsup], where:
  • K represents the observation time window, which is pre-configured, such as 30 days, 60 days, or 90 days;
  • n represents the number of consecutive clearances, which is pre-configured, such as 5, 10, 20, ...;
  • t is the time interval, calculated from the clearance records, used each time n consecutive clearances are made;
  • Z represents the statistical method applied to the t values generated from all clearance records of the person, which is pre-configured, such as minimum, maximum, mean, or variance;
  • [Tinf, Tsup] represents the threshold range, which is pre-configured, such as [0,5].
  • S208 Calculate a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, where the first statistical parameter is a statistic of the time interval used for a preset number of consecutive clearances within a preset time.
  • the first statistical parameter of the passenger to be identified is calculated according to the first historical clearance record.
  • K in the preset rule model is configured as a preset time
  • n is configured as a preset number of times
  • Z is configured as a statistical quantity.
  • the terminal first takes the current time as the starting point and moves back by the preset time to obtain the observation time window. Starting from the current clearance record and moving back in time, it obtains the first time interval used for the preset number of consecutive clearances. It then starts from the clearance record preceding the current one and obtains the second time interval used for the next preset number of consecutive clearances, and so on until the earliest clearance record in the observation time window is reached, which yields several time intervals. Finally, it calculates a statistic of the obtained time intervals, such as the mean, as the first statistical parameter.
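  • the walk back through the observation time window can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names, the use of days as the unit, and the reading of "time interval used for n consecutive clearances" as the time elapsed between the first and last record of each run are all assumptions.

```python
from datetime import datetime, timedelta
from statistics import mean, pvariance

# Statistic types named in the text; the mapping itself is an illustrative assumption.
STATISTICS = {"min": min, "max": max, "mean": mean, "variance": pvariance}

def consecutive_intervals(clearance_times, now, window_days, n):
    """Days spanned by each run of n consecutive clearances inside the window."""
    start = now - timedelta(days=window_days)
    # Newest first, mirroring the walk back from the current clearance record.
    recent = sorted((t for t in clearance_times if start <= t <= now), reverse=True)
    intervals = []
    for i in range(len(recent) - n + 1):
        # Assumption: the interval is the time between the 1st and n-th record of the run.
        span = recent[i] - recent[i + n - 1]
        intervals.append(span.total_seconds() / 86400)
    return intervals

def first_statistical_parameter(clearance_times, now, window_days, n, statistic="mean"):
    """Apply the configured statistic Z to the intervals; None if there are too few records."""
    intervals = consecutive_intervals(clearance_times, now, window_days, n)
    if not intervals:
        return None
    return STATISTICS[statistic](intervals)

now = datetime(2018, 9, 1)
# Hypothetical clearances 0, 1, 2, 4, 7 and 11 days ago.
times = [now - timedelta(days=d) for d in (0, 1, 2, 4, 7, 11)]
intervals = consecutive_intervals(times, now, window_days=30, n=3)
param = first_statistical_parameter(times, now, window_days=30, n=3)
```

  • with these hypothetical records and n = 3, the runs span 2, 3, 5, and 7 days, so the mean is 4.25 days.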
  • S210 Determine whether the first statistical parameter exceeds a threshold range in the preset rule model, and if yes, output the passenger to be identified as a risk passenger.
  • the threshold range, that is, the value of [Tinf, Tsup] above, is configured after the terminal obtains the initial rule model from the server. The terminal compares the obtained first statistical parameter with the threshold range; if the first statistical parameter is not within the threshold range, the passenger to be identified corresponding to that parameter is a risk passenger.
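  • the comparison in S210 can be sketched as follows. The `RuleModel` class, its field names, and the choice to treat both bounds as inclusive are assumptions made for illustration; the text only states that a value outside [Tinf, Tsup] marks a risk passenger.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RuleModel:
    window_days: int   # K: observation time window
    n: int             # n: number of consecutive clearances
    statistic: str     # Z: statistic type, e.g. "mean"
    t_inf: float       # Tinf: lower bound of the threshold range
    t_sup: float       # Tsup: upper bound of the threshold range

def is_risk_passenger(first_statistical_parameter, model):
    """True when the statistic falls outside [Tinf, Tsup]."""
    return not (model.t_inf <= first_statistical_parameter <= model.t_sup)

# Hypothetical configuration using the example range [0,5] from the text.
model = RuleModel(window_days=30, n=5, statistic="mean", t_inf=0, t_sup=5)
flagged = is_risk_passenger(7.5, model)
```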
  • the above-mentioned risk passenger identification method first determines from the first historical clearance record whether the passenger to be identified is a high-frequency clearance passenger, and only if so continues to determine through the preset rule model whether the passenger is a risk passenger. This avoids identifying ordinary high-frequency clearance passengers as risk passengers and improves the accuracy of identification.
  • the above-mentioned risk passenger identification method may further include: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; receiving input configuration parameters corresponding to the initial rule model; obtaining verification rules corresponding to the configuration parameters, and verifying the configuration parameters through the obtained verification rules; and, when the configuration parameters are successfully verified, generating a preset rule model according to the configuration parameters and the initial rule model.
  • the terminal can obtain an initial rule model from the server. The initial rule model is the one described above: within [K] time, the statistic [Z] of the time interval [t] used for [n] consecutive clearances is compared with [Tinf, Tsup].
  • the terminal can display the initial rule model so that users can configure its parameters.
  • for example, users can configure K, n, Z, Tinf, and Tsup above.
  • after the configuration parameters are received, they can be verified.
  • the format can be verified first, and then the parameters can be checked against the verification rules.
  • for example, if the selected statistic Z is the mean, then the input parameter n must be greater than or equal to Tsup, because a person usually clears customs only once a day.
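  • the two-stage verification (format first, then verification rules) can be sketched as follows. The function name `validate_config`, the dictionary layout, and the specific format checks are hypothetical; only the rule that n must be greater than or equal to Tsup when Z is the mean comes from the text.

```python
ALLOWED_STATISTICS = {"min", "max", "mean", "variance"}

def validate_config(config):
    """Return a list of error messages; an empty list means verification succeeded."""
    errors = []
    # Step 1: format verification (types and basic ranges, assumed for illustration).
    for key in ("K", "n"):
        value = config.get(key)
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            errors.append(f"{key} must be a positive integer")
    if config.get("Z") not in ALLOWED_STATISTICS:
        errors.append("Z must be one of " + ", ".join(sorted(ALLOWED_STATISTICS)))
    t_inf, t_sup = config.get("Tinf"), config.get("Tsup")
    if not all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in (t_inf, t_sup)):
        errors.append("Tinf and Tsup must be numbers")
    elif t_inf > t_sup:
        errors.append("Tinf must not exceed Tsup")
    if errors:
        return errors  # skip the rule checks when the format is already invalid
    # Step 2: verification rule from the text: with the mean statistic, n >= Tsup.
    if config["Z"] == "mean" and config["n"] < config["Tsup"]:
        errors.append("n must be greater than or equal to Tsup when Z is the mean")
    return errors

ok = validate_config({"K": 30, "n": 5, "Z": "mean", "Tinf": 0, "Tsup": 5})
bad = validate_config({"K": 30, "n": 3, "Z": "mean", "Tinf": 0, "Tsup": 5})
```

  • returning a list of messages rather than raising on the first failure lets the terminal show the user every problem in one pass.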
  • the terminal may perform parameter configuration on the initial rule model, so that the initial rule model is more personalized.
  • the above-mentioned risk passenger identification method may further include: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; obtaining the current region identifier, and downloading from the server the second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels; obtaining the third historical clearance record corresponding to the high-frequency clearance passengers in the second historical clearance record; calculating, according to the initial rule model, different types of second statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers carrying the risk passenger label in the third historical clearance record, and different types of third statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers carrying the ordinary passenger label; selecting the time period K, the number of consecutive clearances n, and the type of statistical parameter according to the second and third statistical parameters of corresponding types; and generating the preset rule model according to the selected type of statistical parameter, the time period K, and the number of consecutive clearances n.
  • the terminal can obtain an initial rule model from the server. The initial rule model is, as above: within [K] time, the statistic [Z] of the time interval [t] used for [n] consecutive clearances is compared with [Tinf, Tsup]. K, n, Tinf, Tsup, and the type of Z need to be configured to obtain the corresponding preset rule model.
  • in the embodiment described earlier, the configuration was performed by the user.
  • here, the terminal can generate default setting parameters from the second historical clearance record corresponding to the region identifier, for example, the second historical clearance record of Shanghai, so as to generate the corresponding preset rule model.
  • the terminal first selects the third historical clearance record corresponding to the high-frequency clearance passengers from the second historical clearance record.
  • the terminal may separately calculate the clearance frequency of each passenger in a preset time period and compare the calculated clearance frequency with a preset value to determine whether the passenger in the second historical clearance record is a high-frequency clearance passenger; if so, the third historical clearance record corresponding to that passenger is obtained.
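  • the selection of high-frequency clearance passengers from a batch of records can be sketched as follows. The record layout (`passenger_id`, `label`), the function name, and the assumption that the input has already been restricted to the preset time period are all hypothetical.

```python
from collections import defaultdict

def select_high_frequency(records, period_months, preset_value):
    """Keep the records of passengers whose clearances per month exceed preset_value."""
    by_passenger = defaultdict(list)
    for record in records:
        by_passenger[record["passenger_id"]].append(record)
    selected = []
    for passenger_records in by_passenger.values():
        # Clearance frequency = number of records / length of the period.
        frequency = len(passenger_records) / period_months
        if frequency > preset_value:
            selected.extend(passenger_records)
    return selected

# Hypothetical second historical clearance record covering 6 months.
second_records = (
    [{"passenger_id": "A", "label": "risk"}] * 40
    + [{"passenger_id": "B", "label": "ordinary"}] * 3
)
third_records = select_high_frequency(second_records, period_months=6, preset_value=5)
```

  • passenger A clears about 6.7 times per month and is kept; passenger B clears 0.5 times per month and is dropped.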
  • the second historical clearance record and the third historical clearance record are historical clearance records, so they carry corresponding risk passenger labels and ordinary passenger labels, which the customs security inspection staff marked in the past according to the results of random inspections.
  • the terminal can divide the third historical clearance record into two groups according to the risk passenger label and the ordinary passenger label, and calculate for each group the different types of statistics of the time interval t used for n consecutive clearances within the time period K, that is, the second statistical parameter and the third statistical parameter.
  • K, n, and the type of the statistic Z can be selected within preset ranges.
  • the K, n, and statistic type chosen on the basis of the second and third statistical parameters are used as the parameters of the preset rule model, and the threshold range Tinf and Tsup is calculated from the second and third statistical parameters.
  • the preset rule model is generated from K, n, Tinf, Tsup, and the type of Z.
  • the terminal may select a default configuration of some parameters and manually configure another parameter.
  • the terminal may obtain the second historical clearance record according to the region identifier, select the third historical clearance record corresponding to the high-frequency clearance passengers in the second historical clearance record, and generate a default parameter configuration from the third historical clearance record. The default parameter configuration is thus tied to the region, which improves the accuracy of the parameter configuration and makes the preset rule model more applicable and more accurate.
  • the method for generating the preset rule model may include: obtaining a fourth historical clearance record, the fourth clearance record carrying passenger tags; selecting from the fourth clearance record a fifth clearance record corresponding to the high-frequency clearance passengers; extracting initial feature parameters from the fifth clearance record together with the passenger tags corresponding to the fifth clearance record, and performing feature gain evaluation on the initial feature parameters; selecting the target feature parameter from the initial feature parameters according to the result of the feature gain evaluation; when the extracted target feature parameter is the first statistical parameter, that is, a statistic of the time interval used for a preset number of consecutive clearances within a preset time, setting the threshold range corresponding to the statistic according to the passenger tags; and generating the preset rule model according to the first statistical parameter and the threshold range.
  • extracting the initial feature parameters from the fifth clearance record may include: obtaining the current business rules and querying the filtering feature parameters corresponding to the current business rules; and calculating the filtering feature parameters corresponding to the fifth clearance record as the initial feature parameters.
  • the fourth clearance record includes clearance records that were not selected for inspection and clearance records that were inspected, where the inspected clearance records include clearance records corresponding to risk users and clearance records corresponding to ordinary users.
  • the fifth clearance record corresponding to the high-frequency clearance passengers is selected from the fourth clearance record.
  • for the selection method, reference may be made to the method of selecting the third historical clearance record from the second historical clearance record above; details are not repeated here.
  • the initial feature parameters and the corresponding passenger tags, such as risk user tags or ordinary user tags, are extracted from the fifth clearance record, and feature gain evaluation is then performed on the initial feature parameters.
  • the gain evaluation may be performed through a decision tree, as described below. The target feature parameter is selected from the initial feature parameters according to the result of the feature gain evaluation.
  • for example, the field that best differentiates risk users from ordinary users may be selected.
  • for example, each field may be used in turn to distinguish risk users from ordinary high-frequency users, and the field whose discrimination result is most often correct is obtained.
  • when the extracted target feature parameter is the statistic of the time interval used for a preset number of consecutive clearances within a preset time, the threshold range corresponding to the statistic is set according to the passenger tags, and the preset rule model is generated according to that statistic and the threshold range.
  • a decision tree is a tree structure composed of nodes and directed edges that is used to classify instances.
  • there are two types of nodes: internal nodes and leaf nodes. Internal nodes represent test conditions on features or attributes, and leaf nodes represent classes.
  • classification with the decision tree model proceeds as follows: starting from the root node, a feature of the instance is tested, and the instance is assigned to a child node according to the test result. Following that branch leads either to a leaf node or to another internal node; at an internal node the new test condition is applied, and this continues recursively until a leaf node is reached, which gives the final classification result.
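  • the traversal described above can be sketched as follows. The `Node` class, the feature name `clearances_last_30_days`, and the threshold of 20 are invented for illustration and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

@dataclass
class Node:
    label: Optional[str] = None                    # set on leaf nodes only
    test: Optional[Callable[[dict], str]] = None   # internal node: maps an instance to a branch key
    children: Dict[str, "Node"] = field(default_factory=dict)

def classify(node, instance):
    """Descend from the root, applying one test per internal node, until a leaf is reached."""
    if node.label is not None:
        return node.label
    branch = node.test(instance)
    return classify(node.children[branch], instance)

# Hypothetical one-level tree with a single test condition.
tree = Node(
    test=lambda x: "high" if x["clearances_last_30_days"] >= 20 else "low",
    children={
        "high": Node(label="risk passenger"),
        "low": Node(label="ordinary passenger"),
    },
)
result = classify(tree, {"clearances_last_30_days": 25})
```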
  • the terminal obtains the fields in the fifth clearance record. These fields are generally few, including only the name, age, ID number, clearance time, and the like, so the record contains few field features. Therefore, before the features are acquired,
  • the fifth clearance record is first expanded to generate new features. For example, the current business rules are obtained, such as clearance information feature rules, frequency dynamic feature rules, and frequency static feature rules; the filtering feature parameters corresponding to the current business rules are queried; and the filtering feature parameters corresponding to the fifth clearance record are calculated as the initial feature parameters. For example, new feature fields are generated from the business rules and the fields in the clearance record: "15 days before, the last pass time", "30 days before, the number of gates within 30 days", etc. according to the clearance information feature rules; "the number of customs clearance within 15 days, within 30 days", "30 days before, 7 days Customs clearance days", and so on according to the frequency static feature rules; and "15 days before, 90 days based on frequency
  • the method of training the model includes: collecting sample data, and dividing the sample data into training set data and test set data; extracting the first feature parameter and the first target category from the training set data; performing feature information gain evaluation on the first feature parameter, and obtaining the field with the highest degree of discrimination according to the evaluation result, namely the first statistical parameter, which is a statistic of the time interval used for a preset number of consecutive clearances within a preset time;
  • performing data distribution analysis, that is, statistical analysis, on that field, setting a threshold range according to the type of statistic, and generating a preset rule model according to the set threshold range, the selected statistic, and the like; extracting the second feature parameter and the second target category from the test set data; verifying the initial decision tree evaluation model based on the second feature parameter and the second target category; and optimizing and adjusting the decision tree structure in the initial decision tree evaluation model based on the verification result to generate the final risk assessment model.
  • the decision tree model uses the ID3 algorithm. Based on the principle that a smaller decision tree is preferable to a larger one, features are evaluated and selected by information gain, and each time the feature with the largest information gain is selected as the judgment criterion to establish a child node.
  • the information gain indicates the degree to which the uncertainty of the information of class Y is reduced by knowing the information of feature X.
  • the information gain g(D, A) of feature A on the training data set D is defined as the difference between the empirical entropy H(D) of set D and the empirical conditional entropy H(D|A) of D given feature A: $g(D, A) = H(D) - H(D \mid A)$, where
  • g(D, A) is the information gain of feature A on training data set D,
  • H(D) is the empirical entropy of training data set D, and
  • H(D|A) is the empirical conditional entropy of feature A on data set D.
  • the feature selection method is to calculate the information gain of each feature of the training data set (or subset) and select the feature with the largest information gain.
  • the algorithm for calculating the information gain is as follows: its input is the training data set D and feature A, and its output is the information gain g(D, A) of feature A on training data set D.
  • the empirical entropy of D is $H(D) = -\sum_{k=1}^{K} \frac{C_k}{|D|} \log_2 \frac{C_k}{|D|}$, where C_k is the number of samples belonging to the k-th first target category, |D| is the total number of samples, and
  • K is the number of categories of the first target category.
  • for example, the first target category is divided into risk passengers and ordinary passengers.
  • the empirical conditional entropy is $H(D \mid A) = \sum_{i \in value(A)} \frac{|D_i|}{|D|} H(D_i)$, where value(A) is the set of all values of feature A, i is one value of feature A, and D_i is the subset of samples in training data set D for which feature A takes the value i.
  • for example, for the gender feature parameter, the values of feature A are male and female, which can be represented by 0 and 1, so value(A) is (0,1).
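  • the information gain calculation can be sketched as follows, using the standard ID3 definitions: H(D) is minus the sum over categories of p_k log2 p_k, with p_k the share of samples in category k, and H(D|A) is the sum over values i of feature A of |D_i|/|D| times H(D_i). The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Empirical entropy H(D) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """g(D, A) = H(D) - H(D|A) for the feature named `feature`."""
    total = len(labels)
    subsets = {}
    # Partition the labels by the value that the feature takes (the sets D_i).
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    conditional = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - conditional

# Toy data: "frequent" separates the classes perfectly, "gender" not at all.
rows = [
    {"gender": 0, "frequent": 1},
    {"gender": 1, "frequent": 1},
    {"gender": 0, "frequent": 0},
    {"gender": 1, "frequent": 0},
]
labels = ["risk", "risk", "ordinary", "ordinary"]
best = max(["gender", "frequent"], key=lambda f: information_gain(rows, labels, f))
```

  • on this toy data the gain of "frequent" is 1 bit and the gain of "gender" is 0, so "frequent" would be chosen as the split.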
  • the selected feature parameters can be adjusted, for example by adjusting the statistic, and the decision tree model is rebuilt and verified again until the verification result is within the error range.
  • obtaining the most discriminating feature (field) according to the feature information evaluation result may include: calculating the information gain of each feature parameter in the first feature parameter; selecting the feature with the largest information gain as the judgment criterion to establish a child node;
  • dividing the training set data into subsets accordingly, and branching on the subset data recursively until the data at every branch node corresponds to a single target category.
  • in this way, a decision tree is established recursively.
  • Hunt's algorithm is defined recursively as follows, where Dt is the set of training records associated with node t: if all records in Dt belong to the same class yt, then t is a leaf node labeled yt. If Dt contains records belonging to more than one class, an attribute test condition is selected to divide the records into smaller subsets. A child node is created for each outcome of the test condition, and the records in Dt are distributed to the child nodes according to the test results. The algorithm is then called recursively on each child node.
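  • the recursion can be sketched as follows. The nested-dictionary tree layout, the `choose_feature` callback, and the majority-class fallback when no feature remains are assumptions for illustration; in ID3 the callback would return the feature with the largest information gain, whereas the placeholder used here simply takes the first remaining feature.

```python
from collections import Counter

def build_tree(rows, labels, features, choose_feature):
    """Hunt's recursion: leaf when pure, otherwise split and recurse on each subset."""
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}
    if not features:
        # Assumption: fall back to the majority class when no test condition remains.
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    feature = choose_feature(rows, labels, features)
    remaining = [f for f in features if f != feature]
    # Distribute the records of this node to one child per outcome of the test.
    partitions = {}
    for row, label in zip(rows, labels):
        part = partitions.setdefault(row[feature], ([], []))
        part[0].append(row)
        part[1].append(label)
    return {
        "feature": feature,
        "children": {
            value: build_tree(sub_rows, sub_labels, remaining, choose_feature)
            for value, (sub_rows, sub_labels) in partitions.items()
        },
    }

# Placeholder selector (not ID3): always split on the first remaining feature.
first_feature = lambda rows, labels, features: features[0]

rows = [
    {"frequent": 1, "gender": 0},
    {"frequent": 1, "gender": 1},
    {"frequent": 0, "gender": 0},
]
labels = ["risk", "risk", "ordinary"]
tree = build_tree(rows, labels, ["frequent", "gender"], first_feature)
```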
  • the server extracts the second feature parameter and the second target category one by one from each sample of the test set data.
  • the second feature parameter is of the same category as the first feature parameter.
  • the second feature parameter may be a feature with the largest information gain selected, and details are not described herein again.
  • the second target category is the category of the safety inspection results.
  • the second target category is divided into two categories: risk passengers and ordinary passengers.
  • the server counts, in the test data set, the negative sample data that match the feature parameter combination corresponding to each classification node in the preset rule model, that is, the statistic of the time interval used for a preset number of consecutive clearances within the preset time,
  • calculates the proportion of the counted negative sample data in the total negative sample data of the test data set, and verifies the decision tree model based on the calculated proportion.
  • the server may set a preset tolerance error. When the calculated absolute difference is less than the preset tolerance error, the verification passes; when it is greater than the preset tolerance error, the verification fails. When the verification fails, the server can add the sample data in the test data set to the training data set, enlarging the sample size used to train the preset rule model, and adjust the preset rule model.
  • the features are generated according to the business rules, so that the features are diversified and the degree to which each field differentiates risk users from ordinary users can be analyzed more accurately.
  • although the steps in the flowchart of FIG. 2 are displayed sequentially as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated in this document, the execution order of these steps is not strictly limited, and they can be performed in other orders. Moreover, at least some of the steps in FIG. 2 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time and may be performed at different times; nor are they necessarily performed sequentially, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
  • a risk passenger identification device including: a first receiving module 100, a crossing frequency calculation module 200, a preset rule model acquisition module 300, and a first statistical parameter calculation module 400. And output module 500, where:
  • the first receiving module 100 is configured to receive inputted identity information of a passenger to be identified, and query a first historical clearance record corresponding to the identity information.
  • the clearance frequency calculation module 200 is configured to calculate the clearance frequency of the passenger to be identified according to the first historical clearance record.
  • the preset rule model acquisition module 300 is configured to determine, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtain a preset rule model.
  • the first statistical parameter calculation module 400 is configured to calculate a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record.
  • the first statistical parameter is a statistic of the time interval used for a preset number of consecutive clearances within a preset time.
  • the output module 500 is configured to determine whether the first statistical parameter exceeds a threshold range in a preset rule model, and if yes, output the passenger to be identified as a risk passenger.
  • the risk passenger identification device may further include:
  • the second receiving module is configured to send an initial rule model acquisition request to the server, and receive the initial rule model returned by the server.
  • the third receiving module is configured to receive input configuration parameters corresponding to the initial rule model.
  • the verification module is configured to obtain a verification rule corresponding to the configuration parameter, and verify the configuration parameter by using the obtained verification rule.
  • a first generating module configured to generate a preset rule model according to the configuration parameter and the initial rule model when the configuration parameter verification is successful.
  • the risk passenger identification device may further include:
  • a fourth receiving module is configured to send an initial rule model acquisition request to the server, and receive the initial rule model returned by the server.
  • the clearance record acquisition module is configured to obtain the current region identifier, download from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels, and obtain a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record.
  • the second statistical parameter calculation module is configured to calculate, according to the initial rule model, different types of second statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the risk passenger label in the third historical clearance record, and different types of third statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the ordinary passenger label.
  • the first selection module is configured to select the values of the time period K and the number of consecutive clearances n, and the type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding type.
  • a second generating module is configured to generate a preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.
  • the risk passenger identification device may further include:
  • a fourth historical clearance record acquisition module is configured to obtain a fourth historical clearance record, the fourth clearance record carrying passenger labels.
  • the second selection module is configured to select, from the fourth clearance record, a fifth clearance record corresponding to high-frequency passengers.
  • a feature gain evaluation module is configured to extract initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and to perform feature gain evaluation on the initial feature parameters.
  • a third selection module is configured to select a target feature parameter from the first feature parameters according to the result of the feature gain evaluation.
  • a setting module is configured to set a threshold range corresponding to the statistic according to the passenger labels when the extracted target feature parameter is the statistic of the time interval used for a preset number of consecutive clearances within a preset time.
  • a third generation module is configured to generate a preset rule model according to the threshold range and the statistic of the time interval used for a preset number of consecutive clearances within a preset time.
  • the feature gain evaluation module may include:
  • the query unit is used to obtain the current business rule and query the filtering characteristic parameters corresponding to the current business rule.
  • an initial feature parameter calculation unit is configured to calculate the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.
  • Each module in the above-mentioned risk passenger identification device may be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above-mentioned modules may be embedded in hardware form in, or be independent of, the processor of the computer device, or may be stored in software form in the memory of the computer device, so that the processor can call them and perform the operations corresponding to each module.
  • a computer device is provided.
  • the computer device may be a terminal, and its internal structure diagram may be as shown in FIG. 4.
  • the computer equipment includes a processor, a memory, a network interface, a display screen, and an input device connected through a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and computer-readable instructions.
  • the internal memory provides an environment for operating the operating system and computer-readable instructions in a non-volatile storage medium.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by a processor to implement a risk passenger identification method.
  • the display screen of the computer equipment may be a liquid crystal display screen or an electronic ink display screen.
  • the input device of the computer equipment may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the casing of the computer equipment, or an external keyboard, touchpad, or mouse.
  • those skilled in the art can understand that the structure shown in FIG. 4 is only a block diagram of the part of the structure related to the solution of the present application, and does not limit the computer equipment to which the solution of the present application is applied.
  • the specific computer equipment may include more or fewer parts than shown in the figure, combine certain parts, or have a different arrangement of parts.
  • a computer device includes a memory and one or more processors.
  • Computer-readable instructions are stored in the memory.
  • when the computer-readable instructions are executed by the processor, the one or more processors are caused to perform the following steps: receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information; calculating a clearance frequency of the passenger to be identified according to the first historical clearance record; determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtaining a preset rule model; calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time; and determining whether the first statistical parameter exceeds a threshold range in the preset rule model, and if so, outputting the passenger to be identified as a risk passenger.
  • when the processor executes the computer-readable instructions, it further implements the following steps: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; receiving input configuration parameters corresponding to the initial rule model; obtaining a verification rule corresponding to the configuration parameters, and verifying the configuration parameters using the obtained verification rule; and when the configuration parameters are verified successfully, generating a preset rule model according to the configuration parameters and the initial rule model.
  • when the processor executes the computer-readable instructions, it further implements the following steps: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; obtaining a current region identifier, and downloading from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels; obtaining a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record; calculating, according to the initial rule model, different types of second statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the risk passenger label in the third historical clearance record, and different types of third statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the ordinary passenger label; selecting the values of the time period K and the number of consecutive clearances n, and the type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding type; and generating a preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.
  • the generation of the preset rule model involved when the processor executes the computer-readable instructions may include: obtaining a fourth historical clearance record, the fourth clearance record carrying passenger labels; selecting, from the fourth clearance record, a fifth clearance record corresponding to high-frequency passengers; extracting initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and performing feature gain evaluation on the initial feature parameters; selecting a target feature parameter from the first feature parameters according to the result of the feature gain evaluation; when the extracted target feature parameter is the statistic of the time interval used for a preset number of consecutive clearances within a preset time, setting a threshold range corresponding to the statistic according to the passenger labels; and generating a preset rule model according to that statistic and the threshold range.
  • extracting the initial feature parameters from the fifth clearance record, as implemented when the processor executes the computer-readable instructions, may include: obtaining a current business rule, and querying the screening feature parameters corresponding to the current business rule; and calculating the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.
  • One or more non-volatile computer-readable storage media storing computer-readable instructions.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors are caused to perform the following steps: receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information; calculating a clearance frequency of the passenger to be identified according to the first historical clearance record; determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtaining a preset rule model; calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time; and determining whether the first statistical parameter exceeds a threshold range in the preset rule model, and if so, outputting the passenger to be identified as a risk passenger.
  • when the computer-readable instructions are executed by the processor, the following steps are further implemented: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; receiving input configuration parameters corresponding to the initial rule model; obtaining a verification rule corresponding to the configuration parameters, and verifying the configuration parameters using the obtained verification rule; and when the configuration parameters are verified successfully, generating a preset rule model according to the configuration parameters and the initial rule model.
  • when the computer-readable instructions are executed by the processor, the following steps are further implemented: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; obtaining a current region identifier, and downloading from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels; obtaining a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record; calculating, according to the initial rule model, different types of second statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the risk passenger label in the third historical clearance record, and different types of third statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the ordinary passenger label; selecting the values of the time period K and the number of consecutive clearances n, and the type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding type; and generating a preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.
  • the generation of the preset rule model involved when the computer-readable instructions are executed by the processor may include: obtaining a fourth historical clearance record, the fourth clearance record carrying passenger labels; selecting, from the fourth clearance record, a fifth clearance record corresponding to high-frequency passengers; extracting initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and performing feature gain evaluation on the initial feature parameters; selecting a target feature parameter from the first feature parameters according to the result of the feature gain evaluation; when the extracted target feature parameter is the statistic of the time interval used for a preset number of consecutive clearances within a preset time, setting a threshold range corresponding to the statistic according to the passenger labels; and generating a preset rule model according to that statistic and the threshold range.
  • extracting the initial feature parameters from the fifth clearance record, as implemented when the computer-readable instructions are executed by the processor, may include: obtaining a current business rule, and querying the screening feature parameters corresponding to the current business rule; and calculating the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

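A risk passenger identification method, comprising: receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information (S202); calculating a clearance frequency of the passenger to be identified according to the first historical clearance record (S204); determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtaining a preset rule model (S206); calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time (S208); and determining whether the first statistical parameter exceeds a threshold range in the preset rule model, and if so, outputting the passenger to be identified as a risk passenger (S210).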

Description

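Risk passenger method, apparatus, computer device, and storage medium
Cross-reference to related applications
This application claims priority to Chinese patent application No. 2018107883333, entitled "Risk passenger identification method, apparatus, computer device, and storage medium" and filed with the Chinese Patent Office on July 18, 2018, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to a risk passenger identification method, apparatus, computer device, and storage medium.
Background
Airports, ports, and other entry-exit locations clear large numbers of passengers every day, among them some lawbreakers engaged in smuggling, illegal border crossing, and the like.
When security inspectors at entry-exit locations check passengers, they usually rely on their own work experience and on observing a passenger's speech and demeanor to judge whether the passenger poses a security risk. However, because the daily flow of people is very large, manual inspection alone can screen out only a limited number of the passengers who pose a security risk. As a result, the accuracy of security checks at entry-exit locations is low, and many lawbreakers slip through.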
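Summary
According to various embodiments disclosed in this application, a risk passenger identification method, apparatus, computer device, and storage medium are provided.
A risk passenger identification method includes:
receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information;
calculating a clearance frequency of the passenger to be identified according to the first historical clearance record;
determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtaining a preset rule model;
calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time; and
determining whether the first statistical parameter exceeds a threshold range in the preset rule model, and if so, outputting the passenger to be identified as a risk passenger.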
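A risk passenger identification apparatus includes:
a first receiving module, configured to receive input identity information of a passenger to be identified, and query a first historical clearance record corresponding to the identity information;
a clearance frequency calculation module, configured to calculate a clearance frequency of the passenger to be identified according to the first historical clearance record;
a preset rule model acquisition module, configured to determine, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtain a preset rule model;
a first statistical parameter calculation module, configured to calculate a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time; and
an output module, configured to determine whether the first statistical parameter exceeds a threshold range in the preset rule model, and if so, output the passenger to be identified as a risk passenger.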
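A computer device includes a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the processors, cause the one or more processors to perform the following steps: receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information; calculating a clearance frequency of the passenger to be identified according to the first historical clearance record; determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtaining a preset rule model; calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time; and determining whether the first statistical parameter exceeds a threshold range in the preset rule model, and if so, outputting the passenger to be identified as a risk passenger.
One or more non-volatile computer-readable storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps: receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information; calculating a clearance frequency of the passenger to be identified according to the first historical clearance record; determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtaining a preset rule model; calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time; and determining whether the first statistical parameter exceeds a threshold range in the preset rule model, and if so, outputting the passenger to be identified as a risk passenger.
Details of one or more embodiments of this application are set out in the drawings and description below. Other features and advantages of this application will become apparent from the specification, the drawings, and the claims.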
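Brief description of the drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed for the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is an application scenario diagram of a risk passenger identification method according to one or more embodiments.
FIG. 2 is a schematic flowchart of a risk passenger identification method according to one or more embodiments.
FIG. 3 is a block diagram of a risk passenger identification apparatus according to one or more embodiments.
FIG. 4 is a block diagram of a computer device according to one or more embodiments.
Detailed description
To make the technical solutions and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain this application and do not limit it.
The risk passenger identification method provided in this application can be applied in the application environment shown in FIG. 1. The terminal 102 communicates with the server 104 over a network. The terminal 102 can obtain an initial rule model from the server 104 and generate a preset rule model after parameter configuration. Once it has the preset rule model, the terminal 102 can be put into use. For example, the terminal 102 may be placed at a public security checkpoint such as customs, where a security inspector enters the identity information of a passenger to be identified. The terminal can then query the first historical clearance record corresponding to that identity information and calculate the clearance frequency of the passenger to be identified from the first historical clearance record. It determines from the clearance frequency whether the passenger to be identified is a high-frequency clearance passenger and, if so, obtains the preset rule model. It calculates the first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time. The terminal determines whether the first statistical parameter exceeds the threshold range in the preset rule model and, if so, outputs the passenger to be identified as a risk passenger. This completes the identification of risk passengers, avoids identifying ordinary high-frequency clearance passengers as risk passengers, and improves identification accuracy. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or portable wearable device, and the server 104 may be implemented as a standalone server or as a server cluster composed of multiple servers.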
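In one embodiment, as shown in FIG. 2, a risk passenger identification method is provided. Taking its application to the terminal in FIG. 1 as an example, the method includes the following steps:
S202: Receive input identity information of a passenger to be identified, and query a first historical clearance record corresponding to the identity information.
Specifically, the input identity information may be the passenger's ID card number, mobile phone number, or other information that uniquely identifies the passenger to be identified. Typically, during a customs security check, the passenger hands an ID card or similar document to the customs inspector, who places it on an ID card reader, so that the terminal can read the passenger's identity information through the reader.
The first historical clearance record may be the clearance records for a preset time period that correspond to the identity information of the passenger to be identified, for example the clearance records within the past year. Optionally, the period can be set according to the preset time in the preset rule model; for example, when the preset time in the preset rule model is 6 months, the clearance records within the past year can be obtained. The first historical clearance record may be stored on a back-end server or stored centrally on a cloud platform, so that it can be obtained conveniently.
S204: Calculate the clearance frequency of the passenger to be identified according to the first historical clearance record.
Specifically, after the first historical clearance record for the preset time period is obtained, the clearance frequency of the passenger to be identified can be calculated from the preset time period and the number of entries in the first historical clearance record. For example, clearance frequency = number of entries in the first historical clearance record / preset time period.
S206: Determine, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtain a preset rule model.
Specifically, a high-frequency clearance passenger is a passenger whose number of clearances within the preset time period is greater than a preset value. The preset value may be set empirically, for example 5 times per month. If the clearance frequency shows that the passenger to be identified is a high-frequency clearance passenger, the passenger is more likely to be a risk passenger, so a further judgment through the preset rule model is needed. The terminal therefore obtains the preset rule model for further judging high-frequency clearance passengers.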
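Steps S204 and S206 can be sketched in Python as follows. This is a minimal illustration rather than the applicant's implementation: the function names, the use of days as the time unit, and the default cutoff of 5 clearances per 30 days (taken from the example value in the text) are assumptions.

```python
from datetime import datetime, timedelta


def clearance_frequency(clearance_times, window_end, window_days):
    """Clearances per day: number of records in the period / length of the period."""
    window_start = window_end - timedelta(days=window_days)
    count = sum(1 for t in clearance_times if window_start <= t <= window_end)
    return count / window_days


def is_high_frequency(frequency, min_per_month=5, days_per_month=30):
    """Hypothetical cutoff: more than `min_per_month` clearances per 30 days."""
    return frequency > min_per_month / days_per_month


# Usage: 20 clearances spread over the last 60 days.
now = datetime(2018, 7, 1)
times = [now - timedelta(days=3 * i) for i in range(20)]
freq = clearance_frequency(times, now, window_days=60)
print(is_high_frequency(freq))
```

Only passengers for whom `is_high_frequency` is true would go on to the rule-model check in S208 and S210.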
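The preset rule model is generated by the terminal by configuring the parameters of the initial rule model obtained from the server. The preset rule model can divide high-frequency clearance passengers into risk passengers and ordinary high-frequency clearance passengers. The initial rule model may be:
The statistic [Z] of the time interval [t] used for [n] consecutive clearances within time [K] ∈ [Tinf, Tsup]
The parameters have the following meanings: K is the observation time window and is preconfigured, for example 30 days, 60 days, or 90 days; n is the number of consecutive clearances and is preconfigured, for example 5, 10, 20, …; t is calculated from the clearance records and is, for each clearance, the time used by the most recent n consecutive clearances; Z is the statistical method applied to the t values produced by all of a person's clearance records and is preconfigured, for example minimum, maximum, mean, or variance; [Tinf, Tsup] is the threshold range and is preconfigured, for example [0, 5].
S208: Calculate the first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time.
Specifically, the first statistical parameter of the passenger to be identified is calculated from the first historical clearance record, with K in the preset rule model configured as the preset time, n as the preset number of times, and Z as the statistic.
In practice, the terminal first takes the current time as the starting point and moves back in time by the preset time to obtain the observation time window. Then, starting from the current clearance record and moving back in time, it obtains the first time interval for the preset number of consecutive clearances. Next, starting from the clearance record preceding the current one and moving back in time, it obtains the second time interval for the next preset number of consecutive clearances, and so on until it reaches the earliest clearance record in the observation time window, yielding several time intervals. Finally, it calculates the statistic of these time intervals, for example the mean, as the first statistical parameter.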
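The sliding computation described for S208 can be sketched as below. This is an illustrative reading under stated assumptions: `interval_statistic` and `STATISTICS` are hypothetical names, intervals are measured in days, and t is taken as the span from the first to the n-th clearance in each run of n consecutive clearances, which the text does not spell out.

```python
from datetime import datetime, timedelta
from statistics import mean, pvariance

# Supported choices for Z, following the examples listed in the text.
STATISTICS = {"min": min, "max": max, "mean": mean, "variance": pvariance}


def interval_statistic(clearance_times, now, k_days, n, z):
    """Statistic Z of the intervals t for n consecutive clearances within the last K days."""
    window_start = now - timedelta(days=k_days)
    # Keep records inside the observation window, most recent first.
    recent = sorted((t for t in clearance_times if window_start <= t <= now), reverse=True)
    # Slide back one record at a time; each t spans n consecutive clearances.
    intervals = [
        (recent[i] - recent[i + n - 1]).total_seconds() / 86400.0
        for i in range(len(recent) - n + 1)
    ]
    if not intervals:
        return None  # fewer than n clearances inside the window
    return STATISTICS[z](intervals)


# Usage: one clearance every 2 days, K = 30 days, n = 5, Z = mean.
now = datetime(2018, 7, 1)
times = [now - timedelta(days=2 * i) for i in range(10)]
print(interval_statistic(times, now, k_days=30, n=5, z="mean"))
```

Returning `None` when there are fewer than n clearances in the window is a design choice of this sketch; the source does not say how that case is handled.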
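S210: Determine whether the first statistical parameter exceeds the threshold range in the preset rule model, and if so, output the passenger to be identified as a risk passenger.
Specifically, the threshold range is the one configured when the terminal obtained the initial rule model from the server, that is, the values of [Tinf, Tsup] above. The first statistical parameter obtained is compared with the threshold range; if the first statistical parameter is not within the threshold range, the passenger to be identified that it corresponds to is a risk passenger.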
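The rule model and the S210 comparison can be represented with a small data structure. The sketch below is an assumption about one convenient layout, not the format used by the applicant; `PresetRuleModel` and its field names are hypothetical, and the closed interval follows the [Tinf, Tsup] notation in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PresetRuleModel:
    """Configured instance of the initial rule model (field names are illustrative)."""
    k_days: int    # K: observation time window
    n: int         # n: number of consecutive clearances
    z: str         # Z: statistic applied to the intervals t
    t_inf: float   # Tinf: lower bound of the threshold range
    t_sup: float   # Tsup: upper bound of the threshold range

    def in_range(self, value):
        return self.t_inf <= value <= self.t_sup


def is_risk_passenger(model, first_statistic):
    """Per the text, a statistic outside [Tinf, Tsup] marks a risk passenger."""
    return not model.in_range(first_statistic)


model = PresetRuleModel(k_days=30, n=5, z="mean", t_inf=0, t_sup=5)
print(is_risk_passenger(model, 8.0), is_risk_passenger(model, 3.0))
```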
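The above risk passenger identification method first uses the first historical clearance record to determine whether the passenger to be identified is a high-frequency clearance passenger and, if so, goes on to use the preset rule model to determine whether the passenger is a risk passenger. This avoids identifying ordinary high-frequency clearance passengers as risk passengers and improves identification accuracy.
In one embodiment, the above risk passenger identification method may further include: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; receiving input configuration parameters corresponding to the initial rule model; obtaining a verification rule corresponding to the configuration parameters, and verifying the configuration parameters using the obtained verification rule; and when the configuration parameters are verified successfully, generating the preset rule model according to the configuration parameters and the initial rule model.
Specifically, the terminal can obtain the initial rule model from the server, namely the rule above: the statistic [Z] of the time interval [t] used for [n] consecutive clearances within time [K] ∈ [Tinf, Tsup]. After receiving the initial rule model, the terminal can display it so that the user can configure its parameters; K, n, Z, Tinf, and Tsup above can all be configured by the user.
When the terminal receives the configuration parameters entered by the user, it can verify them. For example, it can first check their format and then check them against the verification rule. For example, when the selected statistic Z is the mean, the entered parameter n must be greater than or equal to Tsup, because a person usually has only one clearance opportunity per day.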
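The two-stage check described above (format first, then the verification rule) might look like the following sketch. The dictionary keys, the allowed values for Z, the error messages, and the Tinf ≤ Tsup check are assumptions added for illustration; the only rule taken from the text is that n must be at least Tsup when Z is the mean.

```python
ALLOWED_STATISTICS = {"min", "max", "mean", "variance"}


def validate_config(config):
    """Return a list of error messages; an empty list means verification passed."""
    errors = []
    # Stage 1: format checks on the entered parameters.
    for key in ("K", "n"):
        value = config.get(key)
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            errors.append(f"{key} must be a positive integer")
    for key in ("Tinf", "Tsup"):
        value = config.get(key)
        if not isinstance(value, (int, float)) or isinstance(value, bool):
            errors.append(f"{key} must be a number")
    if config.get("Z") not in ALLOWED_STATISTICS:
        errors.append("Z must be one of " + ", ".join(sorted(ALLOWED_STATISTICS)))
    if errors:
        return errors
    # Stage 2: verification rules.
    if config["Tinf"] > config["Tsup"]:
        errors.append("Tinf must not exceed Tsup")  # assumed sanity check
    if config["Z"] == "mean" and config["n"] < config["Tsup"]:
        errors.append("n must be >= Tsup when Z is the mean")  # rule stated in the text
    return errors


print(validate_config({"K": 30, "n": 5, "Z": "mean", "Tinf": 0, "Tsup": 5}))
```

The preset rule model would be generated only when the returned list is empty.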
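In the above embodiment, the terminal can configure the parameters of the initial rule model, making the initial rule model more customizable.
In one embodiment, the above risk passenger identification method may further include: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; obtaining a current region identifier, and downloading from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels; obtaining a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record; calculating, according to the initial rule model, different types of second statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the risk passenger label in the third historical clearance record, and different types of third statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the ordinary passenger label; selecting the values of the time period K and the number of consecutive clearances n, and the type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding type; and generating the preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.
Specifically, the terminal can obtain the initial rule model from the server, namely the rule above: the statistic [Z] of the time interval [t] used for [n] consecutive clearances within time [K] ∈ [Tinf, Tsup]. K, n, Tinf, Tsup, and the type of Z must be configured before the corresponding preset rule model can be obtained. In the previous embodiment the user performs this configuration. When the user has not done so, the terminal can generate default parameter settings from the second historical clearance record corresponding to the region identifier, for example the second historical clearance record for Shanghai, in order to generate the corresponding preset rule model.
Specifically, the terminal first selects, from the second historical clearance record, the third historical clearance record corresponding to high-frequency passengers. Optionally, the terminal can calculate each passenger's clearance frequency within the preset time period and compare it with the preset value to determine whether the passenger corresponding to the second historical clearance record is a high-frequency passenger and, if so, obtain the third historical clearance record corresponding to that passenger. Because the second and third historical clearance records are historical, they carry corresponding risk passenger labels and ordinary passenger labels: in past spot checks, customs inspectors marked the results, adding the risk passenger label to the identifiers of risk passengers, while passengers who were not spot-checked, or whose spot-check result was normal, received the ordinary passenger label by default. The terminal can therefore divide the third historical clearance record into two groups according to the risk passenger label and the ordinary passenger label, and calculate for each group the different types of second statistical parameters and third statistical parameters of the time interval t used for n consecutive clearances within the time period K. K, n, and the type of the statistical parameter Z can be selected within preset ranges. After the calculation, the terminal can compute the degree of discrimination between the second and third statistical parameters, take the parameters K and n and the statistical parameter type corresponding to the second or third statistical parameter with the greatest discrimination as the parameters of the preset rule model, calculate the values of the threshold bounds Tinf and Tsup from the second and third statistical parameters, and finally generate the preset rule model from K, n, Tinf, Tsup, and the type of Z.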
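The default-parameter selection above can be sketched as a search over candidate (K, n, Z) combinations. The source does not define how the degree of discrimination or the threshold bounds are computed, so both are assumptions here: discrimination is taken as the absolute difference between the group means, and the threshold range is taken as the span of the ordinary passengers' values (so that values outside it are flagged). The function and argument names are hypothetical.

```python
from statistics import mean


def choose_default_parameters(candidates):
    """Pick the (K, n, Z) whose statistic best separates the two labeled groups.

    `candidates` maps (K, n, Z) to a pair (risk_values, ordinary_values), i.e. the
    per-passenger second and third statistical parameters for that combination.
    """
    def separation(item):
        risk_values, ordinary_values = item[1]
        # Assumed discrimination measure: distance between the group means.
        return abs(mean(risk_values) - mean(ordinary_values))

    (k, n, z), (_, ordinary_values) = max(candidates.items(), key=separation)
    # Assumed threshold rule: the range observed for ordinary passengers.
    return {"K": k, "n": n, "Z": z,
            "Tinf": min(ordinary_values), "Tsup": max(ordinary_values)}


# Usage with made-up numbers for two candidate combinations.
candidates = {
    (30, 5, "mean"): ([4.0, 5.0], [12.0, 16.0]),
    (60, 10, "min"): ([9.0, 10.0], [8.0, 9.0]),
}
print(choose_default_parameters(candidates))
```

Other separation measures (for example, one normalized by within-group spread) would fit the same structure; the choice here is only for brevity.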
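Optionally, for convenience, the terminal can use the default configuration for some parameters and manual configuration for the others. The specific approach is described above and is not repeated here.
In the above embodiment, the terminal can obtain the second historical clearance record according to the region identifier, select from it the third historical clearance record corresponding to high-frequency users, and generate a default parameter configuration from the third historical clearance record. The default parameter configuration is thus tied to the region, which improves its accuracy and makes the preset rule model more applicable and more accurate.
In one embodiment, the preset rule model may be generated by: obtaining a fourth historical clearance record, the fourth clearance record carrying passenger labels; selecting, from the fourth clearance record, a fifth clearance record corresponding to high-frequency passengers; extracting initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and performing feature gain evaluation on the initial feature parameters; selecting a target feature parameter from the first feature parameters according to the result of the feature gain evaluation; when the extracted target feature parameter is the statistic of the time interval used for a preset number of consecutive clearances within a preset time, setting a threshold range corresponding to the statistic according to the passenger labels; and generating the preset rule model according to that statistic and the threshold range.
In one embodiment, extracting the initial feature parameters from the fifth clearance record may include: obtaining a current business rule, and querying the screening feature parameters corresponding to the current business rule; and calculating the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.
Specifically, the fourth clearance record includes clearance records that were not spot-checked and clearance records that were, and the spot-checked records include those of risk users and those of ordinary users. First, the fifth clearance record corresponding to high-frequency passengers is selected from the fourth clearance record, in the same way that the third clearance record is selected from the second clearance record above, which is not repeated here. Then the initial feature parameters and the corresponding passenger labels, for example the risk user label or the ordinary user label, are extracted from the fifth clearance record, and feature gain evaluation is performed on the initial feature parameters. The gain evaluation may use a decision tree, as described below. A target feature parameter is selected from the first feature parameters according to the result of the feature gain evaluation. For example, the field that best distinguishes risk users from ordinary users can be selected: each field is used in turn to separate risk users from ordinary users, and the field with the highest classification accuracy is taken. When the extracted target feature parameter is the statistic of the time interval used for a preset number of consecutive clearances within a preset time, a threshold range corresponding to the statistic is set according to the passenger labels, and the preset rule model is generated according to that statistic and the threshold range.
A decision tree is a tree structure composed of nodes and directed edges that is used to classify instances. There are two types of node: internal nodes and leaf nodes. An internal node represents a test condition on a feature or attribute, and a leaf node represents a class. Classification with a decision tree model works as follows: starting from the root node, a feature of the instance is tested and the instance is assigned to a child node according to the test result. Following that branch leads either to a leaf node or to another internal node, in which case the new test condition is applied recursively until a leaf node is reached. When a leaf node is reached, the final classification result is obtained.
Specifically, the terminal obtains the fields in the fifth clearance record. Because the fifth clearance record generally contains few fields, only name, age, ID card number, clearance time, and the like, it offers few field features. Therefore, after obtaining the fifth clearance record, the terminal first extends it to generate new features. For example, it obtains the current business rules, such as clearance-information feature rules, dynamic frequency feature rules, and static frequency feature rules, and then queries the screening feature parameters corresponding to the current business rules. It calculates the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters, for example by generating new feature fields from the business rules and the fields in the clearance record. According to the clearance-information feature rules it can generate "time of the most recent clearance, as of 15 days ago" and "number of checkpoints within 30 days, as of 30 days ago"; according to the static frequency feature rules, "number of clearances within 30 days, as of 15 days ago" and "number of clearance days within 7 days, as of 30 days ago"; and according to the dynamic frequency feature rules, "minimum interval between the first and fifth clearances within 90 days, as of 15 days ago" and "average interval between the first and fifth clearances within 30 days, as of 30 days ago".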
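Two of the derived features named above can be sketched as follows. This is one possible reading of the feature descriptions, not a definitive one: "as of 15 days ago, within 30 days" is interpreted as the 30-day window ending 15 days before the reference date, and the "first and fifth" interval is interpreted as the span of each run of five consecutive clearances. All function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def _window(times, as_of, offset_days, window_days):
    """Clearance times inside the `window_days` that end `offset_days` before `as_of`."""
    end = as_of - timedelta(days=offset_days)
    start = end - timedelta(days=window_days)
    return sorted(t for t in times if start <= t <= end)


def clearances_in_window(times, as_of, offset_days, window_days):
    """Static frequency feature, e.g. 'number of clearances within 30 days, as of 15 days ago'."""
    return len(_window(times, as_of, offset_days, window_days))


def min_span_interval(times, as_of, offset_days, window_days, span=5):
    """Dynamic frequency feature: minimum days between the 1st and `span`-th clearance of a run."""
    inside = _window(times, as_of, offset_days, window_days)
    gaps = [
        (inside[i + span - 1] - inside[i]).total_seconds() / 86400.0
        for i in range(len(inside) - span + 1)
    ]
    return min(gaps) if gaps else None


as_of = datetime(2018, 7, 1)
times = [as_of - timedelta(days=d) for d in (16, 18, 20, 22, 24, 40, 100)]
features = {
    "clearances_30d_as_of_15d_ago": clearances_in_window(times, as_of, 15, 30),
    "min_1st_5th_interval_90d_as_of_15d_ago": min_span_interval(times, as_of, 15, 90),
}
print(features)
```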
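After the new features above are generated, the model is trained as follows: collect sample data and divide it into training set data and test set data; extract first feature parameters and first target categories from the training set data; perform feature information gain evaluation on the first feature parameters and, from the evaluation result, obtain the field with the greatest discrimination, namely the statistic of the time interval used for a preset number of consecutive clearances within a preset time; analyze the data distribution of that field, that is, perform statistic analysis, set a threshold range according to the type of statistic, and generate the preset rule model from the threshold range set and the statistic selected; extract second feature parameters and second target categories from the test set data; and verify the initial decision evaluation model against the second feature parameters and second target categories and, based on the first verification result, optimize and adjust the decision tree structure in the initial decision tree evaluation model to generate the final risk assessment model.
In this embodiment, the decision tree model uses the ID3 algorithm. Based on the principle that a smaller decision tree is preferable to a larger one, features are evaluated and selected by information gain, and at each step the feature with the largest information gain is chosen as the criterion for creating child nodes. Information gain expresses the degree to which knowing feature X reduces the uncertainty about class Y. The information gain g(D,A) of feature A with respect to training data set D is defined as the difference between the empirical entropy H(D) of set D and the empirical conditional entropy H(D|A) of D given feature A, that is,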
g(D,A)=H(D)-H(D|A)     (1)
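where g(D,A) is the information gain of feature A with respect to training data set D, H(D) is the empirical entropy of training data set D, and H(D|A) is the empirical conditional entropy of data set D given feature A.
Feature selection by the information gain criterion works as follows: for the training data set (or a subset of it), calculate the information gain of each feature and select the feature with the largest information gain. The algorithm for calculating information gain takes training data set D and feature A as input and outputs the information gain g(D,A) of feature A with respect to training data set D.
First, calculate the empirical entropy H(D) of data set D: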
H(D) = -\sum_{k=1}^{K} \frac{|C_k|}{|D|} \log_2 \frac{|C_k|}{|D|}
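where |C_k| is the number of samples in the k-th first target category and K is the number of first target categories. In this embodiment, the first target category has two values: risk passenger and ordinary passenger.
Second, calculate the empirical conditional entropy H(D|A) of data set D given feature A: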
H(D|A) = \sum_{i \in value(A)} \frac{|D_i|}{|D|} H(D_i)
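where value(A) is the set of all values of feature A, i is one value of feature A, D_i is the set of samples in training data set D for which feature A takes the value i, |D_i| is the number of samples in that set, and |D| is the total number of samples before the split. For example, for a gender feature parameter, the values of feature A are male and female; if male is represented by 0 and female by 1, value(A) is (0,1).
Third, calculate the information gain: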
g(D,A)=H(D)-H(D|A)     (1)
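The three steps above translate directly into code. The sketch below computes H(D), H(D|A), and g(D,A) for categorical feature values; the function names are illustrative, and base-2 logarithms are assumed, as is usual for ID3.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Empirical entropy H(D) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def conditional_entropy(feature_values, labels):
    """Empirical conditional entropy H(D|A): size-weighted entropy of each subset D_i."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def information_gain(feature_values, labels):
    """g(D, A) = H(D) - H(D|A)."""
    return entropy(labels) - conditional_entropy(feature_values, labels)


labels = ["risk", "risk", "ordinary", "ordinary"]
print(information_gain([0, 0, 1, 1], labels))  # feature that separates the classes
print(information_gain([0, 1, 0, 1], labels))  # feature that carries no information
```

A continuous feature such as an interval statistic would first need to be discretized, for example by comparing it with a candidate threshold, before this calculation applies.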
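When the preset rule model is verified with the test set data, if the deviation of the verification result is too large, the selected feature parameters can be adjusted, for example by adjusting the statistic, and the decision tree model rebuilt and verified again until the verification result is within the error range. Alternatively, the feature selection at the branch nodes can be adjusted starting from the root node to optimize the decision tree model; such adjustment may, for example, increase the amount of training data, until the verification result of the optimized decision tree model is within the error range.
Obtaining the feature (field) with the greatest discrimination from the feature information evaluation result may include: calculating the information gain of each feature parameter among the first feature parameters; selecting the feature with the largest information gain as the decision criterion for creating child nodes; and dividing the training set data into subsets according to the child nodes and branching the subsets recursively until the data at every branch node belongs to the same target category. The decision tree is built recursively by successively partitioning the training records into purer subsets. Let Dt be the set of training records associated with node t and y={y1,y2,…,yc} be the class labels. Hunt's algorithm is defined recursively as follows: if all records in Dt belong to the same class, then t is a leaf node, labeled yt. If Dt contains records belonging to more than one class, an attribute test condition is selected to partition the records into smaller subsets. A child node is created for each outcome of the test condition, and the records in Dt are distributed to the child nodes according to the test results. The algorithm is then invoked recursively on each child node.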
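The recursive construction described above can be sketched as a small ID3-style builder over categorical features. It is a teaching sketch under assumptions, not the applicant's model: records are plain dictionaries, the feature names in the usage example are invented, and a majority vote is used when no features remain, a case the text does not cover.

```python
from collections import Counter
from math import log2


def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def build_tree(rows, labels, features):
    """Hunt-style recursion: stop on a pure node, otherwise split on the best feature."""
    if len(set(labels)) == 1:
        return labels[0]  # leaf: every record belongs to one class
    if not features:
        return Counter(labels).most_common(1)[0][0]  # assumed fallback: majority class
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    remaining = [f for f in features if f != best]
    children = {}
    for value in set(row[best] for row in rows):
        subset = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        children[value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    return {"feature": best, "children": children}


def classify(tree, row):
    """Walk from the root to a leaf, following the branch for each tested feature."""
    while isinstance(tree, dict):
        tree = tree["children"][row[tree["feature"]]]
    return tree


rows = [
    {"short_interval": "yes", "weekend": "yes"},
    {"short_interval": "yes", "weekend": "no"},
    {"short_interval": "no", "weekend": "yes"},
    {"short_interval": "no", "weekend": "no"},
]
labels = ["risk", "risk", "ordinary", "ordinary"]
tree = build_tree(rows, labels, ["short_interval", "weekend"])
print(tree["feature"], classify(tree, {"short_interval": "no", "weekend": "yes"}))
```

The feature chosen at the root is the one with the largest information gain, which is how the method singles out the field used in the preset rule model.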
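The server extracts the second feature parameters and second target categories one by one from the samples in the test set data. The second feature parameters are of the same kind as the first feature parameters above and may optionally be the selected feature with the largest information gain, which is not repeated here. The second target category is the category of the security check result and has two values: risk passenger and ordinary passenger.
Based on the second feature parameters and second target categories of the samples in the test data set, the server counts, from the test data set, the negative sample data that matches the feature parameter combination corresponding to each classification node in the preset rule model, that is, the statistic of the time interval used for a preset number of consecutive clearances within a preset time. It calculates the proportion of the counted negative sample data in the total negative sample data of the test data set, and verifies the decision tree model based on the calculated proportion. For verification, the server can set a preset tolerance error: when the calculated absolute difference is less than the preset tolerance error, verification passes, and when it is greater than the preset tolerance error, verification fails. When verification fails, the server can add the sample data in the test data set to the training data set, enlarging the sample size used to train and adjust the preset rule model.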
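The verification step can be sketched as below. The text does not say which two quantities the absolute difference is taken between, so this sketch assumes it compares the proportion of matched negative samples in the training set with the same proportion in the test set. Treating samples labeled "risk" as the negative samples, the sample layout, and the default tolerance of 0.1 are likewise assumptions.

```python
def matched_negative_ratio(samples, rule):
    """Share of negative samples (assumed here to be those labeled 'risk') that the rule matches.

    `samples` is a list of (statistic_value, label); `rule` is a predicate on the statistic.
    """
    negatives = [value for value, label in samples if label == "risk"]
    if not negatives:
        return 0.0
    return sum(1 for value in negatives if rule(value)) / len(negatives)


def verify(train_samples, test_samples, rule, tolerance=0.1):
    """Pass when the two proportions differ by less than the preset tolerance error."""
    difference = abs(
        matched_negative_ratio(train_samples, rule)
        - matched_negative_ratio(test_samples, rule)
    )
    return difference < tolerance


def outside_range(value):
    return not (0 <= value <= 5)  # flagged when outside the threshold range [0, 5]


train = [(8, "risk"), (9, "risk"), (3, "ordinary"), (7, "risk"), (2, "risk")]
test = [(10, "risk"), (6, "risk"), (1, "risk"), (12, "risk"), (4, "ordinary")]
print(verify(train, test, outside_range))
```

If `verify` returns false, the procedure in the text would merge the test samples into the training set and regenerate the rule model.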
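In the above embodiment, new features are generated according to the business rules, which diversifies the features, so that the degree to which each field distinguishes risk users from ordinary users can be analyzed more accurately.
It should be understood that although the steps in the flowchart of FIG. 2 are displayed in sequence as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 2 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be performed at different times, and they are not necessarily performed one after another but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 3, a risk passenger identification apparatus is provided, including a first receiving module 100, a clearance frequency calculation module 200, a preset rule model acquisition module 300, a first statistical parameter calculation module 400, and an output module 500, where:
The first receiving module 100 is configured to receive input identity information of a passenger to be identified, and query a first historical clearance record corresponding to the identity information.
The clearance frequency calculation module 200 is configured to calculate the clearance frequency of the passenger to be identified according to the first historical clearance record.
The preset rule model acquisition module 300 is configured to determine, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtain a preset rule model.
The first statistical parameter calculation module 400 is configured to calculate a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time.
The output module 500 is configured to determine whether the first statistical parameter exceeds the threshold range in the preset rule model, and if so, output the passenger to be identified as a risk passenger.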
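In one embodiment, the above risk passenger identification apparatus may further include:
A second receiving module, configured to send an initial rule model acquisition request to the server, and receive the initial rule model returned by the server.
A third receiving module, configured to receive input configuration parameters corresponding to the initial rule model.
A verification module, configured to obtain a verification rule corresponding to the configuration parameters, and verify the configuration parameters using the obtained verification rule.
A first generation module, configured to generate the preset rule model according to the configuration parameters and the initial rule model when the configuration parameters are verified successfully.
In one embodiment, the above risk passenger identification apparatus may further include:
A fourth receiving module, configured to send an initial rule model acquisition request to the server, and receive the initial rule model returned by the server.
A clearance record acquisition module, configured to obtain a current region identifier, download from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels, and obtain a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record.
A second statistical parameter calculation module, configured to calculate, according to the initial rule model, different types of second statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the risk passenger label in the third historical clearance record, and different types of third statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the ordinary passenger label.
A first selection module, configured to select the values of the time period K and the number of consecutive clearances n, and the type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding type.
A second generation module, configured to generate the preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.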
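In one embodiment, the above risk passenger identification apparatus may further include:
A fourth historical clearance record acquisition module, configured to obtain a fourth historical clearance record, the fourth clearance record carrying passenger labels.
A second selection module, configured to select, from the fourth clearance record, a fifth clearance record corresponding to high-frequency passengers.
A feature gain evaluation module, configured to extract initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and perform feature gain evaluation on the initial feature parameters.
A third selection module, configured to select a target feature parameter from the first feature parameters according to the result of the feature gain evaluation.
A setting module, configured to set a threshold range corresponding to the statistic according to the passenger labels when the extracted target feature parameter is the statistic of the time interval used for a preset number of consecutive clearances within a preset time.
A third generation module, configured to generate the preset rule model according to the threshold range and the statistic of the time interval used for a preset number of consecutive clearances within a preset time.
In one embodiment, the feature gain evaluation module may include:
A query unit, configured to obtain a current business rule, and query the screening feature parameters corresponding to the current business rule.
An initial feature parameter calculation unit, configured to calculate the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.
For the specific limitations of the risk passenger identification apparatus, see the limitations of the risk passenger identification method above, which are not repeated here. Each module in the above risk passenger identification apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or be independent of, the processor of the computer device, or may be stored in software form in the memory of the computer device, so that the processor can call them and perform the operations corresponding to each module.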
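In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in FIG. 4. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and computer-readable instructions. The internal memory provides an environment for running the operating system and the computer-readable instructions in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. When executed by the processor, the computer-readable instructions implement a risk passenger identification method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device of the computer device may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the casing of the computer device, or an external keyboard, touchpad, or mouse.
Those skilled in the art can understand that the structure shown in FIG. 4 is only a block diagram of the part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.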
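A computer device includes a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the processors, cause the one or more processors to perform the following steps: receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information; calculating a clearance frequency of the passenger to be identified according to the first historical clearance record; determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtaining a preset rule model; calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time; and determining whether the first statistical parameter exceeds a threshold range in the preset rule model, and if so, outputting the passenger to be identified as a risk passenger.
In one embodiment, the processor, when executing the computer-readable instructions, further implements the following steps: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; receiving input configuration parameters corresponding to the initial rule model; obtaining a verification rule corresponding to the configuration parameters, and verifying the configuration parameters using the obtained verification rule; and when the configuration parameters are verified successfully, generating the preset rule model according to the configuration parameters and the initial rule model.
In one embodiment, the processor, when executing the computer-readable instructions, further implements the following steps: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; obtaining a current region identifier, and downloading from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels; obtaining a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record; calculating, according to the initial rule model, different types of second statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the risk passenger label in the third historical clearance record, and different types of third statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the ordinary passenger label; selecting the values of the time period K and the number of consecutive clearances n, and the type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding type; and generating the preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.
In one embodiment, the generation of the preset rule model involved when the processor executes the computer-readable instructions may include: obtaining a fourth historical clearance record, the fourth clearance record carrying passenger labels; selecting, from the fourth clearance record, a fifth clearance record corresponding to high-frequency passengers; extracting initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and performing feature gain evaluation on the initial feature parameters; selecting a target feature parameter from the first feature parameters according to the result of the feature gain evaluation; when the extracted target feature parameter is the statistic of the time interval used for a preset number of consecutive clearances within a preset time, setting a threshold range corresponding to the statistic according to the passenger labels; and generating the preset rule model according to that statistic and the threshold range.
In one embodiment, extracting the initial feature parameters from the fifth clearance record, as implemented when the processor executes the computer-readable instructions, may include: obtaining a current business rule, and querying the screening feature parameters corresponding to the current business rule; and calculating the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.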
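One or more non-volatile computer-readable storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps: receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information; calculating a clearance frequency of the passenger to be identified according to the first historical clearance record; determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger, and if so, obtaining a preset rule model; calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time interval used for a preset number of consecutive clearances within a preset time; and determining whether the first statistical parameter exceeds a threshold range in the preset rule model, and if so, outputting the passenger to be identified as a risk passenger.
In one embodiment, when the computer-readable instructions are executed by the processor, the following steps are further implemented: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; receiving input configuration parameters corresponding to the initial rule model; obtaining a verification rule corresponding to the configuration parameters, and verifying the configuration parameters using the obtained verification rule; and when the configuration parameters are verified successfully, generating the preset rule model according to the configuration parameters and the initial rule model.
In one embodiment, when the computer-readable instructions are executed by the processor, the following steps are further implemented: sending an initial rule model acquisition request to the server, and receiving the initial rule model returned by the server; obtaining a current region identifier, and downloading from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels; obtaining a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record; calculating, according to the initial rule model, different types of second statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the risk passenger label in the third historical clearance record, and different types of third statistical parameters of the time interval t used for n consecutive clearances within the time period K for the passengers corresponding to the ordinary passenger label; selecting the values of the time period K and the number of consecutive clearances n, and the type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding type; and generating the preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.
In one embodiment, the generation of the preset rule model involved when the computer-readable instructions are executed by the processor may include: obtaining a fourth historical clearance record, the fourth clearance record carrying passenger labels; selecting, from the fourth clearance record, a fifth clearance record corresponding to high-frequency passengers; extracting initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and performing feature gain evaluation on the initial feature parameters; selecting a target feature parameter from the first feature parameters according to the result of the feature gain evaluation; when the extracted target feature parameter is the statistic of the time interval used for a preset number of consecutive clearances within a preset time, setting a threshold range corresponding to the statistic according to the passenger labels; and generating the preset rule model according to that statistic and the threshold range.
In one embodiment, extracting the initial feature parameters from the fifth clearance record, as implemented when the computer-readable instructions are executed by the processor, may include: obtaining a current business rule, and querying the screening feature parameters corresponding to the current business rule; and calculating the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.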
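Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions directing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application shall be subject to the appended claims.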

Claims (20)

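  1. A risk passenger identification method, comprising:
    receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information;
    calculating a clearance frequency of the passenger to be identified according to the first historical clearance record;
    determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger and, if so, obtaining a preset rule model;
    calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time intervals taken to complete a preset number of consecutive clearances within a preset time; and
    determining whether the first statistical parameter exceeds a threshold range in the preset rule model and, if so, outputting that the passenger to be identified is a risk passenger.
  2. The method according to claim 1, wherein the method further comprises:
    sending an initial rule model acquisition request to a server, and receiving the initial rule model returned by the server;
    receiving input configuration parameters corresponding to the initial rule model;
    obtaining verification rules corresponding to the configuration parameters, and verifying the configuration parameters using the obtained verification rules; and
    when the configuration parameters are verified successfully, generating the preset rule model according to the configuration parameters and the initial rule model.
  3. The method according to claim 1, wherein the method further comprises:
    sending an initial rule model acquisition request to a server, and receiving the initial rule model returned by the server;
    obtaining a current region identifier, and downloading from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels;
    obtaining a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record;
    calculating, according to the initial rule model, second statistical parameters of different types for the time interval t taken by passengers with the risk passenger label in the third historical clearance record to complete n consecutive clearances within a time period K, and third statistical parameters of different types for the time interval t taken by passengers with the ordinary passenger label to complete n consecutive clearances within the time period K;
    selecting values of the time period K and the number of consecutive clearances n, and a type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding types; and
    generating the preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.
  4. The method according to any one of claims 1 to 3, wherein the manner of generating the preset rule model comprises:
    obtaining a fourth historical clearance record, the fourth historical clearance record carrying passenger labels;
    selecting, from the fourth historical clearance record, a fifth clearance record corresponding to high-frequency passengers;
    extracting initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and performing feature gain evaluation on the initial feature parameters;
    selecting target feature parameters from the initial feature parameters according to the result of the feature gain evaluation;
    when a selected target feature parameter is the first statistical parameter, that is, the statistic of the time intervals taken to complete a preset number of consecutive clearances within a preset time, setting a threshold range corresponding to the statistic according to the passenger labels; and
    generating the preset rule model according to the first statistical parameter and the threshold range.
  5. The method according to claim 4, wherein extracting the initial feature parameters from the fifth clearance record comprises:
    obtaining a current business rule, and querying screening feature parameters corresponding to the current business rule; and
    calculating the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.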
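  6. A risk passenger identification apparatus, comprising:
    a first receiving module, configured to receive input identity information of a passenger to be identified, and query a first historical clearance record corresponding to the identity information;
    a clearance frequency calculation module, configured to calculate a clearance frequency of the passenger to be identified according to the first historical clearance record;
    a preset rule model obtaining module, configured to determine, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger and, if so, obtain a preset rule model;
    a first statistical parameter calculation module, configured to calculate a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time intervals taken to complete a preset number of consecutive clearances within a preset time; and
    an output module, configured to determine whether the first statistical parameter exceeds a threshold range in the preset rule model and, if so, output that the passenger to be identified is a risk passenger.
  7. The apparatus according to claim 6, wherein the apparatus further comprises:
    a second receiving module, configured to send an initial rule model acquisition request to a server, and receive the initial rule model returned by the server;
    a third receiving module, configured to receive input configuration parameters corresponding to the initial rule model;
    a verification module, configured to obtain verification rules corresponding to the configuration parameters, and verify the configuration parameters using the obtained verification rules; and
    a first generation module, configured to generate, when the configuration parameters are verified successfully, the preset rule model according to the configuration parameters and the initial rule model.
  8. The apparatus according to claim 6, wherein the apparatus further comprises:
    a fourth receiving module, configured to send an initial rule model acquisition request to a server, and receive the initial rule model returned by the server;
    a clearance record obtaining module, configured to obtain a current region identifier, and download from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels; and to obtain a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record;
    a second statistical parameter calculation module, configured to calculate, according to the initial rule model, second statistical parameters of different types for the time interval t taken by passengers with the risk passenger label in the third historical clearance record to complete n consecutive clearances within a time period K, and third statistical parameters of different types for the time interval t taken by passengers with the ordinary passenger label to complete n consecutive clearances within the time period K;
    a first selection module, configured to select values of the time period K and the number of consecutive clearances n, and a type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding types; and
    a second generation module, configured to generate the preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.
  9. The apparatus according to any one of claims 6 to 8, wherein the apparatus further comprises:
    a fourth historical clearance record obtaining module, configured to obtain a fourth historical clearance record, the fourth historical clearance record carrying passenger labels;
    a second selection module, configured to select, from the fourth historical clearance record, a fifth clearance record corresponding to high-frequency passengers;
    a feature gain evaluation module, configured to extract initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and perform feature gain evaluation on the initial feature parameters;
    a third selection module, configured to select target feature parameters from the initial feature parameters according to the result of the feature gain evaluation;
    a setting module, configured to set, when a selected target feature parameter is the first statistical parameter, that is, the statistic of the time intervals taken to complete a preset number of consecutive clearances within a preset time, a threshold range corresponding to the statistic according to the passenger labels; and
    a third generation module, configured to generate the preset rule model according to the first statistical parameter and the threshold range.
  10. The apparatus according to claim 9, wherein the feature gain evaluation module comprises:
    a query unit, configured to obtain a current business rule, and query screening feature parameters corresponding to the current business rule; and
    an initial feature parameter calculation unit, configured to calculate the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.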
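  11. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information;
    calculating a clearance frequency of the passenger to be identified according to the first historical clearance record;
    determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger and, if so, obtaining a preset rule model;
    calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time intervals taken to complete a preset number of consecutive clearances within a preset time; and
    determining whether the first statistical parameter exceeds a threshold range in the preset rule model and, if so, outputting that the passenger to be identified is a risk passenger.
  12. The computer device according to claim 11, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    sending an initial rule model acquisition request to a server, and receiving the initial rule model returned by the server;
    receiving input configuration parameters corresponding to the initial rule model;
    obtaining verification rules corresponding to the configuration parameters, and verifying the configuration parameters using the obtained verification rules; and
    when the configuration parameters are verified successfully, generating the preset rule model according to the configuration parameters and the initial rule model.
  13. The computer device according to claim 11, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    sending an initial rule model acquisition request to a server, and receiving the initial rule model returned by the server;
    obtaining a current region identifier, and downloading from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels;
    obtaining a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record;
    calculating, according to the initial rule model, second statistical parameters of different types for the time interval t taken by passengers with the risk passenger label in the third historical clearance record to complete n consecutive clearances within a time period K, and third statistical parameters of different types for the time interval t taken by passengers with the ordinary passenger label to complete n consecutive clearances within the time period K;
    selecting values of the time period K and the number of consecutive clearances n, and a type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding types; and
    generating the preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.
  14. The computer device according to any one of claims 11 to 13, wherein the manner of generating the preset rule model involved when the processor executes the computer-readable instructions comprises:
    obtaining a fourth historical clearance record, the fourth historical clearance record carrying passenger labels;
    selecting, from the fourth historical clearance record, a fifth clearance record corresponding to high-frequency passengers;
    extracting initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and performing feature gain evaluation on the initial feature parameters;
    selecting target feature parameters from the initial feature parameters according to the result of the feature gain evaluation;
    when a selected target feature parameter is the first statistical parameter, that is, the statistic of the time intervals taken to complete a preset number of consecutive clearances within a preset time, setting a threshold range corresponding to the statistic according to the passenger labels; and
    generating the preset rule model according to the first statistical parameter and the threshold range.
  15. The computer device according to claim 14, wherein extracting the initial feature parameters from the fifth clearance record, as implemented when the processor executes the computer-readable instructions, comprises:
    obtaining a current business rule, and querying screening feature parameters corresponding to the current business rule; and
    calculating the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.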
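  16. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    receiving input identity information of a passenger to be identified, and querying a first historical clearance record corresponding to the identity information;
    calculating a clearance frequency of the passenger to be identified according to the first historical clearance record;
    determining, according to the clearance frequency, whether the passenger to be identified is a high-frequency clearance passenger and, if so, obtaining a preset rule model;
    calculating a first statistical parameter of the passenger to be identified according to the preset rule model and the first historical clearance record, the first statistical parameter being a statistic of the time intervals taken to complete a preset number of consecutive clearances within a preset time; and
    determining whether the first statistical parameter exceeds a threshold range in the preset rule model and, if so, outputting that the passenger to be identified is a risk passenger.
  17. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    sending an initial rule model acquisition request to a server, and receiving the initial rule model returned by the server;
    receiving input configuration parameters corresponding to the initial rule model;
    obtaining verification rules corresponding to the configuration parameters, and verifying the configuration parameters using the obtained verification rules; and
    when the configuration parameters are verified successfully, generating the preset rule model according to the configuration parameters and the initial rule model.
  18. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    sending an initial rule model acquisition request to a server, and receiving the initial rule model returned by the server;
    obtaining a current region identifier, and downloading from the server a second historical clearance record corresponding to the current region identifier, the second historical clearance record carrying risk passenger labels and ordinary passenger labels;
    obtaining a third historical clearance record corresponding to the high-frequency passengers in the second historical clearance record;
    calculating, according to the initial rule model, second statistical parameters of different types for the time interval t taken by passengers with the risk passenger label in the third historical clearance record to complete n consecutive clearances within a time period K, and third statistical parameters of different types for the time interval t taken by passengers with the ordinary passenger label to complete n consecutive clearances within the time period K;
    selecting values of the time period K and the number of consecutive clearances n, and a type of statistical parameter, according to the second statistical parameters and the third statistical parameters of the corresponding types; and
    generating the preset rule model according to the selected type of statistical parameter and the selected values of the time period K and the number of consecutive clearances n.
  19. The storage medium according to any one of claims 16 to 18, wherein the manner of generating the preset rule model involved when the computer-readable instructions are executed by the processor comprises:
    obtaining a fourth historical clearance record, the fourth historical clearance record carrying passenger labels;
    selecting, from the fourth historical clearance record, a fifth clearance record corresponding to high-frequency passengers;
    extracting initial feature parameters and the passenger labels corresponding to the fifth clearance record from the fifth clearance record, and performing feature gain evaluation on the initial feature parameters;
    selecting target feature parameters from the initial feature parameters according to the result of the feature gain evaluation;
    when a selected target feature parameter is the first statistical parameter, that is, the statistic of the time intervals taken to complete a preset number of consecutive clearances within a preset time, setting a threshold range corresponding to the statistic according to the passenger labels; and
    generating the preset rule model according to the first statistical parameter and the threshold range.
  20. The storage medium according to claim 19, wherein extracting the initial feature parameters from the fifth clearance record, as implemented when the computer-readable instructions are executed by the processor, comprises:
    obtaining a current business rule, and querying screening feature parameters corresponding to the current business rule; and
    calculating the screening feature parameters corresponding to the fifth clearance record as the initial feature parameters.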
PCT/CN2018/106009 2018-07-18 2018-09-17 风险旅客方法、装置、计算机设备和存储介质 WO2020015139A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810788333.3 2018-07-18
CN201810788333.3A CN109063984B (zh) 2018-07-18 2018-07-18 风险旅客方法、装置、计算机设备和存储介质

Publications (1)

Publication Number Publication Date
WO2020015139A1 true WO2020015139A1 (zh) 2020-01-23

Family

ID=64817006

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/106009 WO2020015139A1 (zh) 2018-07-18 2018-09-17 风险旅客方法、装置、计算机设备和存储介质

Country Status (2)

Country Link
CN (1) CN109063984B (zh)
WO (1) WO2020015139A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113240266A (zh) * 2021-05-11 2021-08-10 北京沃东天骏信息技术有限公司 一种风险管理方法和装置

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110348379B (zh) * 2019-07-10 2021-10-01 北京旷视科技有限公司 一种公共交通工具中目标对象确定方法、装置、***及存储介质
CN110377846A (zh) * 2019-07-25 2019-10-25 腾讯科技(深圳)有限公司 社交关系挖掘方法、装置、存储介质和计算机设备
CN110598995B (zh) * 2019-08-15 2023-08-25 中国平安人寿保险股份有限公司 智能客户评级方法、装置及计算机可读存储介质
CN111352171B (zh) * 2020-03-30 2023-01-24 重庆特斯联智慧科技股份有限公司 一种实现人工智能区域屏蔽安检方法和***

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107330584A (zh) * 2017-06-12 2017-11-07 中国联合网络通信集团有限公司 可疑人员识别方法及装置
CN107844805A (zh) * 2017-11-15 2018-03-27 中国联合网络通信集团有限公司 基于公交卡信息识别可疑人员的方法及装置
CN108198116A (zh) * 2016-12-08 2018-06-22 同方威视技术股份有限公司 用于安检中被检人员分级的方法及装置

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101194286A (zh) * 2005-03-24 2008-06-04 埃森哲全球服务有限公司 基于风险的数据评估
CA2867008A1 (en) * 2013-10-11 2015-04-11 Securitypoint Holdings, Inc. Methods and systems for efficient security screening
CN106251049A (zh) * 2016-07-25 2016-12-21 国网浙江省电力公司宁波供电公司 一种大数据的电费风险模型构建方法
US11222366B2 (en) * 2016-10-20 2022-01-11 Meta Platforms, Inc. Determining accuracy of a model determining a likelihood of a user performing an infrequent action after presentation of content
CN107133438B (zh) * 2017-03-03 2018-07-13 平安医疗健康管理股份有限公司 医疗行为监控方法及装置

Also Published As

Publication number Publication date
CN109063984A (zh) 2018-12-21
CN109063984B (zh) 2023-09-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18926650

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18926650

Country of ref document: EP

Kind code of ref document: A1