CN113721896A - Optimization processing method and device for financial fraud modeling language - Google Patents

Info

Publication number
CN113721896A
Authority
CN
China
Prior art keywords
node
type
event
module
sql
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110712728.7A
Other languages
Chinese (zh)
Inventor
范皓
赵曦滨
庞在余
万海
王一平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Bond Jinke Information Technology Co ltd
Tsinghua University
Original Assignee
China Bond Jinke Information Technology Co ltd
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Bond Jinke Information Technology Co ltd, Tsinghua University filed Critical China Bond Jinke Information Technology Co ltd
Priority to CN202110712728.7A priority Critical patent/CN113721896A/en
Publication of CN113721896A publication Critical patent/CN113721896A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/30Creation or generation of source code
    • G06F8/31Programming languages or programming paradigms
    • G06F8/315Object-oriented languages
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • G06F16/2445Data retrieval commands; View definitions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24553Query execution of query operations
    • G06F16/24554Unary operations; Data partitioning operations
    • G06F16/24556Aggregation; Duplicate elimination
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/42Syntactic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/42Syntactic analysis
    • G06F8/425Lexical analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/43Checking; Contextual analysis
    • G06F8/436Semantic checking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/44Encoding
    • G06F8/447Target code generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Mathematical Physics (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses an optimization processing method and device for a financial fraud modeling language. The method comprises the following steps: generating an FFML abstract syntax tree from a fraud detection rule written in the Financial Fraud Modeling Language (FFML); judging the node type of each node; if the node type of a node is SingleCondition, generating target conversion data according to the left return value, the comparison return value and the right return value of the node; and generating SQL code corresponding to the fraud detection rule according to the target conversion data. With the method and device, a fraud detection rule written in FFML can be quickly converted into SQL code that a stream platform can recognize.

Description

Optimization processing method and device for financial fraud modeling language
Technical Field
The invention relates to the technical field of computers, in particular to an optimization processing method and device of a financial fraud modeling language.
Background
With the advancement of modern technologies such as the internet and mobile computing, the variety of financial fraud continues to increase. To deal with new kinds of financial fraud, automatic financial fraud detection methods based on computer technology have emerged; they are divided into passive fraud detection and active fraud detection. Active fraud detection introduces real-time stream processing techniques into the field of financial fraud detection, so that transaction requests can be checked in real time.
Active fraud detection depends on detection rules formulated by domain experts. In general, the domain experts propose new fraud detection rules and explain them to IT coding personnel, the IT coding personnel then write the actual stream-platform code, and finally the code is deployed to the stream processing platform for real-time fraud monitoring.
However, because domain experts and IT coding personnel come from very different professional backgrounds, communication between them is inefficient and prone to misunderstanding, so a new fraud detection rule takes a long time to reach actual deployment, which may cause large economic losses. How to convert the financial fraud modeling language used by domain experts into a programming language that the stream platform can recognize is therefore a problem to be solved.
Disclosure of Invention
The invention provides an optimization processing method and device of a financial fraud modeling language, which are used for overcoming at least one technical problem in the prior art.
According to a first aspect of the embodiments of the present invention, there is provided an optimization processing method of a financial fraud modeling language, including:
generating, according to a fraud detection rule written in the Financial Fraud Modeling Language (FFML), an FFML abstract syntax tree corresponding to the fraud detection rule;
judging the node type of each node by traversing each node in the FFML abstract syntax tree;
if the node type of a node is SingleCondition, converting the Boolean expression in the data stream according to the left return value of the left expression sub-node of the node, the comparison return value of the comparison operator sub-node and the right return value of the right expression sub-node, to generate target conversion data;
and generating Structured Query Language (SQL) code corresponding to the fraud detection rule according to the target conversion data.
According to a second aspect of the embodiments of the present invention, there is provided an optimization processing apparatus of a financial fraud modeling language, including:
the device comprises a first generation module, a first judgment module, a third generation module and a fourth generation module;
the first generation module is used for generating an FFML abstract syntax tree corresponding to fraud detection rules according to the fraud detection rules written by using a financial fraud modeling language FFML;
the first judging module is used for judging the node type of each node by traversing each node in the FFML abstract syntax tree;
the third generation module is configured to, if the node type of the node is SingleCondition, convert the boolean expression in the data stream according to a left return value of a left expression sub-node of the node, a comparison return value of a comparison operator sub-node, and a right return value of a right expression sub-node, and generate target conversion data;
and the fourth generation module is used for generating an SQL code corresponding to the fraud detection rule according to the target conversion data.
The innovation points of the embodiment of the invention comprise:
1. The invention can generate the FFML abstract syntax tree corresponding to a fraud detection rule written in FFML, then generate the corresponding conversion data according to the node type of each node in the FFML abstract syntax tree, and finally generate the SQL code corresponding to the fraud detection rule from the conversion data. This is one of the innovation points of the embodiment of the invention.
2. The method determines the processing flow for each node according to its node type in the FFML abstract syntax tree, so that rules written in the financial fraud modeling language FFML are converted accurately. This is one of the innovation points of the embodiment of the invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic diagram of one embodiment of the present invention;
FIG. 2 is an overall block diagram of a back end design module of the present invention;
FIG. 3 is a first FFML abstract syntax tree in accordance with the present invention;
FIG. 4 is a schematic view of yet another embodiment of the present invention;
FIG. 5 is a flowchart illustrating the processing of sub-steps 511 in the present invention;
FIG. 6 is a second FFML abstract syntax tree of the present invention;
FIG. 7 is a schematic structural diagram of an optimization processing apparatus of the financial fraud modeling language according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
To solve the problems described in the Background, the invention provides an optimization processing method and device for a financial fraud modeling language, which can quickly convert fraud detection rules written in the financial fraud modeling language FFML into SQL code that a stream platform can recognize, with high processing efficiency and real-time performance.
The following describes a detailed description of the method and apparatus for optimizing financial fraud modeling language according to the present invention.
Referring to fig. 1, fig. 1 is a schematic diagram of an embodiment of the present invention. As shown in fig. 1, the optimization processing method of the financial fraud modeling language includes the following processing steps:
step 101, generating an FFML abstract syntax tree corresponding to fraud detection rules according to the fraud detection rules written by using a financial fraud modeling language FFML.
In this step, the fraud detection rule written by a domain expert in the financial fraud modeling language FFML is first turned into a token stream by a lexical analyzer, and the token stream is turned into a parse tree by syntax analysis. Because the parse tree cannot be used directly as the input of semantic analysis, it is converted into an intermediate syntax representation, namely the FFML abstract syntax tree. The subsequent steps then perform semantic analysis based on the FFML abstract syntax tree and use code generation to produce a programming language that the platform can recognize.
It should be noted that a bridge is needed between syntax analysis and semantic analysis. The parse tree obtained directly from syntax analysis (also called a concrete syntax tree) contains a lot of redundant syntactic structure information and cannot be used directly as the input of semantic analysis, so an abstract syntax tree needs to be constructed during syntax analysis as the intermediate syntax representation connecting the front end and the back end.
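The front-end pipeline described above can be sketched in Python as follows. This is only an illustrative sketch: the token pattern, the `EventParam` and `SingleCondition` dataclasses, and the tiny grammar (one comparison of the form event.variable, operator, number) are assumptions made for the example and are not the FFML grammar of the invention. For brevity the sketch builds the abstract syntax tree directly from the token stream and skips the explicit parse tree.

```python
import re
from dataclasses import dataclass

# Hypothetical token pattern covering one comparison such as "transfer.value >= 500".
TOKEN_RE = re.compile(
    r"\s*(?:(?P<NUM>\d+(?:\.\d+)?)|(?P<ID>[A-Za-z_]\w*)"
    r"|(?P<OP>>=|<=|==|!=|>|<)|(?P<DOT>\.))"
)


@dataclass
class EventParam:
    """AST leaf: a variable of an event (names are illustrative)."""
    event: str
    name: str


@dataclass
class SingleCondition:
    """AST node: left expression, comparison operator, right expression."""
    left: EventParam
    op: str
    right: float


def tokenize(text):
    """Lexical analysis: turn the rule text into a list of (kind, value) tokens."""
    text = text.strip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def parse_condition(text):
    """Syntax analysis for the toy grammar: ID '.' ID OP NUM."""
    tokens = tokenize(text)
    if [kind for kind, _ in tokens] != ["ID", "DOT", "ID", "OP", "NUM"]:
        raise SyntaxError("expected: event.variable <operator> <number>")
    (_, event), _, (_, name), (_, op), (_, num) = tokens
    return SingleCondition(EventParam(event, name), op, float(num))
```

Calling `parse_condition("transfer.value >= 500")` yields a `SingleCondition` node whose children carry only the information needed by semantic analysis, which is the role the abstract syntax tree plays between the front end and the back end.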
Step 103, judging the node type of each node by traversing each node in the FFML abstract syntax tree; if the node type of the node is SingleCondition, go to step 107.
It should be noted that, in a specific implementation, step 101 may be implemented by a front-end design module of the optimization processing method of the financial fraud modeling language, and specifically, the front-end design module is configured to convert a fraud detection rule written by using the financial fraud modeling language FFML into an FFML abstract syntax tree corresponding to the fraud detection rule.
Steps 103 to 109 may be implemented by a back-end design module of the optimization processing method of the financial fraud modeling language.
For the general framework of the back-end design module, reference may be made to fig. 2, which is an overall block diagram of the back-end design module of the present invention. The visitor module is the main module of the back-end design module: it traverses the FFML abstract syntax tree generated by the front-end design module, constructs the code conversion logic during the traversal, and then generates the specific stream processing code by calling the template module. The code conversion process requires the cooperation of the symbol table and the built-in function module, and the generated stream processing code is optimized in a targeted manner by the code optimization module. The part outlined by the dashed line in fig. 2 is the specific composition of the entire conversion back end.
The specific functions of each module in the back-end design module are as follows:
Visitor module: integrates the semantic actions required by code conversion, traverses the FFML abstract syntax tree, and carries out the specific semantic analysis in cooperation with the other modules during the traversal.
Symbol table: stores symbols encountered during semantic analysis together with their attribute information, so that different parts of the visitor can access common information.
Built-in function module: FFML allows a user to call built-in functions such as TOTALDEBIT and BADACCOUNT; the code conversion of these built-in functions is handled uniformly by the built-in function module.
Template module: when the target code is generated, built-in code templates are used to avoid errors and unify the output form; the visitor fills in the corresponding template to generate the final code.
Code optimization module: different stream processing codes lead to different execution efficiency of the operator graph finally generated by the stream processing system; the code optimization module defines several code optimization methods to guide the visitor to generate efficient stream processing code.
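As an illustration of how a symbol table and a template module might cooperate, the following Python sketch fills a built-in code template and records the newly created table. The template text, the `SymbolTable` class, and the function `emit_filter_view` are hypothetical names chosen for this example; the actual templates and symbol table layout of the invention are not reproduced here.

```python
from string import Template

# Hypothetical built-in code template for a filtering view.
FILTER_VIEW = Template(
    "CREATE TEMPORARY VIEW `$view` AS "
    "(SELECT $columns FROM $source WHERE $predicate)"
)


class SymbolTable:
    """Shared state for the visitor: name counters and the current table."""

    def __init__(self):
        self.counters = {}
        self.current_table = None

    def fresh_name(self, prefix):
        """Return prefix_1, prefix_2, ... so generated views never collide."""
        number = self.counters.get(prefix, 0) + 1
        self.counters[prefix] = number
        return f"{prefix}_{number}"


def emit_filter_view(symbols, prefix, columns, source, predicate):
    """Fill the template and remember the new view as the current table."""
    view = symbols.fresh_name(prefix)
    sql = FILTER_VIEW.substitute(
        view=view,
        columns=", ".join(columns),
        source=source,
        predicate=predicate,
    )
    symbols.current_table = view
    return sql
```

Keeping the SQL text in one template and the naming state in one table means the visitor only supplies values, which is one way to obtain the uniform output form that the template module is meant to guarantee.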
The focus of the present invention is on the visitor module.
Methods for translating between languages mainly include the syntax-directed method, the rule-based method and the model-based method. The model-based method is more flexible than the syntax-directed method, more efficient and easier to read than the rule-based method, and more widely used in industry.
The core of the model-based method is to build an intermediate representation model of the grammar and then develop all semantics-related actions around that model. The invention uses the abstract syntax tree as the intermediate representation model and uses a visitor to traverse the abstract syntax tree and complete the concrete semantic conversion actions.
The visitor pattern defines a single visitor that gathers the semantic actions for the different abstract syntax tree nodes in one place, takes an abstract syntax tree node as a parameter, and performs different operations according to the type of the node. Compared with embedding the semantic actions directly into the nodes of a heterogeneous abstract syntax tree, the visitor pattern is more flexible and easier to extend.
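A minimal Python sketch of such a visitor is given below. The node type names `SingleCondition` and `EventParam` come from this description; the generic `Node` class, the `CompareOp` and `Number` node types, and the method names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Generic AST node: a type tag, an optional value, and child nodes."""
    type: str
    value: object = None
    children: list = field(default_factory=list)


class Visitor:
    """Single visitor that holds the semantic action for every node type."""

    def visit(self, node):
        # Dispatch on the node type; unknown types fall back to generic_visit.
        method = getattr(self, "visit_" + node.type, self.generic_visit)
        return method(node)

    def generic_visit(self, node):
        return [self.visit(child) for child in node.children]

    def visit_EventParam(self, node):
        return node.value  # (event, variable) pair

    def visit_CompareOp(self, node):
        return node.value  # operator text

    def visit_Number(self, node):
        return node.value  # numeric literal

    def visit_SingleCondition(self, node):
        # Fixed child order: left expression, comparison operator, right expression.
        lhs, op, rhs = (self.visit(child) for child in node.children)
        return lhs, op, rhs


tree = Node("SingleCondition", children=[
    Node("EventParam", ("transfer", "value")),
    Node("CompareOp", ">="),
    Node("Number", 500.0),
])
result = Visitor().visit(tree)
```

Supporting a new node type only requires adding one `visit_<Type>` method to the visitor; the node classes themselves stay unchanged, which is the extensibility advantage mentioned above.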
The structure of a fraud detection rule written in the financial fraud modeling language FFML is shown in table 1. A rule mainly comprises four parts: rule naming, event sequence, condition definition, and action definition. "Rule naming" assigns an ID to the currently defined rule; "event sequence" defines the events whose detection triggers the subsequent operations; "condition definition" defines the conditions against which the variables in the event are checked when a trigger event is detected; if the conditions are met, the related action defined by the "action definition" is triggered. The main body of a fraud detection rule written in FFML consists of the event sequence and the condition definition.
In the step, the visitor module judges the node type of each node by traversing each node in the FFML abstract syntax tree; if the node type of the node is SingleCondition, go to step 107.
It should be noted that only the processing of nodes of these two node types is described in detail here; this does not mean that only nodes of these two types can be processed. The processing of nodes of other node types is described later.
In a specific implementation, the visitor module judges the node type of each node by traversing each node in the FFML abstract syntax tree; if the node type of the node is a SingleEvent, an event meeting the parameter requirement can be screened from a preset data stream according to the parameter requirement of a child node of the SingleEvent type, and first conversion data is generated.
The parameter requirements may include a time parameter, event sequence parameters, and operation information, where the operation information includes a channel and an operation behavior on the channel.
It should be noted that the SingleEvent type node includes two child nodes.
The method specifically comprises the following steps:
In the first step, the two child nodes of the SingleEvent type node are accessed; the return value of the first child node is saved as a first variable, and the return value of the second child node is saved as a second variable.
Specifically, the first variable may be denoted channel and the second variable may be denoted params.
In the second step, the event type defined by the SingleEvent type node is determined according to the second variable.
Specifically, a SingleEvent type node can define two types of events: simple independent events and complex sequence events. The event type is judged from the type of the return value of the second child node, i.e., the type of the second variable params. If params is a character string, the event defined by the current SingleEvent type node is a simple independent event, and the processing flow corresponding to simple independent events is entered; if params is a list, the event defined by the current SingleEvent type node is a complex sequence event, and the processing flow corresponding to complex sequence events is entered.
In the third step, target events meeting the parameter requirements are screened from the preset data stream by executing the processing flow corresponding to the event type, and first conversion data is generated.
Specifically, according to the parameter requirements carried in the first variable channel or the second variable params, the target events meeting the conditions are selected from the full event list, a new table corresponding to the target events is generated, and the new table is recorded as the first conversion data.
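The three steps above can be sketched in Python as follows. This is a hedged illustration only: the column names `channel` and `operation`, the source table name `events`, and both function names are assumptions, and the processing flow for complex sequence events is deliberately left unimplemented because it is not detailed in this passage.

```python
def classify_event(params):
    """Step 2: decide the event type from the type of the second return value."""
    if isinstance(params, str):
        return "simple"      # simple independent event
    if isinstance(params, list):
        return "sequence"    # complex sequence event
    raise TypeError("params must be a string or a list")


def convert_single_event(channel, params, view, source="events"):
    """Step 3: build the view that holds the target events (first conversion data)."""
    kind = classify_event(params)
    if kind == "sequence":
        # The sequence flow is not described here, so no SQL is invented for it.
        raise NotImplementedError("complex sequence events are handled elsewhere")
    # Hypothetical columns: `channel` and `operation` are assumed names.
    return (
        f"CREATE TEMPORARY VIEW `{view}` AS "
        f"(SELECT * FROM {source} "
        f"WHERE channel = '{channel}' AND operation = '{params}')"
    )
```

For instance, `convert_single_event("ATM", "withdraw", "event_1")` returns a single view definition that filters the full event list down to the events matching the channel and the operation behavior.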
Step 107, converting the Boolean expression in the data stream according to the left return value of the left expression sub-node of the node, the comparison return value of the comparison operator sub-node and the right return value of the right expression sub-node, to generate target conversion data.
The Boolean expression includes comparison expressions, i.e., expressions with a comparison operator, for example a > 1 or b <= 2; the comparison operators include >, <=, != and the like.
It should be noted that the sub-nodes of a SingleCondition type node have a fixed form: left expression, comparison operator, right expression.
In this step, when the node type of the node is SingleCondition, the first sub-node, i.e., the left expression node, is accessed first to obtain its return value lhs. After recursive downward processing, this node falls into one of three classes of sub-nodes: simple event variable (EventParam), query (Query), and historical query (HistStatement).
For a node of the simple event variable (EventParam) class, the event and variable of the node are returned directly; for a node of the query (Query) class, a stream window aggregation conversion method or a stream processing system user-defined function (UDF) conversion method is used; for a node of the historical query (HistStatement) class, the processing mode corresponding to HistStatement nodes is used.
Next, the second sub-node, i.e., the comparison operator node, is accessed to obtain its return value op.
Then, the third sub-node, i.e., the right expression node, is accessed to obtain its return value rhs.
Finally, the comparison expression is converted to code using lhs, op and rhs. The conversion is implemented with join (Join) and conditional selection (Where) in the SQL language: lhs and rhs are first connected through a join, and the condition is then evaluated through the Where syntax.
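The final conversion step can be sketched in Python as below, for the case where rhs is a numeric constant, so that only the Where part is needed for the comparison itself. The function names and the set of accepted operators are assumptions for this sketch; the column names `accountnumber` and `rowtime` follow the SQL listed in the worked example of this description.

```python
# Hypothetical set of SQL comparison operators accepted by the sketch.
ALLOWED_OPS = {">", ">=", "<", "<=", "=", "<>"}


def comparison_view(view, lhs, op, rhs):
    """Filter the table behind lhs with `column op rhs` (constant rhs only)."""
    table, column = lhs
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported comparison operator: {op}")
    return (
        f"CREATE TEMPORARY VIEW `{view}` AS "
        f"(SELECT accountnumber, rowtime FROM {table} "
        f"WHERE `{column}` {op} {float(rhs)})"
    )


def condition_view(view, event_table, comparison):
    """Join the comparison result back to the complete event table."""
    return (
        f"CREATE TEMPORARY VIEW `{view}` AS "
        f"(SELECT * FROM {event_table}, {comparison} "
        f"WHERE {event_table}.accountnumber = {comparison}.accountnumber "
        f"AND {event_table}.rowtime >= {comparison}.rowtime)"
    )
```

With `lhs = ("procedure_2", "totaldebit")`, `op = "<="` and `rhs = 500`, the two functions produce view definitions of the same shape as the comparison and condition views used in the example for fig. 3.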
For example, the FFML abstract syntax tree in fig. 3 is taken as an example for explanation:
As shown in fig. 3, the first SingleCondition node corresponds to the FFML code QUERY TOTALDEBIT(ATM, 2) <= 500.
In the first step, the first sub-node, i.e., the left expression node, is accessed. It is a query node, so either built-in function optimization or the stream window aggregation conversion method is used. If the stream window aggregation conversion method is used, the specific flow is as follows:
(a) The TOTALDEBIT function queries the total transaction amount over the most recent n days; here it queries the total transaction amount through the ATM channel over the most recent 2 days. The transaction amounts are first aggregated with a two-day window, namely:
CREATE TEMPORARY VIEW `procedure_1` AS (
  SELECT accountnumber, SUM(`value`) AS totaldebit, TUMBLE_END(rowtime, INTERVAL '2' DAY) AS rowtime
  FROM event_8
  GROUP BY accountnumber, TUMBLE(rowtime, INTERVAL '2' DAY))
A new table procedure_1 is obtained.
(b) Since TOTALDEBIT needs only the data of the most recent n days, the latest entry in the table is taken using the Top-N syntax, i.e.:
CREATE TEMPORARY VIEW `procedure_2` AS (
  SELECT accountnumber, totaldebit, rowtime
  FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY accountnumber ORDER BY rowtime DESC) AS rownum FROM procedure_1)
  WHERE rownum <= 1)
(c) The left operand lhs is returned as (procedure_2, totaldebit).
In the second step, the second sub-node, i.e., the comparison operator node, is accessed to obtain op, which is <=.
In the third step, the third sub-node, i.e., the right expression node, is accessed to obtain rhs, which is 500.
In the fourth step, the comparison expression is converted using the WHERE syntax, i.e.:
CREATE TEMPORARY VIEW `comparison_1` AS (
  SELECT accountnumber, rowtime FROM procedure_2 WHERE `totaldebit` <= 500.0)
In the fifth step, all information is selected from the complete event table, namely:
CREATE TEMPORARY VIEW `condition_1` AS (
  SELECT * FROM event_7, comparison_1
  WHERE event_7.accountnumber = comparison_1.accountnumber AND event_7.rowtime >= comparison_1.rowtime)
As shown in fig. 3, the second SingleCondition node corresponds to the FFML code transfer.value >= 500.
In the first step, the left expression node is accessed. It is a simple variable node, so the event and variable are returned directly, namely ("transfer", "value").
In the second step, the comparison operator node is accessed to obtain op, which is >=.
In the third step, the right expression node is accessed to obtain rhs, which is 500.
In the fourth step, the events meeting the condition are selected directly through the SELECT syntax, namely:
CREATE TEMPORARY VIEW `comparison_2` AS (
  SELECT * FROM transfer WHERE `value` >= 500.0)
In the fifth step, the current table recorded in the symbol table, namely condition_1, is read, and comparison_2 is intersected with condition_1 to obtain condition_2, namely:
CREATE TEMPORARY VIEW `condition_2` AS (
  SELECT * FROM comparison_2 WHERE id IN (SELECT id FROM condition_1))
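The intersection in the fifth step can be tried out with the following Python sketch, which uses the standard-library `sqlite3` module only as a convenient stand-in for the stream platform. The sample rows are invented for the demonstration, `condition_1` is simplified to a plain table of event ids, and the outer parentheses around the SELECT are dropped because SQLite does not accept them in a view definition.

```python
import sqlite3


def intersect_demo():
    """Build comparison_2 and condition_2 on toy data and return the matching ids."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE transfer (id INTEGER PRIMARY KEY, accountnumber TEXT, value REAL);
        CREATE TABLE condition_1 (id INTEGER);
        INSERT INTO transfer VALUES (1, 'A', 600.0), (2, 'B', 100.0), (3, 'C', 900.0);
        INSERT INTO condition_1 VALUES (1), (2);
        CREATE TEMPORARY VIEW `comparison_2` AS
            SELECT * FROM transfer WHERE `value` >= 500.0;
        CREATE TEMPORARY VIEW `condition_2` AS
            SELECT * FROM comparison_2 WHERE id IN (SELECT id FROM condition_1);
        """
    )
    rows = conn.execute("SELECT id FROM condition_2 ORDER BY id").fetchall()
    conn.close()
    return rows
```

In this toy data set, event 2 fails the amount comparison and event 3 is absent from `condition_1`, so only event 1 satisfies both conditions and appears in `condition_2`.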
As shown in fig. 3, the third SingleCondition node is a historical data query node corresponding to HISTORY(4)[QUERY TOTALDEBIT(ONL) >= 100] > 1; for its specific flow, refer to the description of HistStatement type nodes.
And step 109, generating an SQL code corresponding to the fraud detection rule according to the target conversion data.
In this step, by applying the processing mode corresponding to each type of node, the fraud detection rule written in the financial fraud modeling language FFML is converted into SQL code that the stream platform can recognize, with high processing efficiency and real-time performance.
In a specific implementation, the SQL code corresponding to the fraud detection rule may be generated by using the first conversion data and the target conversion data.
Therefore, in the optimization processing method of the financial fraud modeling language provided by the invention, the FFML abstract syntax tree corresponding to the fraud detection rule can be generated based on the fraud detection rule compiled by using the financial fraud modeling language FFML, the corresponding conversion data is further generated according to the node type of each node in the FFML abstract syntax tree, and finally the SQL code corresponding to the fraud detection rule is generated according to each conversion data, so that the fraud detection rule compiled by using the financial fraud modeling language FFML can be quickly converted into the SQL programming language which can be identified by the streaming platform, the processing efficiency is high, and the real-time performance is realized.
In one implementation, a HistStatement type node is used to query the historical data for data that satisfies a condition. It has two child nodes: one gives the number of entries to query and the other gives the query condition.
HistStatement type nodes are processed as follows:
In the first step, the first child node of the HistStatement type node is accessed to obtain the number of entries to be queried, denoted d; d is written to the hist_days position in the symbol table for use when the condition node is accessed later.
In the second step, the second child node of the HistStatement type node, i.e., the condition node, is accessed, and its return values t and k are saved, where t is the newly generated table and k is the key corresponding to the query condition.
In the third step, hist_days in the symbol table is restored to 1.
In the fourth step, the entries in t that share the same k are aggregated by counting (COUNT), the count is taken as a new column, and a new table is created and returned.
For example, referring to fig. 3, fig. 3 shows a first FFML abstract syntax tree in the present invention. The FFML abstract syntax tree in fig. 3 is taken as an example:
As shown in fig. 3, in the first step, the first child node of the HistStatement node is accessed, the number of entries to query is obtained as d = 4, and hist_days in the symbol table is set to 4.
In the second step, the second child node of the HistStatement node, namely the condition node, is accessed, and the following three new tables are generated. Their functions are, respectively, aggregation, Top-N selection, and filtering by the comparison expression.
CREATE TEMPORARY VIEW `procedure_3` AS (SELECT accountnumber, SUM(`value`) AS totaldebit, TUMBLE_END(rowtime, INTERVAL '1' DAY) AS rowtime FROM event_9 GROUP BY accountnumber, TUMBLE(rowtime, INTERVAL '1' DAY))
CREATE TEMPORARY VIEW `procedure_4` AS (SELECT accountnumber, totaldebit, rowtime FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY accountnumber ORDER BY rowtime DESC) AS rownum FROM procedure_3) WHERE rownum <= 4)
CREATE TEMPORARY VIEW `comparison_3` AS (SELECT accountnumber, rowtime FROM procedure_4 WHERE `totaldebit` >= 100.0)
In the third step, hist_days in the symbol table is restored to 1.
In the fourth step, a COUNT aggregation is performed on the data in the comparison_3 table, and a new table count_1 is generated as follows.
CREATE TEMPORARY VIEW `count_1` AS (SELECT accountnumber, COUNT(*) AS daycount, MAX(rowtime) AS rowtime FROM comparison_3 GROUP BY accountnumber)
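The four steps for a HistStatement node can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name visit_hist_statement, the dictionary used as the symbol table, and the visit_condition callback are hypothetical. Only the hist_days entry, the return values t and k, and the shape of the COUNT view come from the description above.

```python
from typing import Callable, Dict, List, Tuple


def visit_hist_statement(
    d: int,
    visit_condition: Callable[[Dict[str, object]], Tuple[str, str]],
    symbols: Dict[str, object],
    out: List[str],
    view_name: str = "count_1",
) -> str:
    """Emit the COUNT view for a HistStatement node and return its name."""
    # Step 1: publish the number of entries so the condition node can read it.
    symbols["hist_days"] = d
    # Step 2: the condition node returns the table t it generated and the key k.
    t, k = visit_condition(symbols)
    # Step 3: restore the default value.
    symbols["hist_days"] = 1
    # Step 4: count the entries of t that share the same key k.
    out.append(
        f"CREATE TEMPORARY VIEW `{view_name}` AS ("
        f"SELECT {k}, COUNT(*) AS daycount, MAX(rowtime) AS rowtime "
        f"FROM {t} GROUP BY {k})"
    )
    return view_name


def demo_condition(symbols: Dict[str, object]) -> Tuple[str, str]:
    # Hypothetical stand-in for the condition visitor of the fig. 3 example.
    assert symbols["hist_days"] == 4
    return "comparison_3", "accountnumber"


statements: List[str] = []
symbol_table: Dict[str, object] = {"hist_days": 1}
result = visit_hist_statement(4, demo_condition, symbol_table, statements)
print(statements[0])
```

With the inputs of the fig. 3 example, the sketch prints the same count_1 statement as listed above.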
In a specific implementation, the invention further provides another optimization processing method of the financial fraud modeling language.
Referring to fig. 4, fig. 4 is a schematic diagram of another embodiment of the present invention. As shown in fig. 4, the optimization processing method of the financial fraud modeling language includes the following processing steps:
Step 501: generate an FFML abstract syntax tree corresponding to a fraud detection rule according to the fraud detection rule written in the financial fraud modeling language FFML.
For a detailed description of this step, refer to step 101 in the optimization processing method of the financial fraud modeling language shown in fig. 1.
Step 503: determine the node type of each node by traversing the nodes in the FFML abstract syntax tree. If the node type is SingleEvent, execute step 505; if the node type is SingleCondition, execute step 513; if the node type is EventStatement, execute step 515; if the node type is ConditionStatement, execute step 517.
It should be noted that only the processing of these four node types is described in detail here; this does not mean that only nodes of these four types can be processed.
Step 505: access the two child nodes of the SingleEvent node, save the return value of the first child node as a first variable and the return value of the second child node as a second variable, and execute step 507.
Step 507: judge whether the second variable is a character string or a list. If it is a character string, the event type is determined to be a simple independent event and step 509 is executed; if it is a list, the event type is determined to be a complex sequence event and step 511 is executed.
Step 509: when the event type is a simple independent event, execute a first processing flow corresponding to the simple independent event to screen target events meeting the parameter requirement from a preset data stream and generate first conversion data.
In this step, if the second variable params is a character string, the event defined by the current SingleEvent node is a simple independent event. The first processing flow corresponding to the simple independent event is then entered, and target events meeting the parameter requirement are screened from the preset data stream to generate the first conversion data.
The first processing flow directly returns the event and the variable. A simple independent event only defines a certain operation behavior a of an account on a certain channel c, so the SELECT syntax can be used directly to select all a operations that the account performs through channel c.
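Steps 507 and 509 can be illustrated with a short Python sketch. The function names classify_event and simple_event_sql are hypothetical, and the column name channel is assumed from the example statements in this description; a real generator would also need to validate identifiers before placing them in SQL text.

```python
from typing import List, Union

Params = Union[str, List[object]]


def classify_event(params: Params) -> str:
    """Step 507: a string means a simple event, a list means a sequence event."""
    if isinstance(params, str):
        return "simple"
    if isinstance(params, list):
        return "sequence"
    raise TypeError(f"unsupported params type: {type(params).__name__}")


def simple_event_sql(view: str, channel: str, event: str) -> str:
    """Step 509: select every `event` operation performed through `channel`."""
    # No escaping is done here; inputs are assumed to be trusted identifiers.
    return (
        f"CREATE TEMPORARY VIEW `{view}` AS "
        f"(SELECT * FROM `{event}` WHERE channel = '{channel}')"
    )


kind = classify_event("transfer")
statement = simple_event_sql("event_6", "ATM", "transfer")
print(kind)
print(statement)
```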
Step 511: when the event type is a complex sequence event, execute a second processing flow corresponding to the complex sequence event to screen target events meeting the parameter requirement from the preset data stream and generate first conversion data, and execute step 521.
In this step, if the second variable params is a list, the event defined by the current SingleEvent node is a complex sequence event, and the second processing flow corresponding to the complex sequence event is entered.
It should be noted that a complex sequence event consists of two parts: a sequence time and a sequence event group. The sequence time defines the maximum time span within which the sequence of events may occur, and the sequence event group defines the order in which the events occur.
The second processing flow comprises the following. First, the time span parameter time and the event sequence parameter events are obtained from the params list. Then, the tables corresponding to the events in events are merged with the UNION ALL syntax, keeping only the common values required for event judgment, namely the event ID, the account ID, the event type and the event time; the merged table is all_events. Next, the complex event processing (CEP) MATCH syntax is applied to the all_events table to generate a new table m of events that satisfy the sequence time and the sequence event group. Finally, because the new table m stores only the basic information of the hit events, the complete information of the hit events is selected from the corresponding event tables with the SELECT syntax, and the target event table n is created and returned.
A complex sequence event describes a compound event. For example, ONL SEQ(10)[password_change, transfer] indicates that an account performs a password change and then a transfer on the ONL channel within 10 seconds/minutes.
Optionally, referring to fig. 5, fig. 5 is a flowchart of the sub-steps of step 511 in the present invention. As shown in fig. 5, step 511 specifically includes the following sub-steps:
Sub-step 61: obtain the time span parameter time and the event sequence parameter events from the second variable params.
Sub-step 62: merge the tables corresponding to the events in the event sequence parameter events to generate a merged table all_events, where the merged table all_events contains the basic information of the events.
Sub-step 63: select, from the events in the merged table all_events, the target events that meet the requirement of the time span parameter time, and generate a target event table.
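Sub-steps 61 and 62 can be sketched in Python as below. The helper names split_params and merge_events_sql are hypothetical, and the column list is an assumption based on the common values named in the description (event ID, account ID, event time, event type). Sub-step 63 is not sketched, because the description does not spell out the MATCH statement in text form.

```python
from typing import List, Tuple

# Common columns needed for event judgment (event ID, account ID, time, type).
COMMON_COLUMNS = "id, accountnumber, rowtime, eventtype"


def split_params(params: List[object]) -> Tuple[int, List[str]]:
    """Sub-step 61: the first item is the time span, the rest are event names."""
    time, *events = params
    return int(time), [str(e) for e in events]


def merge_events_sql(view: str, event_tables: List[str]) -> str:
    """Sub-step 62: UNION ALL the common columns of every event table."""
    selects = [f"(SELECT {COMMON_COLUMNS} FROM `{t}`)" for t in event_tables]
    return f"CREATE TEMPORARY VIEW `{view}` AS ({' UNION ALL '.join(selects)})"


time_span, events = split_params([5, "password_change", "transfer"])
merged = merge_events_sql("event_3", ["event_1", "event_2"])
print(time_span, events)
print(merged)
```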
For example, referring to fig. 3, fig. 3 shows a first FFML abstract syntax tree in the present invention. The FFML abstract syntax tree in fig. 3 is taken as an example:
In the first step, as shown in fig. 3, the first SingleEvent node corresponds to ONL SEQ(5)[password_change, transfer] in the FFML rule. Its two child nodes are accessed first to obtain the variables channel and params, which are "ONL" and the list [5, "password_change", "transfer"], respectively. Since params is a list, the event is a complex sequence event, and complex sequence processing is performed.
(a) The time span parameter time and the event sequence parameter events are obtained from params as 5 and ["password_change", "transfer"], respectively.
(b) The events are merged using the UNION ALL syntax, producing the following three new tables:
CREATE TEMPORARY VIEW `event_1` AS (SELECT * FROM `password_change` WHERE channel = 'ONL')
CREATE TEMPORARY VIEW `event_2` AS (SELECT * FROM `transfer` WHERE channel = 'ONL')
CREATE TEMPORARY VIEW `event_3` AS ((SELECT id, accountnumber, rowtime, eventtype FROM `event_1`) UNION ALL (SELECT id, accountnumber, rowtime, eventtype FROM `event_2`))
event_1 selects the password_change events of the ONL channel, event_2 selects the transfer events of the ONL channel, and event_3 merges the event-related meta information common to the two tables into one table.
(c) Complex event processing is performed with the stream processing MATCH syntax, giving the following code:
[Code listing available only as images in the original publication: Figure RE-GDA0003324934760000141 and Figure RE-GDA0003324934760000151.]
(d) Since the table event_4 stores only the basic information of the hit events, all the information of the hit events is selected from the corresponding event tables with the SELECT syntax, and the target event table event_5 is created.
(e) The target event table event_5 is returned.
In the second step, as shown in fig. 3, the second SingleEvent node corresponds to ATM[transfer] in the FFML rule. Its two child nodes are accessed first to obtain the variable channel = "ATM" and the variable params, which is a character string. The event is therefore a simple independent event, and simple independent event processing is performed.
(a) The simple independent event directly uses the SELECT syntax to select the events of the channel, namely: CREATE TEMPORARY VIEW `event_6` AS (SELECT * FROM transfer WHERE channel = 'ATM')
(b) The target event table event_6 is obtained and returned.
Step 513: convert the Boolean expression in the data stream according to the left return value of the left expression child node of the node, the comparison return value of the comparison operator child node and the right return value of the right expression child node to generate target conversion data, and execute step 521.
For a detailed description of this step, refer to step 107 in the optimization processing method of the financial fraud modeling language shown in fig. 1.
Step 515: traverse the child nodes of the node and execute the processing flow of each child node to obtain the SQL table name of each child node, store the names in the events list, and execute step 516.
It should be noted that a node of the EventStatement type supports defining multiple events joined by OR, and each child node of an EventStatement node is of the SingleEvent type, that is, a simple independent event or a sequence event.
In this step, when the node type of the node is EventStatement, the child nodes of the EventStatement node are traversed first, the SingleEvent processing flow is executed for each SingleEvent child node, and the SQL table obtained for each SingleEvent child node is stored in the events list.
Step 516: merge the contents of all SQL tables in the events list to generate third conversion data, and execute step 521.
Specifically, the contents of all SQL tables in the events list may be merged with the UNION ALL operator.
In this step, since a node of the EventStatement type supports only OR events, the contents of all SQL tables in the events list can be merged directly: all contents of each table are selected with SELECT and combined with the UNION ALL operator, and a new stream processing table is generated and written to event_table in the symbol table, where it is needed by the later processing of the condition definition nodes. The new stream processing table serves as the third conversion data.
For example, referring to fig. 6, fig. 6 shows a second FFML abstract syntax tree in the present invention. The FFML abstract syntax tree in fig. 6 is taken as an example:
As shown in fig. 6, in the first step, the child nodes of the EventStatement node, namely the two SingleEvent child nodes, are traversed. Each SingleEvent child node is accessed by calling the processing flow for SingleEvent nodes, and the return values obtained are event_5 and event_6, respectively.
In the second step, the two event tables event_5 and event_6 are merged with UNION ALL, namely:
CREATE TEMPORARY VIEW `event_7` AS ((SELECT * FROM event_5) UNION ALL (SELECT * FROM event_6))
In the third step, event_table in the symbol table is set to event_7.
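Steps 515 and 516 can be summarized in a small Python sketch. The function name visit_event_statement and the dictionary-based symbol table are hypothetical; the sketch assumes the SingleEvent children have already been visited and have returned their table names.

```python
from typing import Dict, List


def visit_event_statement(
    child_tables: List[str],
    symbols: Dict[str, object],
    out: List[str],
    view: str,
) -> str:
    """Merge the tables returned by the SingleEvent children (steps 515-516)."""
    selects = [f"(SELECT * FROM {t})" for t in child_tables]
    out.append(f"CREATE TEMPORARY VIEW `{view}` AS ({' UNION ALL '.join(selects)})")
    # Later condition nodes read the merged table from the symbol table.
    symbols["event_table"] = view
    return view


event_sql: List[str] = []
event_symbols: Dict[str, object] = {}
merged_view = visit_event_statement(["event_5", "event_6"], event_symbols, event_sql, "event_7")
print(event_sql[0])
```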
Step 517: access each child node of the ConditionStatement node in turn, and judge whether the logical operation following each child node is an AND operation or an OR operation. If the logical operation is an AND operation, update the current table in the symbol table to the stack top element; if the logical operation is an OR operation, update the current table in the symbol table to the value of event_table in the symbol table. Then execute step 519.
A node of the ConditionStatement type comprises a plurality of SingleCondition nodes connected by the logical symbols AND and OR.
In this step, when the node type of the node is ConditionStatement, the child nodes of the ConditionStatement node are accessed in turn until all child nodes have been accessed. Each child node is processed as follows:
First, take the return value of the child node as the stack top element and judge the logical operation following the child node. If the logical operation is an AND operation, perform the second step; if it is an OR operation, perform the third step.
Second, update the current table in the symbol table to the stack top element, and pop the stack top element.
Third, update the current table in the symbol table to the value of event_table in the symbol table.
For example, the FFML abstract syntax tree in fig. 3 is taken as an example:
As shown in fig. 3, in the first step, the first child node of the ConditionStatement node is accessed, the return value condition_1 is obtained by calling the access function of SingleCondition and pushed onto the stack, and the logical operation connecting it to the second child node is found to be an AND operation.
In the second step, the current table in the symbol table is updated to condition_1, and the stack top element is popped.
In the third step, the second child node is accessed, its return value condition_2 is obtained and pushed onto the stack, and the logical operation connecting it to the third child node is found to be an OR operation.
In the fourth step, the current table in the symbol table is updated to the value of event_table in the symbol table, namely event_7.
In the fifth step, the third child node is accessed, and its return value condition_3 is obtained and pushed onto the stack.
In the sixth step, the two tables remaining in the stack are merged, namely:
CREATE TEMPORARY VIEW `condition_4` AS ((SELECT * FROM condition_2) UNION ALL (SELECT * FROM condition_3))
Step 519: after all child nodes of the ConditionStatement node have been accessed, merge all tables in the stack to generate fourth conversion data.
In this step, after all child nodes of the ConditionStatement node have been accessed, all tables in the stack are merged with UNION ALL to obtain a new table, and the new table is written to condition_table in the symbol table.
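The stack-based handling of AND and OR in steps 517 and 519 can be sketched in Python as follows. The names visit_condition_statement, current_table and make_child are hypothetical, and starting the current table at event_table is an assumption that the description does not state explicitly. Each child is modeled as a function that receives the current table and returns the name of the table it created.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each child is (visit function, logical operation that follows it or None).
Child = Tuple[Callable[[str], str], Optional[str]]


def visit_condition_statement(
    children: List[Child], symbols: Dict[str, str], out: List[str], view: str
) -> str:
    stack: List[str] = []
    symbols["current_table"] = symbols["event_table"]  # assumed starting point
    for visit, next_op in children:
        # The child filters the current table and returns the table it created.
        stack.append(visit(symbols["current_table"]))
        if next_op == "AND":
            # The next condition narrows this result, so it leaves the stack.
            symbols["current_table"] = stack.pop()
        elif next_op == "OR":
            # A new alternative starts again from the full event table.
            symbols["current_table"] = symbols["event_table"]
    selects = [f"(SELECT * FROM {t})" for t in stack]
    out.append(f"CREATE TEMPORARY VIEW `{view}` AS ({' UNION ALL '.join(selects)})")
    symbols["condition_table"] = view
    return view


seen: List[str] = []


def make_child(name: str) -> Callable[[str], str]:
    def visit(current: str) -> str:
        seen.append(current)  # record which table this condition filtered
        return name
    return visit


cond_symbols: Dict[str, str] = {"event_table": "event_7"}
cond_sql: List[str] = []
cond_children: List[Child] = [
    (make_child("condition_1"), "AND"),
    (make_child("condition_2"), "OR"),
    (make_child("condition_3"), None),
]
visit_condition_statement(cond_children, cond_symbols, cond_sql, "condition_4")
print(seen)
print(cond_sql[0])
```

In this run the second condition filters condition_1 (the AND case) while the third filters event_7 again (the OR case), so the merged view corresponds to (condition 1 AND condition 2) OR condition 3.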
Step 521: generate the SQL code corresponding to the fraud detection rule according to the first conversion data, the target conversion data, the third conversion data and the fourth conversion data.
In this step, each node type is handled by its own processing mode, so that a fraud detection rule written in the financial fraud modeling language FFML is converted into SQL that the stream processing platform can recognize. The conversion is efficient and supports real-time processing.
Thus, in the optimization processing method of the financial fraud modeling language provided by the invention, an FFML abstract syntax tree is generated from a fraud detection rule written in FFML, conversion data is generated according to the node type of each node in the tree, and the SQL code corresponding to the rule is finally generated from the conversion data. A fraud detection rule written in FFML can therefore be converted quickly into SQL that the stream processing platform can recognize, with high processing efficiency and real-time performance.
In one implementation, fraud detection rules written in FFML can be quickly converted into Flink-based SQL that the platform can recognize. The invention can further optimize the performance of the generated SQL code according to the characteristics of the Flink stream processing system, specifically in the following four aspects:
First, UNION ALL optimization.
The UNION ALL operation of a stream processing system differs in nature from the merging of database tables and requires special handling. Inside a stream processing system, UNION ALL simply merges two data streams and feeds them into the next operator. Operators in a stream processing system are time driven; for example, a window operation is triggered only when a watermark later than the window end time reaches the operator. Merging data streams therefore requires particular attention to how the time watermarks are merged and propagated. For an operator with several input streams, the Flink stream processing system takes the minimum of the input stream times as the operator time. Consequently, if one input stream receives no data, that is, no new watermark arrives on it, the time of the merging operator does not advance however far the other input streams advance, and no new watermark is sent downstream. The time of the stream processing system is then blocked at this operator, and the time-triggered operations of the subsequent operators are never executed.
In the invention, the generated code therefore does not use UNION ALL to merge the data streams. Instead, the subsequent operators are configured separately for each data stream; that is, merging is avoided by duplicating the downstream operators.
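The watermark behavior that motivates this optimization can be illustrated with a toy Python model. The class UnionOperator is hypothetical and deliberately simplified; it only captures the rule stated above that the operator time is the minimum over its input streams, and it is not a model of any Flink API.

```python
from typing import Dict, List


class UnionOperator:
    """Toy model of an operator whose event time is the minimum input watermark."""

    def __init__(self, inputs: List[str]) -> None:
        # Every input starts with no watermark at all.
        self.watermarks: Dict[str, float] = {name: float("-inf") for name in inputs}

    def on_watermark(self, source: str, timestamp: float) -> float:
        """Record a watermark from one input and return the operator time."""
        self.watermarks[source] = max(self.watermarks[source], timestamp)
        return min(self.watermarks.values())


union = UnionOperator(["event_5", "event_6"])
operator_time = float("-inf")
for ts in (10.0, 20.0, 30.0):
    # Only event_5 advances; event_6 never delivers a watermark.
    operator_time = union.on_watermark("event_5", ts)
print(operator_time)
```

In the model, the operator time stays at its initial value until event_6 also delivers a watermark, which is the blocking effect described above.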
For example, in the optimization processing method of the financial fraud modeling language shown in fig. 4, the example given under step 516 is used to illustrate how the processing flow changes after UNION ALL optimization is applied. Specifically, the second step and the third step change.
After UNION ALL optimization is applied, the example becomes:
As shown in fig. 6, in the first step, the child nodes of the EventStatement node, namely the two SingleEvent child nodes, are traversed. Each SingleEvent child node is accessed by calling the processing flow for SingleEvent nodes, and the return values obtained are event_5 and event_6, respectively.
In the second step, event_table in the symbol table is set to the list [event_5, event_6].
For another example, in the optimization processing method of the financial fraud modeling language shown in fig. 4, take the example given under step 517. After UNION ALL optimization is applied, tables that have different names but the same actual content can be merged into one table, and the table event_3 need not be created. This greatly reduces the number of tables and, in turn, the number of operators finally generated.
During conversion, the invention can check each newly created table against a global view information table and, if an identical table already exists, directly return the ID of the existing table. The key of the global view information table is the combination of the name of the template used to create the table and the values that fill the template, so that it expresses the specific meaning of the table accurately and uniquely.
Specifically, after UNION ALL optimization is applied, the sixth step is modified, and the improved procedure is as follows:
The FFML abstract syntax tree in fig. 3 is taken as an example:
As shown in fig. 3, in the first step, the first child node of the ConditionStatement node is accessed, the return value condition_1 is obtained by calling the access function of SingleCondition and pushed onto the stack, and the logical operation connecting it to the second child node is found to be an AND operation.
In the second step, the current table in the symbol table is updated to condition_1, and the stack top element is popped.
In the third step, the second child node is accessed, its return value condition_2 is obtained and pushed onto the stack, and the logical operation connecting it to the third child node is found to be an OR operation.
In the fourth step, the current table in the symbol table is updated to the value of event_table in the symbol table, namely event_7.
In the fifth step, the third child node is accessed and its return value condition_3 is obtained.
In the sixth step, the first to fifth steps are performed for every table in event_table in the symbol table, giving 4 new tables, namely condition_1, condition_2, condition_3 and condition_4.
For another example, take the example given under sub-step 63 in the optimization processing method of the financial fraud modeling language shown in fig. 4. After UNION ALL optimization is applied, the UNION ALL operation in (b) of the first step is removed, that is, the table event_3 is not created, and tables that have different names but the same actual content are merged into one table. This greatly reduces the number of tables and, in turn, the number of operators finally generated.
Second, table deduplication optimization.
Table deduplication optimization merges tables that have the same definition, and it applies during the access of every node.
For example:
CREATE TEMPORARY VIEW `event_4` AS (SELECT * FROM transfer WHERE channel = 'ATM')
CREATE TEMPORARY VIEW `event_5` AS (SELECT * FROM transfer WHERE channel = 'ATM')
Since event_4 and event_5 are identical, once table deduplication optimization is enabled the two tables are merged into one, that is, only event_4 remains.
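One way to realize such deduplication is a registry keyed by the template name and its fill values, in the spirit of the global view information table mentioned earlier. The Python sketch below is a hypothetical illustration: the class ViewRegistry, its numbering scheme for view names and the template string are assumptions, not part of the invention.

```python
from typing import Dict, List, Tuple


class ViewRegistry:
    """Returns an existing view when the same template and values recur."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, Tuple[str, ...]], str] = {}
        self.statements: List[str] = []

    def create(self, prefix: str, template_name: str, template: str, *values: str) -> str:
        key = (template_name, values)
        if key in self._by_key:
            return self._by_key[key]  # identical definition: reuse the view
        view = f"{prefix}_{len(self.statements) + 1}"
        body = template.format(*values)
        self.statements.append(f"CREATE TEMPORARY VIEW `{view}` AS ({body})")
        self._by_key[key] = view
        return view


SIMPLE_EVENT = "SELECT * FROM {0} WHERE channel = '{1}'"  # hypothetical template
registry = ViewRegistry()
first = registry.create("event", "simple_event", SIMPLE_EVENT, "transfer", "ATM")
second = registry.create("event", "simple_event", SIMPLE_EVENT, "transfer", "ATM")
print(first, second, len(registry.statements))
```

Both calls return the same view name and only one CREATE statement is emitted, which mirrors the merging of event_4 and event_5 above.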
Third, built-in function optimization.
Although a built-in function can be implemented directly with the window functions of the stream processing system, the window operator is not necessarily efficient. Its efficiency is affected by many factors, such as the configuration of the stream processing system and the characteristics of the incoming data, and a window operator has to maintain a large amount of state and consumes considerable resources. At the same time, most of the data queried by built-in functions is simple, for example the total amount an account has transferred in the last day; such data is sensitive in practical applications and is already recorded by the original database system. Therefore, when a built-in function is processed, the external database can be queried directly instead of using stream processing: a corresponding process is created with the low-level API of the stream processing system, the external database is queried directly in that process, and the result is returned.
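The idea of answering a built-in function from an external database rather than from window state can be sketched in Python. Everything concrete here is an assumption made for illustration: the in-memory SQLite database stands in for the external database, and the table layout, the column names amount and ts, and the function total_debit with its simplified parameters are hypothetical.

```python
import sqlite3


def total_debit(conn: sqlite3.Connection, account: str, channel: str, since: int) -> float:
    """Look up the transfer total directly instead of keeping window state."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM transfer "
        "WHERE accountnumber = ? AND channel = ? AND ts >= ?",
        (account, channel, since),
    ).fetchone()
    return float(row[0])


# Hypothetical stand-in for the external database that already records transfers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transfer (accountnumber TEXT, channel TEXT, amount REAL, ts INTEGER)"
)
conn.executemany(
    "INSERT INTO transfer VALUES (?, ?, ?, ?)",
    [
        ("A1", "ATM", 200.0, 100),
        ("A1", "ATM", 150.0, 180),
        ("A1", "ONL", 999.0, 190),
        ("A2", "ATM", 50.0, 185),
    ],
)
recent_total = total_debit(conn, "A1", "ATM", 150)
print(recent_total)
```

In a stream job, a lookup of this kind would run inside the process created with the low-level API for each incoming event, so no per-account window state has to be kept by the stream processing system.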
For example, based on the example under sub-step 63 of step 511 shown in fig. 5, the processing flow after built-in function optimization is described below. Compared with that example, the improvement lies in the first step:
In the first step, the first child node, namely the left expression node, is accessed. This node is a query node, and the specific flow is as follows:
(a) A local JOIN is performed using the stream processing built-in function syntax (LATERAL TABLE), namely:
CREATE TEMPORARY VIEW `procedure_2` AS (SELECT S.id, S.rowtime, T.v AS totaldebit FROM event_4 AS S, LATERAL TABLE(TOTALDEBIT(accountnumber, 'ATM', 2, 1)) AS T(v))
(b) The left operand lhs is returned as (procedure_2, totaldebit).
In the second step, the second child node, namely the comparison operator node, is accessed, and op is obtained as <=.
In the third step, the third child node, namely the right expression node, is accessed, and rhs is obtained as 500.
In the fourth step, the comparison expression is converted using the WHERE syntax, namely:
CREATE TEMPORARY VIEW `comparison_1` AS (SELECT accountnumber, rowtime FROM procedure_2 WHERE `totaldebit` <= 500.0)
In the fifth step, all information is selected from the complete event table, namely:
CREATE TEMPORARY VIEW `condition_1` AS (SELECT * FROM event_7, comparison_1 WHERE event_7.accountnumber = comparison_1.accountnumber AND event_7.rowtime >= comparison_1.rowtime)
As shown in fig. 3, the second SingleCondition node corresponds to transfer.value >= 500.
In the first step, the left expression node is accessed. It is a simple variable node, and the event variable is returned directly, namely ("transfer", "value").
In the second step, the comparison operator node is accessed, and op is obtained as >=.
In the third step, the right expression node is accessed, and rhs is obtained as 500.
In the fourth step, the events meeting the condition are selected directly with the SELECT syntax, namely:
CREATE TEMPORARY VIEW `comparison_2` AS (SELECT * FROM transfer WHERE `value` >= 500.0)
In the fifth step, the current table in the symbol table, here condition_1, is read, and comparison_2 is intersected with condition_1, namely:
CREATE TEMPORARY VIEW `condition_2` AS (SELECT * FROM comparison_2 WHERE id IN (SELECT id FROM condition_1))
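The conversion of a simple comparison into a WHERE filter, followed by the intersection with the current table, can be sketched in Python. The function names comparison_sql and intersect_sql and the whitelist of operators are assumptions for illustration; the generated text follows the statements shown for the second SingleCondition node.

```python
ALLOWED_OPS = {"<", "<=", ">", ">=", "=", "<>"}  # assumed operator set


def comparison_sql(view: str, table: str, column: str, op: str, rhs: float) -> str:
    """Turn (lhs, op, rhs) into a filtering view over the lhs table."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported comparison operator: {op}")
    return (
        f"CREATE TEMPORARY VIEW `{view}` AS "
        f"(SELECT * FROM {table} WHERE `{column}` {op} {float(rhs)})"
    )


def intersect_sql(view: str, comparison: str, current: str) -> str:
    """Keep only rows of `comparison` whose id also appears in `current`."""
    return (
        f"CREATE TEMPORARY VIEW `{view}` AS "
        f"(SELECT * FROM {comparison} WHERE id IN (SELECT id FROM {current}))"
    )


filter_stmt = comparison_sql("comparison_2", "transfer", "value", ">=", 500)
intersect_stmt = intersect_sql("condition_2", "comparison_2", "condition_1")
print(filter_stmt)
print(intersect_stmt)
```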
As shown in fig. 3, the third SingleCondition node is a historical data query node corresponding to HISTORY (4) [ QUERY TOTALDEBIT (ONL) > (100) ] > (1); for its specific flow, refer to the description of the HistStatement node above.
Fourth, table update optimization. For a database system, a table update only needs to rewrite the data in the table. In a stream processing system, however, a table cannot be rewritten, because the table is actually a data stream: when a table entry is updated, a new record has to be sent into the stream again with an update flag attached, which is clearly inefficient. If a table is updated very frequently, a large number of stream elements appear in the stream processing system and degrade its performance. Table update optimization therefore rewrites the generated code that causes table updates into code that does not require table updates.
Table update optimization is mainly embodied in the processing flow of HistStatement nodes.
For example, the example given above for the processing of the HistStatement node is used to describe the improved flow after table update optimization. The improvement lies mainly in the fourth step, and the improved processing flow is as follows:
As shown in fig. 3, in the first step, the first child node of the HistStatement node is accessed, the number of entries to query is obtained as d = 4, and hist_days in the symbol table is set to 4.
In the second step, the second child node of the HistStatement node, namely the condition node, is accessed, and the following three new tables are generated. Their functions are, respectively, aggregation, Top-N selection, and filtering by the comparison expression.
CREATE TEMPORARY VIEW `procedure_3` AS (SELECT accountnumber, SUM(`value`) AS totaldebit, TUMBLE_END(rowtime, INTERVAL '1' DAY) AS rowtime FROM event_9 GROUP BY accountnumber, TUMBLE(rowtime, INTERVAL '1' DAY))
CREATE TEMPORARY VIEW `procedure_4` AS (SELECT accountnumber, totaldebit, rowtime FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY accountnumber ORDER BY rowtime DESC) AS rownum FROM procedure_3) WHERE rownum <= 4)
CREATE TEMPORARY VIEW `comparison_3` AS (SELECT accountnumber, rowtime FROM procedure_4 WHERE `totaldebit` >= 100.0)
In the third step, hist_days in the symbol table is restored to 1.
In the fourth step, a tumbling window aggregation is performed on the data in the comparison_3 table with the window size set to 1 second, instead of directly using a global COUNT aggregation, namely:
CREATE TEMPORARY VIEW `count_1` AS (SELECT id, MAX(rowtime) AS rowtime, COUNT(*) AS daycount FROM comparison_3 GROUP BY id, TUMBLE(rowtime, INTERVAL '1' SECOND))
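The benefit of replacing a global COUNT with a tumbling window can be illustrated with a simplified Python model that only counts how many result records each variant sends downstream. The model is an assumption made for illustration: it treats a global aggregation as emitting one updated row per input record and a tumbling window as emitting one final row per key and window, and it ignores all other details of a real stream processing system.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, float]  # (key, event time in seconds)


def global_count_emissions(events: List[Event]) -> int:
    """A global COUNT re-emits the updated row for a key on every input."""
    return len(events)


def tumbling_count_emissions(events: List[Event], size: float = 1.0) -> int:
    """A tumbling window emits one final row per key and window."""
    windows: Dict[Tuple[str, int], int] = defaultdict(int)
    for key, ts in events:
        windows[(key, int(ts // size))] += 1
    return len(windows)


sample: List[Event] = [("A1", 0.1), ("A1", 0.4), ("A1", 0.9), ("A1", 1.2), ("A2", 0.5)]
global_rows = global_count_emissions(sample)
window_rows = tumbling_count_emissions(sample)
print(global_rows, window_rows)
```

For the five sample events the global variant emits five records while the one-second tumbling window emits three, and the gap widens as more events fall into the same window.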
The invention also provides an optimization processing device of the financial fraud modeling language. Referring to fig. 7, fig. 7 is a schematic structural diagram of an optimization processing apparatus of the financial fraud modeling language of the present invention.
As shown in fig. 7, the apparatus 80 includes: a first generation module 801, a first judgment module 802, a third generation module 804 and a fourth generation module 805;
the first generating module 801 is configured to generate an FFML abstract syntax tree corresponding to a fraud detection rule according to the fraud detection rule written by using a financial fraud modeling language FFML;
the first determining module 802 is configured to determine a node type of each node by traversing each node in the FFML abstract syntax tree;
the third generating module 804 is configured to, if the node type of the node is SingleCondition, convert the boolean expression in the data stream according to a left return value of a left expression sub-node of the node, a comparison return value of a comparison operator sub-node, and a right return value of a right expression sub-node, and generate target conversion data;
the fourth generating module 805 is configured to generate an SQL code corresponding to the fraud detection rule according to the target conversion data.
Optionally, the apparatus further comprises: an execution module and a fifth generation module;
the execution module is used for executing the processing flow of the sub-nodes by traversing the sub-nodes of the node if the node type of the node is eventstvent, obtaining the SQL table of each sub-node and storing the SQL table in an events list;
and the fifth generation module is used for merging the contents of all SQL tables in the events list to generate third conversion data.
Optionally, the fifth generating module is specifically configured to merge contents of ALL SQL tables in the events list through a UNION ALL operator.
Optionally, the apparatus further comprises: the device comprises a second judgment module, a first updating module, a second updating module and a merging module;
the second judging module is used for sequentially accessing each child node of the type of ConditionStatement if the node type of the node is the ConditionStatement, and judging whether the logic operation after each child node is AND operation or OR operation;
the first updating module is used for updating the current table in the symbol table to be a stack top element if the logic operation is an AND operation;
the second updating module is configured to update the current table in the symbol table to the value corresponding to event_table in the symbol table if the logical operation is an OR operation;
the merging module is configured to merge all tables in the stack to generate fourth conversion data after all child nodes of the ConditionStatement type node have been accessed; wherein a node of the ConditionStatement type includes a plurality of SingleCondition type nodes connected by the logical symbols AND and OR.
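One possible reading of the ConditionStatement handling is sketched below. The description does not state how results are pushed onto the stack or which operator merges them, so the sketch makes three assumptions of its own: a condition chained by AND replaces the stack top, a condition following OR pushes a new entry, and the final merge uses UNION. The function name, the `cond_N` table names and the use of a `WITH` clause are likewise hypothetical.

```python
from typing import Dict, List, Optional


def visit_condition_statement(predicates: List[str],
                              operators: List[str],
                              symbol_table: Dict[str, str]) -> str:
    """predicates[i] is followed by operators[i] ("AND" or "OR"), except the last."""
    stack: List[str] = []        # names of the tables produced so far
    definitions: List[str] = []  # one named sub-query per SingleCondition
    symbol_table["current_table"] = symbol_table["event_table"]
    previous_operator: Optional[str] = None

    for index, predicate in enumerate(predicates):
        name = f"cond_{index}"
        definitions.append(
            f"{name} AS (SELECT * FROM {symbol_table['current_table']} "
            f"WHERE {predicate})"
        )
        if previous_operator == "AND":
            stack.pop()  # assumption: an AND-chained result replaces the stack top
        stack.append(name)

        # Judge the logical operation that follows this child node.
        previous_operator = operators[index] if index < len(operators) else None
        if previous_operator == "AND":
            # AND: the next condition filters the table on top of the stack.
            symbol_table["current_table"] = stack[-1]
        elif previous_operator == "OR":
            # OR: the next condition starts again from the original event table.
            symbol_table["current_table"] = symbol_table["event_table"]

    # Merge all tables left in the stack (UNION is an assumption).
    merged = "\nUNION\n".join(f"SELECT * FROM {name}" for name in stack)
    return "WITH " + ",\n".join(definitions) + "\n" + merged
```

Under these assumptions, a rule of the form `A AND B OR C` leaves two tables in the stack: one holding the rows that satisfy both A and B, and one holding the rows that satisfy C.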
Optionally, the fourth generating module 805 is specifically configured to generate, according to the target conversion data, the third conversion data, and the fourth conversion data, an SQL code corresponding to the fraud detection rule.
Therefore, the optimization processing apparatus for the financial fraud modeling language provided by the invention generates the FFML abstract syntax tree corresponding to a fraud detection rule written in the financial fraud modeling language FFML, generates the corresponding conversion data according to the node type of each node in the FFML abstract syntax tree, and finally generates the SQL code corresponding to the fraud detection rule from the conversion data. A fraud detection rule written in FFML can thus be quickly converted into SQL that a stream processing platform can recognize, with high processing efficiency and good real-time performance.
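The traversal step that determines the node type of each node could look like the following sketch. The `Node` class, the `collect_node_types` function and the shape of the sample tree are assumptions made for illustration; the actual structure of the FFML abstract syntax tree is not specified here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical AST node carrying only a type name and its children."""
    node_type: str
    children: List["Node"] = field(default_factory=list)


def collect_node_types(root: Node) -> List[str]:
    """Visit every node depth-first (parent before children) and record its type."""
    visited: List[str] = []
    pending: List[Node] = [root]
    while pending:
        node = pending.pop()
        visited.append(node.node_type)
        # Push children in reverse so the leftmost child is visited first.
        pending.extend(reversed(node.children))
    return visited
```

In a full translator, the recorded type would select the handler to run for each node, for example the SingleCondition, EventStatement or ConditionStatement processing described above.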
It is to be noted that the terms "comprises" and "comprising" and any variations thereof in the embodiments and drawings of the present invention are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may optionally include other steps or elements not listed, or inherent to such process, method, system, article, or apparatus.
Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention. Those of ordinary skill in the art will understand that: modules in the devices in the embodiments may be distributed in the devices in the embodiments according to the description of the embodiments, or may be located in one or more devices different from the embodiments with corresponding changes. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An optimization processing method for a financial fraud modeling language is characterized by comprising the following steps:
generating an FFML abstract syntax tree corresponding to fraud detection rules according to the fraud detection rules written by using a financial fraud modeling language FFML;
judging the node type of each node by traversing each node in the FFML abstract syntax tree;
if the node type of the node is SingleCondition, converting the Boolean expression in the data stream according to the left return value of the left expression sub-node of the node, the comparison return value of the comparison operator sub-node and the right return value of the right expression sub-node to generate target conversion data;
and generating a Structured Query Language (SQL) code corresponding to the fraud detection rule according to the target conversion data.
2. The method of claim 1, further comprising:
if the node type of the node is EventStatement, executing the processing flow of the sub-node by traversing the sub-nodes of the node to obtain an SQL table of each sub-node, and storing the SQL table in an events list;
and merging the contents of all SQL tables in the events list to generate third conversion data.
3. The method of claim 2, wherein the step of merging the contents of all SQL tables in the events list comprises:
the contents of ALL SQL tables in the events list are merged through a UNION ALL operator.
4. The method of claim 1, further comprising:
if the node type of the node is ConditionStatement, sequentially accessing each child node of the ConditionStatement type, and judging whether the logic operation behind each child node is AND operation or OR operation;
if the logical operation is an AND operation, updating the current table in the symbol table to be a stack top element;
if the logical operation is an OR operation, updating the current table in the symbol table to the value corresponding to event_table in the symbol table;
after all child nodes of the node of the ConditionStatement type are completely accessed, merging all tables in a stack to generate fourth conversion data;
wherein a node of the ConditionStatement type includes a plurality of SingleCondition type nodes connected by the logical symbols AND and OR.
5. The method according to any one of claims 1 to 4, wherein the step of generating a Structured Query Language (SQL) code corresponding to the fraud detection rule according to the target translation data includes:
and generating an SQL code corresponding to the fraud detection rule according to the target conversion data, the third conversion data and the fourth conversion data.
6. An optimization processing apparatus for a financial fraud modeling language, the apparatus comprising: a first generation module, a first judging module, a third generation module and a fourth generation module;
the first generation module is used for generating an FFML abstract syntax tree corresponding to fraud detection rules according to the fraud detection rules written by using a financial fraud modeling language FFML;
the first judging module is used for judging the node type of each node by traversing each node in the FFML abstract syntax tree;
the third generation module is configured to, if the node type of the node is SingleCondition, convert the boolean expression in the data stream according to a left return value of a left expression sub-node of the node, a comparison return value of a comparison operator sub-node, and a right return value of a right expression sub-node, and generate target conversion data;
and the fourth generation module is used for generating an SQL code corresponding to the fraud detection rule according to the target conversion data.
7. The apparatus of claim 6, further comprising: an execution module and a fifth generation module;
the execution module is used for, if the node type of the node is EventStatement, traversing the child nodes of the node and executing the processing flow of each child node to obtain the SQL table of each child node, and storing the SQL tables in an events list;
and the fifth generation module is used for merging the contents of all SQL tables in the events list to generate third conversion data.
8. The apparatus of claim 7,
the fifth generating module is specifically configured to merge the contents of ALL SQL tables in the events list through a UNION ALL operator.
9. The apparatus of claim 6,
the apparatus further comprises: a second judging module, a first updating module, a second updating module and a merging module;
the second judging module is used for sequentially accessing each child node of the type of ConditionStatement if the node type of the node is the ConditionStatement, and judging whether the logic operation after each child node is AND operation or OR operation;
the first updating module is used for updating the current table in the symbol table to be a stack top element if the logic operation is an AND operation;
the second updating module is configured to update the current table in the symbol table to the value corresponding to event_table in the symbol table if the logical operation is an OR operation;
the merging module is used for merging all tables in the stack to generate fourth conversion data after all child nodes of the ConditionStatement type node have been accessed; wherein a node of the ConditionStatement type includes a plurality of SingleCondition type nodes connected by the logical symbols AND and OR.
10. The apparatus of claim 6,
the fourth generating module is specifically configured to generate an SQL code corresponding to the fraud detection rule according to the target conversion data, the third conversion data, and the fourth conversion data.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110712728.7A CN113721896A (en) 2021-06-25 2021-06-25 Optimization processing method and device for financial fraud modeling language

Publications (1)

Publication Number Publication Date
CN113721896A true CN113721896A (en) 2021-11-30

Family

ID=78673069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110712728.7A Pending CN113721896A (en) 2021-06-25 2021-06-25 Optimization processing method and device for financial fraud modeling language

Country Status (1)

Country Link
CN (1) CN113721896A (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2003204824A1 (en) * 2002-06-20 2004-01-22 Canon Kabushiki Kaisha Methods for Interactively Defining Transforms and for Generating Queries by Manipulating Existing Query Data
CN101561817A (en) * 2009-06-02 2009-10-21 天津大学 Conversion algorithm from XQuery to SQL query language and method for querying relational data
AU2012201466A1 (en) * 2005-06-27 2012-04-05 Csc Technology Singapore Pte Ltd Code Transformation
JP2012252594A (en) * 2011-06-03 2012-12-20 Fujitsu Ltd Name identification rule generating method, apparatus and program
CN103927473A (en) * 2013-01-16 2014-07-16 广东电网公司信息中心 Method, device and system for detecting source code safety of mobile intelligent terminal
US20150234642A1 (en) * 2013-01-29 2015-08-20 ArtinSoft Corporation User Interfaces of Application Porting Software Platform
CN106293653A (en) * 2015-05-19 2017-01-04 深圳市腾讯计算机系统有限公司 Code process method and device
CN107704382A (en) * 2017-09-07 2018-02-16 北京信息科技大学 Towards Python function call path generating method and system
CN107766107A (en) * 2017-10-31 2018-03-06 四川长虹电器股份有限公司 The analytic method of xml document universal parser based on Xpath language
CN109697201A (en) * 2018-12-27 2019-04-30 清华大学 A kind of method of query processing, system, equipment and computer readable storage medium
CN110597502A (en) * 2019-08-20 2019-12-20 北京东方国信科技股份有限公司 Single-step debugging method for realizing PL/SQL language based on java
CN111324344A (en) * 2020-02-28 2020-06-23 深圳前海微众银行股份有限公司 Code statement generation method, device, equipment and readable storage medium
CN111638883A (en) * 2020-05-14 2020-09-08 四川新网银行股份有限公司 Decision engine implementation method based on decision tree

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
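Zhang Chunsheng, Wang Xiumei: "Design and Implementation of a Real-Time Fraud Detection System", Computer Engineering and Applications, no. 16, pages 224 - 226 *
Chen Yarui, Zhao Xibin, Gu Ming: "Design and Implementation of an XML-Based General Directory Service Retrieval Engine", Application Research of Computers, vol. 5, no. 12, pages 190 - 193 *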

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination