CN115034305A - Method, system and storage medium for identifying fraudulent users in a call network using a human-in-the-loop graph neural network - Google Patents


Info

Publication number
CN115034305A
Authority
CN
China
Prior art keywords
node
network
edge
users
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210652309.3A
Other languages
Chinese (zh)
Inventor
杨洋
柯腾
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202210652309.3A priority Critical patent/CN115034305A/en
Publication of CN115034305A publication Critical patent/CN115034305A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a method, a system and a storage medium for identifying fraudulent users in a call network using a human-in-the-loop graph neural network, belonging to the technical field of communication information security. The method obtains user information and the call relations between users, treats each user as a node in a call network, and establishes an edge between the nodes of two users if a call relation exists between them. The call network is modeled with a graph neural network model that fuses node features and edge features; during modeling, a human-in-the-loop method guides the training process, training is completed using the known node types in the call network, and the trained graph neural network model then yields the unknown node types in the call network. The node types include normal users and fraudulent users. Based on the distribution and behavioral characteristics of fraudsters in the call network, the method combines graph neural network technology with human-in-the-loop learning to accurately predict fraudsters in the call network.

Description

Method, system and storage medium for identifying fraudulent users in a call network using a human-in-the-loop graph neural network
Technical Field
The present invention relates to the field of communication information security technology, and more particularly to a method, system and storage medium for identifying fraudulent users in a call network using a human-in-the-loop graph neural network.
Background
With the development of the telecommunication industry and the popularization of telecommunication equipment, detecting telecommunication fraud users has become a research hotspot for scholars and practitioners in related fields. Tseng [Tseng 2015] represents the call duration and call frequency between users by constructing a network with weighted edges, executes the weighted HITS algorithm [Kleinberg 1999] on the network to learn trust values of telephone numbers, and detects fraudulent calls from these trust values. Yang [Yang 2019] constructs factor relationships within call networks using a factor graph and uses the factor graph to identify fraudulent users.
Graph structures in telephony networks are widely used to help identify fraudulent users. In recent years, deep learning has been applied to graphs very rapidly, and graph neural networks (GNNs) have achieved excellent performance in tasks such as node classification and link prediction. By treating users as nodes of the telephony network, the telecom fraud user detection task can be cast as a node classification problem. The general paradigm of a GNN alternates transformation of node features with aggregation of neighboring node features. Kipf [Kipf 2017] and Li [Li 2016] propose averaging the features of surrounding neighbor nodes to smooth the neighbors' embedding vectors and make them more similar. Veličković [Veličković 2018], Thekumparampil [Thekumparampil 2018], Brody [Brody 2021] and Kim [Kim 2020] introduce an attention mechanism when aggregating neighbor features so as to selectively aggregate valid information. These methods aggregate only node features and do not handle edge features effectively. Gong [Gong 2019] and Jiang [Jiang 2019] each propose fusing edge features while aggregating node features. However, on a large-scale graph such as a call network, the complexity of these two methods is very high, resulting in low efficiency. Meanwhile, in a call-network scenario containing fraudulent users, because fraudsters are dispersed and hidden among ordinary users, a conventional GNN very easily smooths the aggregated features between normal and abnormal users. It is therefore difficult to achieve good performance in this scenario with conventional GNNs.
Telecommunication fraud detection requires interpretability, since predictions must serve as evidence for conviction. However, the conventional learning process of a GNN purely estimates the model parameters that achieve the best performance, ignoring interpretability goals. Human-in-the-loop methods [Li 2017, Zhang 2018] let humans participate in the learning process of a deep-learning model so that the model satisfies human expectations on a specific task, keeping the reasoning process interpretable while performance remains high. The invention combines human-in-the-loop learning with GNNs to make the prediction process interpretable.
Disclosure of Invention
In order to solve the problems of poor model interpretability and low detection precision in prior-art telecommunication fraud detection methods, the invention provides a method, a system and a storage medium for identifying fraudulent users in a call network using a human-in-the-loop graph neural network; based on the distribution and behavioral characteristics of fraudsters in the call network, it combines graph neural network technology with human-in-the-loop learning to accurately predict fraudsters in the call network.
The invention adopts the following technical scheme:
In a first aspect, the present invention provides a method for identifying fraudulent users in a call network using a human-in-the-loop graph neural network, comprising:
acquiring user information and the call relations between users, taking each user as a node in a call network, and establishing an edge between the nodes corresponding to two users if a call relation exists between them; acquiring the initial features of each node and each edge, and constructing the call network G = (V, E, X, S), wherein V denotes the node set, E the edge set, X the set of node initial features, and S the set of edge initial features;
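As a concrete illustration of this construction step, the following sketch builds G = (V, E, X, S) from raw call records. It is an assumption for illustration, not the patent's implementation: the record format and the call-count/total-duration edge features are invented here.

```python
# Hypothetical sketch: build the call network G = (V, E, X, S) from call records.
from collections import defaultdict

def build_call_network(call_records, user_info):
    """call_records: list of (caller, callee, duration) tuples.
    user_info: dict mapping user -> node initial feature list (X)."""
    nodes = set(user_info)                      # V: one node per user
    edges = {}                                  # edge statistics for S
    for caller, callee, duration in call_records:
        nodes.update((caller, callee))
        key = tuple(sorted((caller, callee)))   # one undirected edge per user pair
        stats = edges.setdefault(key, {"calls": 0, "total_duration": 0.0})
        stats["calls"] += 1
        stats["total_duration"] += duration
    V = sorted(nodes)
    E = sorted(edges)
    X = {u: user_info.get(u, []) for u in V}    # node initial features
    S = {e: [edges[e]["calls"], edges[e]["total_duration"]] for e in E}
    return V, E, X, S

records = [("alice", "bob", 60.0), ("bob", "alice", 30.0), ("bob", "carol", 10.0)]
V, E, X, S = build_call_network(records, {"alice": [1.0], "bob": [2.0], "carol": [3.0]})
# the two alice/bob calls collapse into a single edge carrying [2 calls, 90.0 s]
```

In a real deployment the edge features would instead come from the pretrained call-log embeddings described later in the description.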
modeling the call network with a graph neural network model fusing node features and edge features, guiding the model training process by a human-in-the-loop method during modeling, completing training using the known node types in the call network, and obtaining the unknown node types in the call network with the trained graph neural network model; the node types include normal users and fraudulent users.
In a second aspect, the present invention provides a system for identifying fraudulent users in a call network using a human-in-the-loop graph neural network, for implementing the above method.
In a third aspect, the present invention provides a computer-readable storage medium having a program stored thereon which, when executed by a processor, implements the above method for identifying fraudulent users in a call network using a human-in-the-loop graph neural network.
Compared with the prior art, the invention has the following beneficial effects: by combining a graph neural network with human-in-the-loop learning, the method models the behavioral characteristics of fraudulent users; given the node features, edge features and network structure of the users in a call network, it outputs the probability that each user is a fraudster, realizing accurate prediction of fraudsters.
Drawings
FIG. 1 is a schematic diagram of the overall structure of the method for identifying fraudulent users in a call network using a human-in-the-loop graph neural network, according to an exemplary embodiment;
FIG. 2 is an illustration of ablation experimental results, according to an exemplary embodiment;
FIG. 3 is a schematic diagram of the parameter analysis of the human-in-the-loop learning framework, according to an exemplary embodiment.
Detailed Description
The invention is further illustrated with reference to the following figures and examples. The figures are only schematic illustrations of the invention. Some of the blocks shown in the figures are functional entities that do not necessarily correspond to physically or logically separate entities; they may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
In this embodiment, the model used in the method for identifying fraudulent users in a call network with a human-in-the-loop graph neural network is denoted "GTF", a deep-learning framework based on graph neural networks and human-in-the-loop learning. The overall framework consists of four main modules: a local adaptive node aggregation module for aggregating node features, a masked attention edge aggregation module for aggregating edge features, a prediction module, and a subgraph-level human-in-the-loop training module. Specifically, the method analyzes fraudsters and ordinary users in the call network to obtain the characteristics of fraudsters; then, taking each user as a node, it models the users with the designed graph neural network that fuses node and edge features; meanwhile, during model learning, the human-in-the-loop method guides the subgraph learning direction to improve the interpretability of the model; finally, the trained model provides the prediction results for fraudulent users.
Fig. 1 shows the overall structure of the method for identifying fraudulent users in a call network using a human-in-the-loop graph neural network. Fig. 1(a) shows the structure of the call network: each node represents a user, edges are established according to the calls between users, and every node and edge carries features. The ultimate goal of the invention is to predict the identities of unknown users from the identities of known users in the network. Figs. 1(b) and (c) show, respectively, the graph neural network classification model with fused node and edge features (comprising the local adaptive node aggregation module, the masked attention edge aggregation module, and the prediction module) and the subgraph-level human-in-the-loop training module. Specifically, the invention designs a local adaptive attention mechanism in the node aggregation module to effectively aggregate the features of each node's neighbors, and a masked attention mechanism in the edge aggregation module to selectively aggregate the features of each node's adjacent edges; the two resulting features are concatenated before the final prediction. In the subgraph-level human-in-the-loop training module, the local subgraphs of users that are hard to judge during training are labeled by experts and fed back to the classification model for iterative training, guiding the model's learning toward human understanding so that the final model is interpretable.
The four modules are explained below.
(I) Local adaptive node aggregation module.
The invention designs a locally aware node aggregation that adaptively aggregates each specific neighborhood, thereby encoding the local context into the node embedding. In order to adaptively aggregate different neighbors for each node, a unique weight vector is generated for each user's ego network, calculated as follows:

$$w_i^{(l)} = \mathrm{MLP}\Big(\tfrac{1}{|N(i)|}\sum_{j \in N(i)} h_j^{(l-1)}\Big)$$

where $h_j^{(l-1)}$ is the embedding of node $v_j$ at layer $l-1$ in the local adaptive node aggregation module, $N(i)$ is the set of neighbors of node $v_i$ (including $v_i$ itself), $\mathrm{MLP}(\cdot)$ is a multi-layer perceptron, and $w_i^{(l)}$ is the weight vector of node $v_i$ at layer $l$. A weight vector specific to each node is thus obtained from the node's local context. Attention coefficients are then calculated using these per-node weight vectors:

$$e_{i,j}^{(l)} = \mathrm{LeakyReLU}\big(w_i^{(l)\top}\,[\,\omega^{(l)} h_i^{(l-1)} \,\|\, \omega^{(l)} h_j^{(l-1)}\,]\big)$$

where $\omega^{(l)}$ is the layer-$l$ weight parameter matrix, $h_i^{(l-1)}$ and $h_j^{(l-1)}$ are the embeddings of nodes $v_i$ and $v_j$ at layer $l-1$, $\mathrm{LeakyReLU}(\cdot)$ is the activation function, $e_{i,j}^{(l)}$ is the attention coefficient between $v_i$ and $v_j$ at layer $l$, and $\|$ denotes concatenation. To derive the user node-level aggregation weights, the attention coefficients are normalized with the softmax function:

$$\beta_{i,j}^{(l)} = \frac{\exp\big(e_{i,j}^{(l)}\big)}{\sum_{k \in N(i)} \exp\big(e_{i,k}^{(l)}\big)}$$

The neighbor node information is then aggregated with these weights:

$$m_i^{(l)} = \sum_{j \in N(i)} \beta_{i,j}^{(l)}\, h_j^{(l-1)}$$

where $m_i^{(l)}$ is the embedded representation of the neighbor information aggregated at layer $l$. Since target neighbors may be missing from the neighborhood, which leads to over-smoothing, in this embodiment the node's own embedding is concatenated with the aggregation result of its neighborhood to avoid the over-smoothing problem:

$$h_i^{(l)} = h_i^{(l-1)} \,\|\, m_i^{(l)}$$

Finally, $h_i^{(l)}$ is the embedding of node $v_i$ at layer $l$, and the node embedding of the last layer serves as the node-level feature.
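The node-level aggregation described above can be sketched in NumPy as a single layer. This is a minimal illustration, not the patent's exact parameterization: the one-layer "MLP" with a tanh nonlinearity and all dimensions are assumptions.

```python
# Sketch of one layer of local adaptive node aggregation (assumed dimensions).
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def local_adaptive_layer(H, neighbors, W, mlp_W, mlp_b):
    """H: (n, d) node embeddings; neighbors[i]: neighbor ids, including i itself.
    W: (d, d) shared weight matrix; mlp_W/mlp_b: one-layer 'MLP' mapping the mean
    neighborhood embedding to a per-node weight vector of size 2d."""
    n, d = H.shape
    out = np.zeros((n, 2 * d))
    for i in range(n):
        nbrs = neighbors[i]
        # per-node weight vector w_i from the ego network
        w_i = np.tanh(H[nbrs].mean(axis=0) @ mlp_W + mlp_b)            # (2d,)
        # attention coefficients e_ij (GAT-style, but with per-node w_i)
        scores = np.array([
            leaky_relu(w_i @ np.concatenate([H[i] @ W, H[j] @ W]))
            for j in nbrs])
        beta = np.exp(scores - scores.max())
        beta /= beta.sum()                                             # softmax
        m_i = (beta[:, None] * H[nbrs]).sum(axis=0)                    # aggregate
        out[i] = np.concatenate([H[i], m_i])       # concat self embedding || m_i
    return out

H = rng.normal(size=(4, 3))
neighbors = [[0, 1], [0, 1, 2], [1, 2, 3], [2, 3]]
W = rng.normal(size=(3, 3))
mlp_W = rng.normal(size=(3, 6))
mlp_b = np.zeros(6)
H1 = local_adaptive_layer(H, neighbors, W, mlp_W, mlp_b)
assert H1.shape == (4, 6)   # self embedding concatenated with aggregated neighborhood
```

Stacking $L$ such layers (with a dimension-reducing transform between layers) would yield the node-level features used by the prediction module.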
It should be noted that an ego (self-centered) network consists of a single central node and that node's neighbors; its edges include only the edges between the central node and its neighbors and the edges among the neighbors themselves.
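For illustration, an ego network as defined above can be extracted from an adjacency list as follows (a hypothetical helper, not part of the patent):

```python
# Extract a node's ego network: the center, its neighbors, and edges among them.
def ego_network(center, adj):
    nodes = {center} | set(adj[center])
    edges = {tuple(sorted((u, v)))
             for u in nodes for v in adj[u] if v in nodes}
    return nodes, edges

adj = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b"], "d": ["b"]}
nodes, edges = ego_network("a", adj)
# nodes: {a, b, c}; edges: a-b, a-c, b-c (d is two hops from a, so it is excluded)
```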
(II) Masked attention edge aggregation module.
The invention first calculates each user's attention weight over different call behaviors. Then several representative call behaviors with high coefficients are selected and their call information is encoded into the edge embedding. In this way, calls that fraudsters make in order to disguise their behavior as normal can be masked (dropped), and the model avoids aggregating misleading features from these disguised fraudsters, extracting distinguishable features instead.
More specifically, to obtain initialized edge embeddings, the invention first pre-trains on the call log (a time series) corresponding to each edge through a CPC model to obtain the edge embedding $s_j$, and then aggregates the edge information. To avoid the influence of disguised call behavior on feature aggregation, the disguised call behavior must first be identified. The invention proposes that disguised abnormal call behavior can be detected by referring to the overall call pattern in each user's ego network.
In this embodiment, the embedding of the overall call pattern $\bar{s}_i$ is first captured by aggregating the embeddings of the user's edges, so as to capture the interactive information in the call behavior:

$$\bar{s}_i = \frac{1}{|\varepsilon(i)|} \sum_{e_j \in \varepsilon(i)} s_j$$

where $s_j$ is the edge embedding of edge $e_j$ and $\varepsilon(i)$ denotes the set of edges connected to node $v_i$. Based on the node's initial features and the normalized overall-call-pattern embedding $\bar{s}_i$, the importance coefficients of the edge embeddings are calculated as follows:

$$r_{i,j} = \mathrm{LeakyReLU}\big(a^{\top}\,[\,\omega_n x_i \,\|\, \omega_e s_j \,\|\, \bar{s}_i\,]\big)$$

where $a$ is a shared learnable attention weight vector, $\omega_n$ is the weight parameter matrix of the node features, $\omega_e$ is the weight parameter matrix of the edge embeddings, $x_i$ is the initial feature of the node, $\|$ denotes concatenation, $\mathrm{LeakyReLU}(\cdot)$ is the activation function, the superscript $\top$ denotes transposition, and $r_{i,j}$ is the importance coefficient of the edge $e_j$ connected to node $v_i$.

An edge connected to node $v_i$ whose embedding obtains a higher importance coefficient $r_{i,j}$ is an important edge; the lower the coefficient, the less important the edge. The importance coefficients of all edges of each node thus form an importance-coefficient set for that node. The coefficients in this set are arranged in descending order, and for each node $v_i$ the top $k$ edges are selected, so that each node obtains a set $\Omega(i)$ containing its $k$ most important edges. Edges other than these $k$ edges are masked, because they may be disguised calls.

The selected importance coefficients of each node are then normalized as the attention mechanism:

$$\alpha_{i,j} = \frac{\exp(r_{i,j})}{\sum_{e_{k'} \in \Omega(i)} \exp(r_{i,k'})}$$

where $\alpha_{i,j}$ is the attention score of edge $e_j$ connected to node $v_i$, i.e., the user edge-level aggregation weight. Finally, the selected edges are aggregated according to the attention scores $\alpha_{i,j}$:

$$z_i = \sum_{j \in \Omega(i)} \alpha_{i,j}\, \omega_e s_j$$

where $z_i$ is the edge-level feature of node $v_i$.
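A minimal sketch of the masked attention edge aggregation for a single node follows; the dimensions and the exact concatenation order are assumptions consistent with the formulas above.

```python
# Sketch: masked-attention aggregation of one node's incident edge embeddings.
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def masked_edge_aggregation(x_i, edge_embs, a, W_n, W_e, k):
    """x_i: node initial feature; edge_embs: (m, d_e) embeddings of incident edges;
    a: shared attention vector; W_n, W_e: weight matrices; k: number of edges kept
    (the remaining edges are masked as possibly disguised calls)."""
    s_bar = edge_embs.mean(axis=0)                        # overall call pattern
    # importance coefficient r_{i,j} for each incident edge
    r = np.array([leaky_relu(a @ np.concatenate([W_n @ x_i, W_e @ s, s_bar]))
                  for s in edge_embs])
    kept = np.argsort(-r)[:k]                             # top-k edges: Ω(i)
    alpha = np.exp(r[kept] - r[kept].max())
    alpha /= alpha.sum()                                  # softmax over Ω(i)
    z_i = (alpha[:, None] * (edge_embs[kept] @ W_e.T)).sum(axis=0)
    return z_i, kept

rng = np.random.default_rng(1)
d_e, d_h = 4, 5
edge_embs = rng.normal(size=(6, d_e))
x_i = rng.normal(size=3)
W_n = rng.normal(size=(d_h, 3))
W_e = rng.normal(size=(d_h, d_e))
a = rng.normal(size=2 * d_h + d_e)
z_i, kept = masked_edge_aggregation(x_i, edge_embs, a, W_n, W_e, k=3)
assert z_i.shape == (d_h,) and len(kept) == 3
```

The `argsort`-based top-k step is what implements the masking: the three lowest-scoring edges contribute nothing to $z_i$.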
(III) Prediction module.
Given the call network $G = (V, E, X, S)$ as input, where $V$ denotes the node set, $E$ the edge set, $X$ the set of node initial features, and $S$ the set of edge initial features (edge embeddings), the $L$-layer local adaptive node aggregation module and the masked attention edge aggregation module respectively yield the user node-level embeddings $H = (h_1^{(L)}, \ldots, h_m^{(L)})$ and the user edge-level embeddings $Z = (z_1, \ldots, z_m)$, where $m$ is the number of user nodes in the call network. To fully exploit both kinds of information, the two embeddings are concatenated for the final prediction:

$$o_i = h_i^{(L)} \,\|\, z_i$$

where $o_i$ is the final embedding of node $v_i$.

To identify fraudsters, the final embedding is sent to the prediction module for classification. In this embodiment the prediction module consists of a linear transformation layer and a softmax layer:

$$\hat{y}_i = \mathrm{softmax}(\omega_f\, o_i + b_f)$$

where $\hat{y}_i$ is the predicted label distribution of node $v_i$, representing the probability that the user represented by the node is a fraudster; $\omega_f$ and $b_f$ are a learnable matrix and bias, respectively.

During training, the loss function is defined as the cross-entropy loss with regularization:

$$\mathcal{L}_{gnn} = -\sum_{v_i \in \mathcal{V}_{train}} y_i \log \hat{y}_i + \lambda_1 \|\Theta\|_2^2$$

where $y_i$ is the true label of user $v_i$, $\Theta$ is the set of learnable parameters of the model, $\lambda_1$ is the regularization parameter, and $\mathcal{V}_{train}$ is the training set.
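The prediction module and its regularized cross-entropy loss can be sketched as follows; the shapes and the two-class output layout are illustrative assumptions.

```python
# Sketch: concat node/edge embeddings, linear layer + softmax, regularized loss.
import numpy as np

def predict_proba(h, z, W_f, b_f):
    """o_i = h_i || z_i, then linear transformation + softmax."""
    o = np.concatenate([h, z])
    logits = W_f @ o + b_f
    e = np.exp(logits - logits.max())
    return e / e.sum()              # e.g. [P(normal), P(fraud)]

def loss(probs, labels, params, lam1):
    """Cross-entropy over the training set plus an L2 regularization term."""
    ce = -np.mean([np.log(p[y]) for p, y in zip(probs, labels)])
    return ce + lam1 * sum(np.sum(w ** 2) for w in params)

rng = np.random.default_rng(2)
W_f, b_f = rng.normal(size=(2, 8)), np.zeros(2)
probs = [predict_proba(rng.normal(size=5), rng.normal(size=3), W_f, b_f)
         for _ in range(4)]
total = loss(probs, [0, 1, 0, 1], [W_f], lam1=1e-4)
assert all(abs(p.sum() - 1.0) < 1e-9 for p in probs) and total > 0
```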
(IV) Subgraph-level human-in-the-loop training module.
Telecommunication fraud is a criminal activity whose conviction requires evidence, so the fraud-detection task requires interpretable predictions. A natural way to present the detection process is to show the subgraph-aggregation decision for each node: the higher the aggregation weight of a node or edge, the more likely it is selected for aggregation and the more useful information it provides. The aggregation weights of each subgraph are therefore the key indicator of the interpretability of the model. However, although GNNs achieve good performance with large amounts of data, the aggregation weights learned for each ego network can be confusing: many different combinations of aggregation weights may yield satisfactory performance, yet few of them match human intuition. To solve this problem, the invention introduces a subgraph-level human-in-the-loop training module into the training process. Unlike most human-in-the-loop methods, the invention provides the aggregation weights $\beta_{i,j}^{(l)}$ and $\alpha_{i,j}$ of each user's ego network, as predicted by the graph neural network model with fused node and edge features, to a domain expert. To assist the expert's understanding, this embodiment discretizes the aggregation weights so that the expert only sees whether a particular node or edge is aggregated. In this way the expert can, guided by domain knowledge, steer the module's aggregation strategy for predicting user identities, and the trained model aggregates subgraphs and makes predictions with a more interpretable strategy. The details are described next.

During training, some nodes (users) are sampled from the training set $\mathcal{V}_{train}$ for marking. Under the predictions of the graph neural network model with fused node and edge features, a node's sampling probability is proportional to its information entropy, i.e., users whose identities are more uncertain are favored. After sampling, the ego networks of these users are fed back to the expert. Within an ego network, each node (and edge) connected to the central node is marked with a binary symbol indicating whether its aggregation weight is above a predefined threshold. The expert is then asked to flip the binary symbols they consider incorrect. For example, if an abnormal call from a neighbor marked as fraudulent would provide information but the edge was marked "useless" by the model (i.e., the user edge-level aggregation weight $\alpha_{i,j}$ discretizes to 0), the expert can give advice simply by flipping the label of that edge.
The ego networks with the adjusted markers are then sent back to the model of the invention to improve its interpretability, i.e., the modified aggregation weights $\beta_{i,j}^{(l)}$ and $\alpha_{i,j}$ are returned. To make the aggregation pattern more consistent with human intuition, a loss function is defined from the similarity between these markers $M_{sample}$ and the aggregation weights $P_{sample}$ derived from the model:

$$\mathcal{L}_{hil} = \frac{1}{|V_{sample}|} \sum_{i=1}^{|V_{sample}|} \|M_i - P_i\|_2^2$$

where $M_i$ is the marker vector of the $i$-th sampled user's ego network, $P_i$ is the aggregation-weight vector of the $i$-th sampled user's ego network, $|V_{sample}|$ is the number of sampled users, $\|\cdot\|_2$ is the 2-norm, and $\mathcal{L}_{hil}$ is the human-in-the-loop loss.
Finally, combining the graph neural network model with fused node and edge features and the subgraph-level human-in-the-loop training module, the loss function of the whole training process is:

$$\mathcal{L} = \mathcal{L}_{gnn} + \lambda_2 \mathcal{L}_{hil}$$

where $\lambda_2$ balances the classification loss and the human-in-the-loop loss.
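The human-in-the-loop loss and the combined objective can be illustrated numerically as follows; the discretization threshold, the stand-in classification-loss value, and the weighting coefficient are assumptions for illustration.

```python
# Sketch: discretize aggregation weights, apply one expert flip, compute HIL loss.
import numpy as np

def discretize(weights, threshold=0.5):
    """Binary markers shown to the expert: 1 = aggregate, 0 = ignore."""
    return (np.asarray(weights) >= threshold).astype(float)

def hil_loss(M, P):
    """Mean squared 2-norm distance between expert-adjusted markers M_i
    and the model's aggregation weights P_i over sampled ego networks."""
    return np.mean([np.sum((m - p) ** 2) for m, p in zip(M, P)])

P = [np.array([0.9, 0.1, 0.7]), np.array([0.2, 0.8])]   # model's aggregation weights
M = [discretize(p) for p in P]
M[1][0] = 1.0      # expert flips one marker: "this edge IS informative"
l_hil = hil_loss(M, P)
# Combined training loss: classification loss plus the weighted HIL term
# (0.35 stands in for the GNN loss; the weight 1.0 is an assumed coefficient).
l_total = 0.35 + 1.0 * l_hil
assert l_hil > 0 and l_total > l_hil
```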
after the training is finished, only the graph neural network model of the point-edge features needs to be fused to predict whether unknown users in the call network are fraudulent users. In this embodiment, the prediction process is: given a call network, aiming at unknown users in the call network G ═ V, E, X and S, obtaining user node level characteristics by a local adaptive node aggregation module, obtaining user edge level characteristics by a masking attention edge aggregation module, and outputting the probability of whether the users are fraud molecules by a prediction module by combining the user node level characteristics and the edge level characteristics, thereby realizing accurate prediction of the fraud molecules.
In one embodiment of the present invention, the method for identifying fraudulent users in a call network using a human-in-the-loop graph neural network comprises the following steps:
Step 1: acquiring user information and the call relations between users, taking each user as a node in a call network, and establishing an edge between the nodes corresponding to two users if a call relation exists between them; acquiring the initial features of each node and each edge, and constructing the call network G = (V, E, X, S), wherein V denotes the node set, E the edge set, X the set of node initial features, and S the set of edge initial features;
Step 2: modeling the call network with a graph neural network model fusing node features and edge features, guiding the model training process by a human-in-the-loop method during modeling, completing training using the known node types in the call network, and obtaining the unknown node types in the call network with the trained graph neural network model; the node types include normal users and fraudulent users.
Step 2 is the key point of the invention. In step 2, the graph neural network model comprises a local adaptive node aggregation network, a masked attention edge aggregation network, and a prediction network; the local adaptive node aggregation network obtains the user node-level features, the masked attention edge aggregation network obtains the user edge-level features, and the prediction network predicts the node types and their probabilities from these two kinds of features. The local adaptive node aggregation network and the masked attention edge aggregation network are computed simultaneously as two branches of the graph neural network model.
In this step, the local adaptive node aggregation network has an $L$-layer structure, and the calculation at each layer comprises: first computing the attention coefficients between nodes; then computing the user node-level aggregation weights from the attention coefficients and aggregating the neighbor node information. Denoting by $m_i^{(l)}$ the embedded representation of the neighbor information aggregated at layer $l$, the user node-level feature is obtained as:

$$h_i^{(l)} = h_i^{(l-1)} \,\|\, m_i^{(l)}$$

where $h_i^{(l)}$ is the embedding of node $v_i$ at layer $l$, and the node embedding of the last layer, $h_i^{(L)}$, serves as the user node-level feature.
The calculation process of the masked attention edge aggregation network comprises: first computing the importance coefficient $r_{i,j}$ of each edge from the initial features of the nodes and edges; sorting the importance coefficients of all edges in the edge set connected to each node, selecting the most important edges, and normalizing the coefficients to $\alpha_{i,j}$; and then aggregating the edges in the set $\Omega(i)$ according to the normalized coefficients to obtain the edge-level feature of the node:

$$z_i = \sum_{j \in \Omega(i)} \alpha_{i,j}\, \omega_e s_j$$

where $z_i$ is the user edge-level feature, $\Omega(i)$ is the set of the $k$ most important edges of node $v_i$, $\alpha_{i,j}$ is the attention score of edge $e_j$ connected to node $v_i$ (i.e., the user edge-level aggregation weight), $\omega_e$ is the weight parameter matrix of the edge features, and $s_j$ is the embedding of edge $e_j$.
Guiding the model training process by the human-in-the-loop method during modeling comprises: randomly extracting a subset of user nodes from the call network for sampling, where a node's probability of being sampled is proportional to its information entropy; modifying the node-level and edge-level aggregation weights of each layer for the sampled nodes; and feeding the modified aggregation weights back to the local adaptive node aggregation network and the masked attention edge aggregation network, respectively, so that they compute the user node-level and user edge-level features from the modified weights.
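The entropy-proportional sampling step can be sketched as follows (a minimal illustration; the two-class prediction format is an assumption):

```python
# Sketch: sampling probability proportional to each node's prediction entropy.
import numpy as np

def entropy(p):
    """Information entropy of a predicted class distribution."""
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def sampling_probs(pred_probs):
    """Normalize per-node entropies so users with the most uncertain
    identities are the most likely to be sent to the expert."""
    h = np.array([entropy(p) for p in pred_probs])
    return h / h.sum()

preds = [[0.99, 0.01], [0.5, 0.5], [0.8, 0.2]]   # per-node [P(normal), P(fraud)]
probs = sampling_probs(preds)
# the 50/50 node has maximal entropy and therefore the highest sampling weight
assert probs.argmax() == 1 and abs(probs.sum() - 1.0) < 1e-9
```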
In this embodiment, a system for identifying fraudulent users in a call network by using a human-in-loop neural network is also provided, and the system is used for implementing the above embodiments, which have already been described and are not described again. The terms "module," "unit," and the like as used below may implement a combination of software and/or hardware of predetermined functions. Although the system described in the following embodiments is preferably implemented in software, an implementation in hardware, or a combination of software and hardware, is also possible.
The system comprises:
the communication network module is used for acquiring user information and communication relations between users, each user is used as a node in the communication network, and if the communication relations exist between the two users, an edge is established between the nodes corresponding to the two users; acquiring initial characteristics of each node and edge, and constructing a call network;
the graph neural network model module is used for modeling the call network by utilizing a graph neural network model fusing node characteristics and edge characteristics, finishing training by combining known node types in the call network and obtaining unknown node types in the call network by utilizing the trained graph neural network model; the node types comprise normal users and fraudulent users;
and the sub-graph level human-in-loop training module is used for guiding the model training process of the graph neural network model module in the modeling process.
In this embodiment, the graph neural network model module may be further divided into a locally adaptive node aggregation module, a masked attention edge aggregation module, and a prediction module, where the locally adaptive node aggregation module obtains the user node-level features, the masked attention edge aggregation module obtains the user edge-level features, and the prediction module predicts the node type and its probability from the user node-level features and the user edge-level features.
The implementation of each module's functions in the system is described in detail under the corresponding method steps and is not repeated here. Since the system embodiment basically corresponds to the method embodiment, the relevant parts may refer to the partial description of the method embodiment. The system embodiments described above are merely illustrative: modules shown as separate components may or may not be physically separate, and may be located in one place or distributed over a plurality of network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention, and a person of ordinary skill in the art can understand and implement them without inventive effort.

Embodiments of the present invention also provide a computer-readable storage medium on which a program is stored; when executed by a processor, the program implements the above-described method for identifying fraudulent users in a call network using a human-in-the-loop neural network.
The computer-readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any data-processing device described in the foregoing embodiments. The computer-readable storage medium may also be an external storage device of a device with data-processing capability, such as a plug-in hard disk, a Smart Media Card (SMC), an SD card, or a Flash memory card (Flash Card) provided on the device. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of such a device. The computer-readable storage medium is used to store the computer program and the other programs and data required by the device, and may also be used to temporarily store data that has been or will be output.
This embodiment verifies the effectiveness of the invention through a specific experiment.
(1) Data acquisition.
The experimental data set comprises the call records from September 1 to September 30, 2016 provided by China Telecom, a major Chinese mobile service provider; each record contains an anonymized calling number, an anonymized called number, a start time, an end time, and so on. Since a China Telecom subscriber can hold only one telephone number, each telephone number is treated as one user. The experiment also accessed some personal information of all phone-number owners, such as gender, age, and place of birth. Users were then labeled as fraudsters or ordinary users according to the abnormal phone numbers reported by Baidu and Qihoo 360: given a telephone number, the services provided by Baidu and Qihoo 360 can be queried to check whether the number is abnormal. Because these services are built from a large amount of user feedback, the labels have high confidence.
The experiment constructs a directed graph from the records and the personal information. Each node represents a user, and each edge indicates that two users have had at least one call. The initial feature of each node represents the user's personal information processed by feature engineering; correspondingly, the feature of each edge is extracted from all call records between the two users.
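The graph construction just described can be sketched roughly as follows. The record layout (caller, callee, start, end) and the two toy edge statistics are illustrative assumptions; the patent's 37-dimensional call features would be engineered at the marked step:

```python
from collections import defaultdict

def build_call_graph(call_records, user_info):
    """Build a directed call graph from raw records (illustrative sketch).

    call_records: iterable of (caller, callee, start, end) tuples
    user_info:    dict mapping phone number -> personal-information features
    Returns (X, S): initial node features keyed by number, and per-pair edge
    features aggregated over all calls between that ordered pair of users.
    """
    X = dict(user_info)                      # initial node features
    calls = defaultdict(list)                # (caller, callee) -> call durations
    for caller, callee, start, end in call_records:
        calls[(caller, callee)].append(end - start)
    # toy edge features: number of calls and total duration per ordered pair;
    # richer engineered call features would replace these two statistics
    S = {pair: [len(d), sum(d)] for pair, d in calls.items()}
    return X, S

# usage: two a->b calls collapse into one edge carrying aggregate statistics
X, S = build_call_graph([("a", "b", 0, 10), ("a", "b", 5, 7)], {"a": [1.0], "b": [2.0]})
```

Note that every ordered pair with at least one call yields exactly one edge, matching the text: an edge denotes the existence of a calling relation, while its features summarize all calls between the pair.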
The overall statistics of the data set are summarized in Table 1. The features on the graph are based only on the users' personal and call information, so the approach generalizes to a general telecommunication network regardless of the mobile operator.
Table 1 Data set statistics

Number of users: 290,499
Number of call relations: 1,575,701
Number of calls: 9,599,878
User feature dimension: 261
Call feature dimension: 37
Fraudster ratio: 4.9%
(2) Data preprocessing.
To match real-world scenarios, 60% of the ordinary users and of the fraudsters are drawn for training, and the remaining users are used for testing. The experiment also holds out 1/3 of the training-set users as a validation set to avoid overfitting. The proportion of fraudsters to normal users is the same in the training, validation, and test sets. The following evaluation metrics for the imbalanced classification task were used: Precision, Recall, and F1 score. Given the label imbalance, this experiment focuses mainly on the F1 score.
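The split and evaluation protocol above can be sketched as follows. The 60/40 per-class split and the precision/recall/F1 definitions follow the text, while the helper names are illustrative:

```python
import random

def stratified_split(labels, train_frac=0.6, seed=0):
    """Split node ids per class so the fraudster ratio matches across splits."""
    rng = random.Random(seed)
    train, test = [], []
    by_class = {}
    for node, y in labels.items():
        by_class.setdefault(y, []).append(node)
    for members in by_class.values():
        rng.shuffle(members)
        cut = int(len(members) * train_frac)
        train += members[:cut]
        test += members[cut:]
    return train, test

def f1_score(y_true, y_pred, positive=1):
    """F1 for the fraudster (positive) class: harmonic mean of P and R."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

Splitting each class independently is what keeps the fraudster proportion identical in the training, validation, and test sets, which matters when the positive class is under 5% of the data.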
(3) Comparative experiments.
To fully validate the model of the invention, this experiment compared it to several different types of baseline models.
Baseline models to which the present invention model is compared include:
Conventional classifiers include MLP and XGBoost, where MLP_N and XGBoost_N identify fraudsters using personal information as input, and MLP_E and XGBoost_E take each user's uniformly aggregated call information as input.
Basic graph neural networks (GNNs), which uniformly aggregate neighbors' personal information, include three classical models: GCN [Kipf 2017], SGC [Wu 2019], and GIN [Xu 2018].
Attention-based GNNs, which compute attention coefficients for weighted aggregation, include three classical models: GAT [Veličković 2018], GATv2 [Brody 2021], and AGNN [Thekumparampil 2018].
Heterophily-aware GNNs, designed for graphs whose connected nodes may have different labels, include GraphSage [Hamilton 2017], FAGCN [Bo 2021], and H2GCN [Zhu 2020], which attempt to mitigate the over-smoothing of a node's features by neighbors with different labels.
For GNNs with edge features, GAT extended with edge features is selected.
For telecom fraud detection models, the existing method FFD [Yang 2019] is selected.
The test results are shown in table 2.
Table 2 Performance of identifying fraudsters
[Table 2 is rendered as an image in the original publication; it reports the Precision, Recall, and F1 score of each baseline and of the proposed model.]
As shown in Table 2, the model of the present invention outperforms all baseline methods and raises the F1 score by 1.32. Simple personal-feature classifiers, including MLP and XGBoost, perform well, indicating that the engineered personal features are very effective at distinguishing fraudsters from ordinary users. Moreover, although classifying on each user's averaged call features performs worse than on personal features, on this imbalanced data set, where fraudsters are only a small fraction of all users, the F1 score for identifying fraudsters from call information alone still reaches about 40, indicating that call information is an important signal for detecting fraudsters.
The basic GNN models, including GCN, SGC, and GIN, perform poorly because a fraudster's neighbors are mostly ordinary users, so mean aggregation over-smooths the fraudster's personal features.
Attention-based GNNs, including GAT, GATv2, and AGNN, achieve better performance through the attention mechanism (F1 +4.51 on average) because they can selectively aggregate informative neighbors. However, their F1 is still on average 5.91 lower than that of the model of the present invention. The reason is that, because fraudsters disguise themselves among one another, global attention cannot adaptively aggregate the fraudulent neighbors of each fraudster.
GraphSage, FAGCN, and H2GCN process neighbor information rather than aggregating it directly, which mitigates the over-smoothing of node features by neighbors. These approaches outperform the basic GNNs. Among them, H2GCN, which concatenates multi-layer aggregated information instead of aggregating directly, obtains the highest F1 score among all baselines. However, H2GCN does not consider the differing importance of neighbors.
The GNN that aggregates edge features uses them during aggregation but performs worse than plain GAT (F1 drops by 5.3 on average), indicating that naively putting personal and call information together to compute aggregation weights may even lead to learning wrong weights.
As a traditional telecom fraud detection method, FFD is inferior to the GNNs because it relies on manually extracted graph-structure features, so its performance is hard to improve further.
(4) Ablation experiments.
This embodiment performs ablation experiments to verify the effectiveness of each major module of the model. Specifically, each module — the locally adaptive node aggregation module and the masked attention edge aggregation module — is trained separately under the sub-graph-level human-in-the-loop training module to see how each affects performance.
As shown in Fig. 2, the locally adaptive node aggregation module alone shows some performance degradation compared with the full model, demonstrating that the edge features contribute a significant improvement to model performance. Moreover, the locally adaptive node aggregation module alone still achieves the best performance among all the GNN baselines, and the masked attention edge aggregation module alone achieves the best performance among all the baselines in Table 2 that use only edge features. This demonstrates that the node and edge aggregation methods of the invention aggregate more informative signals.
(5) Human-in-the-loop framework.
This embodiment performs a parameter analysis on the sub-graph-level human-in-the-loop training module to verify its effectiveness. Different values of the human-in-the-loop loss weight λ_2 were tested. As shown in Fig. 3, the human-in-the-loop framework indeed improves the performance of the model (F1 +0.34 on average). As λ_2 gradually increases, the model's F1 keeps rising and reaches its maximum at λ_2 = 0.1. When λ_2 exceeds 0.2, performance instead degrades, possibly because the human-in-the-loop loss term becomes large and interferes with the optimizer's minimization of the cross-entropy loss.
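The final training objective analyzed here — the cross-entropy loss plus λ_2 times the human-in-the-loop loss, a mean 2-norm gap between expert-modified and model aggregation weights — could be sketched as follows (a sketch under stated assumptions, not the patented code; the function names are illustrative):

```python
import math

def cross_entropy(p_fraud, y):
    """Binary cross-entropy for one node (y = 1 means fraudster)."""
    eps = 1e-12
    return -(y * math.log(p_fraud + eps) + (1 - y) * math.log(1 - p_fraud + eps))

def hil_loss(expert_weights, model_weights):
    """Mean 2-norm gap between the expert-modified aggregation weights M_i and
    the model's aggregation weights P_i over the sampled users' ego networks."""
    total = 0.0
    for m, p in zip(expert_weights, model_weights):
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(m, p)))
    return total / len(expert_weights)

def final_loss(p_fraud, y, expert_weights, model_weights, lam2=0.1):
    """Weighted sum used for training; lam2 = 0.1 was best in the experiments."""
    ce = sum(cross_entropy(p, t) for p, t in zip(p_fraud, y)) / len(y)
    return ce + lam2 * hil_loss(expert_weights, model_weights)
```

This makes the reported λ_2 trade-off concrete: as lam2 grows, the expert-agreement term dominates the sum and the optimizer spends less of its gradient budget on the cross-entropy classification loss.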
The above-described embodiments express only several implementations of the present application; their description is specific and detailed but shall not be construed as limiting the scope of patent protection. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. A method for identifying fraudulent users in a call network using a human-in-the-loop neural network, comprising:
acquiring user information and a call relation between users, taking each user as a node in a call network, and establishing an edge between nodes corresponding to the two users if the call relation exists between the two users; acquiring initial characteristics of each node and each edge, and constructing a call network G (V, E, X, S), wherein V represents a node set, E represents an edge set, X represents a node initial characteristic set, and S represents an edge initial characteristic set;
modeling the call network by using a graph neural network model fusing node characteristics and edge characteristics, guiding a model training process by a human-in-loop method in the modeling process, completing training by combining known node types in the call network, and obtaining unknown node types in the call network by using the trained graph neural network model; the node types include normal users and fraudulent users.
2. The method as claimed in claim 1, wherein said graph neural network model comprises a locally adaptive node aggregation network for obtaining user node-level features, a masked attention edge aggregation network for obtaining user edge-level features, and a prediction network for predicting node types and their probabilities based on user node-level features and user edge-level features.
3. The method of claim 2, wherein said locally adaptive node aggregation network and said masked attention edge aggregation network are computed as two branches of a graph neural network model, both being computed at the same time.
4. The method as claimed in claim 2, wherein said locally adaptive node aggregation network is an L-layer structure, and the calculation process of each layer thereof comprises:
calculating attention coefficients between nodes:

$$w_i^{(l)} = \mathrm{MLP}\big(h_i^{(l-1)}\big)$$

$$\hat{e}_{i,j}^{(l)} = \mathrm{LeakyReLU}\Big(\big(w_i^{(l)}\big)^{T}\big[\Omega^{(l)} h_i^{(l-1)} \,\|\, \Omega^{(l)} h_j^{(l-1)}\big]\Big),\quad v_j \in N(i)$$

wherein $\hat{e}_{i,j}^{(l)}$ is the attention coefficient between nodes $v_i$ and $v_j$ in the $l$-th layer; $h_i^{(l-1)}$ and $h_j^{(l-1)}$ are the embeddings of nodes $v_i$ and $v_j$ at layer $l-1$ of the locally adaptive node aggregation network; $N(i)$ is the set of node $v_i$'s neighbor nodes including itself; $\mathrm{MLP}(\cdot)$ is a multi-layer perceptron; $w_i^{(l)}$ is node $v_i$'s weight vector in layer $l$; $\Omega^{(l)}$ is the $l$-th layer weight parameter matrix; $\mathrm{LeakyReLU}(\cdot)$ denotes the activation function; and the superscript $T$ denotes transposition;
calculating the user node-level aggregation weights and aggregating neighbor node information:

$$\alpha_{i,j}^{(l)} = \frac{\exp\big(\hat{e}_{i,j}^{(l)}\big)}{\sum_{v_k \in N(i)} \exp\big(\hat{e}_{i,k}^{(l)}\big)}$$

$$m_i^{(l)} = \sum_{v_j \in N(i)} \alpha_{i,j}^{(l)}\, \Omega^{(l)} h_j^{(l-1)}$$

wherein $\alpha_{i,j}^{(l)}$ is the normalized attention coefficient between nodes $v_i$ and $v_j$ in the $l$-th layer, i.e. the node-level aggregation weight of the two users at layer $l$; and $m_i^{(l)}$ is the embedded representation obtained by aggregating neighbor node information at layer $l$;
calculating the user node-level features:

$$h_i^{(l)} = \sigma\big(m_i^{(l)}\big)$$

wherein $h_i^{(l)}$ is node $v_i$'s embedding in the $l$-th layer, $\sigma(\cdot)$ is an activation function, and the last-layer node embedding $h_i^{(L)}$ is taken as the user node-level feature.
5. The method as claimed in claim 2, wherein the computation process of said masked attention edge aggregation network comprises:

calculating the importance coefficient of each edge from the initial features of the nodes and edges:

$$q_i = \omega_n x_i \,\big\|\, \sum_{e_j \in \varepsilon(i)} \omega_e s_j$$

$$\hat{s}_j = \mathrm{LeakyReLU}\big(a^{T}\big[q_i \,\|\, \omega_e s_j\big]\big),\quad e_j \in \varepsilon(i)$$

wherein $\hat{s}_j$ denotes the importance coefficient of edge $e_j$ connected to node $v_i$; $s_j$ denotes the initial feature of edge $e_j$; $\varepsilon(i)$ is the set of edges connected to node $v_i$; $q_i$ is the embedding of the node's initial features and its overall communication pattern; $a$ is a shared learnable attention weight vector; the superscript $T$ denotes transposition; $\omega_n$ is the weight parameter matrix of the node features; $\omega_e$ is the weight parameter matrix of the edge features; $x_i$ is the node's initial feature; $\|$ denotes the concatenation operation; and $\mathrm{LeakyReLU}(\cdot)$ is the activation function;
sorting the importance coefficients of all edges in the edge set connected to each node, selecting the top-$k$ most important edges, and normalizing their importance coefficients:

$$\alpha_{i,j} = \frac{\exp(\hat{s}_j)}{\sum_{e_k \in \Omega(i)} \exp(\hat{s}_k)},\quad e_j \in \Omega(i)$$

wherein $\alpha_{i,j}$ is the attention score of edge $e_j$ connected to node $v_i$, i.e. the user edge-level aggregation weight; and $\Omega(i)$ is the set of node $v_i$'s top-$k$ most important edges;
and aggregating the edges in the set $\Omega(i)$ as the node's edge-level feature according to the normalized importance coefficients:

$$z_i = \sum_{e_j \in \Omega(i)} \alpha_{i,j}\, \omega_e s_j$$

wherein $z_i$ is the user edge-level feature.
6. The method as claimed in claim 2, wherein said prediction network takes the concatenation of the user node-level features and the user edge-level features as input, and obtains the probability that the user represented by a node is a fraudster by passing the input through a linear transformation layer and then a softmax layer.
7. The method as claimed in claim 2, wherein said modeling process is conducted by a human-in-loop method to guide a model training process, comprising:
randomly extracting a part of nodes representing users from a call network for sampling, wherein the sampled probability of the nodes is in direct proportion to the information entropy of the nodes; and modifying the node-level aggregation weight and the edge-level aggregation weight of each layer corresponding to the sampled node, and feeding the modified aggregation weights back to the local adaptive node aggregation network and the attention-covering edge aggregation network respectively, so that the local adaptive node aggregation network and the attention-covering edge aggregation network calculate user node-level characteristics and user edge-level characteristics according to the modified aggregation weights.
8. The method as claimed in claim 7, wherein a human-in-the-loop loss is introduced into said human-in-the-loop method, and the weighted sum of the training loss and the human-in-the-loop loss is used as the final loss for model training;

the human-in-the-loop loss is:

$$\mathcal{L}_{HIL} = \frac{1}{|V_{sample}|} \sum_{i=1}^{|V_{sample}|} \big\| M_i - P_i \big\|_2$$

wherein $M_i$ is the label of the $i$-th sampled user's self-centered (ego) network, $P_i$ is the aggregation weights of the $i$-th sampled user's self-centered network, $|V_{sample}|$ denotes the number of sampled users, $\|\cdot\|_2$ denotes the 2-norm, and $\mathcal{L}_{HIL}$ denotes the human-in-the-loop loss.
9. A system for identifying fraudulent users in a call network using a human-in-the-loop neural network, for carrying out the method of claim 1, said system comprising:
the communication network module is used for acquiring user information and communication relations between users, each user is used as a node in the communication network, and if the communication relations exist between the two users, an edge is established between the nodes corresponding to the two users; acquiring initial characteristics of each node and edge, and constructing a call network;
the graph neural network model module is used for modeling the call network by utilizing a graph neural network model fusing node characteristics and edge characteristics, finishing training by combining known node types in the call network and obtaining unknown node types in the call network by utilizing the trained graph neural network model; the node types comprise normal users and fraudulent users;
and the sub-graph level human-in-loop training module is used for guiding the model training process of the graph neural network model module in the modeling process.
10. A computer-readable storage medium on which a program is stored, wherein the program, when executed by a processor, implements the method for identifying fraudulent users in a call network using a human-in-the-loop neural network as claimed in any one of claims 1 to 8.
CN202210652309.3A 2022-06-09 2022-06-09 Method, system and storage medium for identifying fraudulent users in a speech network using a human-in-loop neural network Pending CN115034305A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210652309.3A CN115034305A (en) 2022-06-09 2022-06-09 Method, system and storage medium for identifying fraudulent users in a speech network using a human-in-loop neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210652309.3A CN115034305A (en) 2022-06-09 2022-06-09 Method, system and storage medium for identifying fraudulent users in a speech network using a human-in-loop neural network

Publications (1)

Publication Number Publication Date
CN115034305A true CN115034305A (en) 2022-09-09

Family

ID=83122146

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210652309.3A Pending CN115034305A (en) 2022-06-09 2022-06-09 Method, system and storage medium for identifying fraudulent users in a speech network using a human-in-loop neural network

Country Status (1)

Country Link
CN (1) CN115034305A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115187610A (en) * 2022-09-08 2022-10-14 中国科学技术大学 Neuron morphological analysis method and device based on graph neural network and storage medium
CN115545467A (en) * 2022-09-30 2022-12-30 广东工业大学 Risk commodity identification model based on graph neural network
CN115545467B (en) * 2022-09-30 2024-01-23 广东工业大学 Risk commodity identification model based on graphic neural network
CN116992099A (en) * 2023-09-27 2023-11-03 湖北工业大学 Picture neural network recommendation method, system and terminal based on interaction selection
CN116992099B (en) * 2023-09-27 2024-01-12 湖北工业大学 Picture neural network recommendation method, system and terminal based on interaction selection


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination