CN116796254A - Telecommunication anti-fraud method, device, electronic equipment and storage medium - Google Patents

Telecommunication anti-fraud method, device, electronic equipment and storage medium

Info

Publication number
CN116796254A
CN116796254A CN202210238944.7A CN202210238944A CN116796254A CN 116796254 A CN116796254 A CN 116796254A CN 202210238944 A CN202210238944 A CN 202210238944A CN 116796254 A CN116796254 A CN 116796254A
Authority
CN
China
Prior art keywords
fraud
determining
model
short message
fitting function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210238944.7A
Other languages
Chinese (zh)
Inventor
张政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Xiongan ICT Co Ltd
China Mobile System Integration Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Xiongan ICT Co Ltd
China Mobile System Integration Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Xiongan ICT Co Ltd, China Mobile System Integration Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202210238944.7A priority Critical patent/CN116796254A/en
Publication of CN116796254A publication Critical patent/CN116796254A/en
Pending legal-status Critical Current

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The application relates to the technical field of communication and provides a telecommunication anti-fraud method, a device, electronic equipment and a storage medium. The method comprises the following steps: determining word vectors of training data samples, wherein the training data samples are labeled short messages to be identified; determining a fitting function according to the word vectors and a preset anti-fraud model; determining the accuracy of the preset anti-fraud model according to the fitting function; constructing an anti-fraud model according to the accuracy and the fitting function; and performing fraud diagnosis on the short message to be identified according to the anti-fraud model. The embodiment of the application uses word vectors in place of word comparison. Word vectors carry more features than character-based comparison, so the comparison result is more comprehensive and accurate, and the accuracy with which the model distinguishes fraud short messages is improved.

Description

Telecommunication anti-fraud method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a telecommunication anti-fraud method, a device, an electronic apparatus, and a storage medium.
Background
Telecommunication fraud refers to criminals fabricating false information, setting up scams, and defrauding victims remotely and without contact by telephone, network, or short message, thereby inducing the victims to pay or transfer money.
At present, fraud short messages are identified in two ways. Mode one: keywords are extracted from the short message and compared with the words and sentences in a fraud keyword database; if a keyword hits a sensitive word in the database, the short message is judged to be suspected fraud. Mode two: the number that sent the short message is checked against a fraud blacklist; if the number is on the blacklist, the short message is judged to be suspected fraud. These identification modes have the following disadvantages: 1. As language evolves, fraudsters avoid keyword hits by writing short messages with homophones, pictograms and even metaphors, which makes keyword and sentence comparison harder, so comparing only the words in a short message yields low accuracy in judging fraud short messages; 2. Fraudsters frequently change the sending number to evade the blacklist, and the fraud blacklist cannot be updated in real time, so the accuracy of identifying fraud short messages is low. The existing identification modes therefore suffer from low accuracy in identifying fraud short messages.
Disclosure of Invention
The embodiment of the application provides a telecommunication anti-fraud method, a device, electronic equipment and a storage medium, which are used for solving the technical problem of low accuracy in distinguishing fraud short messages.
In a first aspect, an embodiment of the present application provides a telecommunications anti-fraud method, including:
determining word vectors of training data samples, wherein the training data samples are short messages to be identified, which are obtained through labeling;
determining a fitting function according to the word vector and a preset anti-fraud model;
determining the accuracy of the preset anti-fraud model according to the fitting function;
constructing an anti-fraud model according to the accuracy and the fitting function;
and performing fraud diagnosis on the short message to be identified according to the anti-fraud model.
In an embodiment, said determining a fitting function from said word vector and a preset anti-fraud model comprises:
determining the characteristics and the characteristic quantity of the word vector;
and determining the fitting function according to the characteristics, the characteristic quantity and the preset anti-fraud model.
In an embodiment, said determining the accuracy of said preset anti-fraud model according to said fitting function comprises:
determining an output result according to the verification data sample, the fitting function and the preset anti-fraud model;
determining the number of short messages judged correctly according to the output result;
determining a preset index value according to the total number of the verification data samples and the number of short messages judged correctly;
and determining the accuracy according to the preset index value.
In one embodiment, between said determining word vectors of training data samples and said determining a fitting function according to said word vector and a preset anti-fraud model, the method further comprises:
determining a comparison result of the word vector of the training data sample and the word vector of the fraud short message sample;
determining a threshold value of the preset anti-fraud model according to the comparison result;
and updating the preset anti-fraud model according to the threshold value.
In an embodiment, said constructing an anti-fraud model from said accuracy and said fitting function comprises:
determining a fitting function with highest accuracy;
and constructing the anti-fraud model according to the fitting function with the highest accuracy.
In one embodiment, the method further comprises:
constructing a fraud short message sample model;
and updating a fraud short message database according to the fraud short message sample model.
In one embodiment, after the fraud diagnosis is performed on the short message to be identified according to the anti-fraud model, the method includes:
scoring the diagnosis result according to a preset evaluation function;
determining fraud short messages according to the scoring result;
and storing the fraud short message to the fraud short message database.
In a second aspect, an embodiment of the present application provides a telecommunication anti-fraud device, including:
the first determining module is used for determining word vectors of training data samples, wherein the training data samples are short messages to be identified, which are obtained through labeling processing;
the second determining module is used for determining a fitting function according to the word vector and a preset anti-fraud model;
the third determining module is used for determining the accuracy of the preset anti-fraud model according to the fitting function;
the construction module is used for constructing an anti-fraud model according to the accuracy and the fitting function;
and the diagnosis module is used for carrying out fraud diagnosis on the short message to be identified according to the anti-fraud model.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory storing a computer program, where the processor implements the steps of the telecommunication anti-fraud method according to the first aspect when executing the program.
In a fourth aspect, embodiments of the present application provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the telecommunication anti-fraud method of the first aspect.
According to the telecommunication anti-fraud method, the device, the electronic equipment and the storage medium, the word vectors of the training data samples are determined and used for model training, a fitting function is determined in the training process, the accuracy of the model is determined according to the fitting function, the anti-fraud model is constructed according to the accuracy and the fitting function, and finally fraud diagnosis is performed on the short message to be identified according to the anti-fraud model. Using word vectors in place of word comparison captures more features than character-based comparison, so the comparison result is more comprehensive and accurate, and the accuracy with which the model distinguishes fraud short messages is improved.
Drawings
In order to more clearly illustrate the application or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a telecommunication anti-fraud method according to an embodiment of the present application;
FIG. 2 is a second flowchart of a telecommunication anti-fraud method according to an embodiment of the present application;
FIG. 3 is a third flow chart of a telecommunication anti-fraud method according to the embodiment of the present application;
FIG. 4 is a fourth flowchart of a telecommunication anti-fraud method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a telecommunication anti-fraud device according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Fig. 1 is a schematic flow chart of a telecommunication anti-fraud method according to an embodiment of the present application. Referring to fig. 1, an embodiment of the present application provides a telecommunication anti-fraud method, which may include:
step S10, determining word vectors of training data samples, wherein the training data samples are short messages to be identified, which are obtained through labeling processing;
It will be appreciated that word vector (word embedding) is a collective term for a set of language modeling and feature learning techniques in natural language processing (NLP), in which words or phrases from a vocabulary are mapped to vectors of real numbers. Conceptually, it involves a mathematical embedding from a space with one dimension per word to a continuous vector space of much lower dimension.
Specifically, a short message sample to be identified and a fraud short message sample are extracted from a short message center to be identified and a fraud short message database respectively, and then the extracted short message sample data is preprocessed, for example, the short message sample to be identified and the fraud short message sample are labeled, and it is understood that the labeling refers to labeling each short message as a fraud short message or a non-fraud short message.
Further, short messages whose information is related to the short message samples to be identified are queried and matched from the short message center to be identified and are labeled in the same way. The short message samples to be identified and the related short messages together form the short message sample data to be identified. This sample data is then randomly divided into training data samples and verification data samples according to a proportion (such as 3:7), wherein the training data samples are used for training the model and the verification data samples are used for verifying the accuracy of the model after training.
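The random division described above can be sketched as follows. This is a minimal illustration, assuming that the 3:7 proportion means three parts training data to seven parts verification data; the function name `split_samples`, the fixed seed and the toy messages are hypothetical and not part of the application.

```python
import random

def split_samples(samples, train_ratio=0.3, seed=0):
    """Randomly split labeled samples into (training, verification) lists.

    train_ratio=0.3 corresponds to an assumed 3:7 training:verification split.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Toy labeled data: (message text, is_fraud)
labeled = [(f"message {i}", i % 2 == 0) for i in range(10)]
train, verify = split_samples(labeled)
print(len(train), len(verify))  # 3 7
```

Shuffling before cutting keeps the two sets disjoint while preserving every labeled sample exactly once.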
It can be understood that the purpose of inquiring and matching the short message with the related information with the short message sample to be identified from the short message center to be identified is to enrich the training data, because the patterns of the fraud short messages are similar, if the training data are less, the discrimination accuracy of the anti-fraud model is low, and enriching the training data can improve the discrimination accuracy of the anti-fraud model.
Further, word segmentation and word vector extraction are carried out on the training data samples and the fraud short message samples, and the text is converted into a vector format. Specifically, a word segmentation method is used to divide a short message into a sequence composed of nouns, verbs, auxiliary words and other phrases, and a Word2vec network is then applied to the sequence to convert the text of the short message into a vector format. In other words, the features of the short message text, such as internal meaning, word order and pronunciation, are converted into word vectors, so that the features among the vectors can be conveniently extracted. These features can be used for adjusting the model parameters, and one or more of them can be selected to classify and judge whether the short message is a fraud short message.
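To make the conversion from a segmented short message to a vector concrete, the sketch below averages per-word vectors into one message vector. It is only an illustration under stated assumptions: the hand-written `toy_embeddings` table stands in for vectors that a trained Word2vec network would supply, the averaging step is an assumption rather than something the application specifies, and all names are hypothetical.

```python
def message_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens; tokens without a vector are skipped."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim  # no known word: fall back to the zero vector
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

# Hypothetical 2-dimensional word vectors (a real model would learn these).
toy_embeddings = {
    "bank": [1.0, 0.0],
    "transfer": [0.0, 1.0],
    "hello": [-1.0, -1.0],
}

# The token list plays the role of the word segmentation output.
vec = message_vector(["bank", "transfer", "now"], toy_embeddings, dim=2)
print(vec)  # [0.5, 0.5]
```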
Step S20, determining a fitting function according to the word vector and a preset anti-fraud model;
The model is trained with the training data samples in vector format, and a fitting function is determined in the training process. Specifically, the word vectors of the training data samples are input into the preset anti-fraud model for training, the features among the word vectors and the number of features are acquired in the training process, and the fitting function is then set according to the features, the number of features and the preset anti-fraud model. For example, if the number of features is 1, namely a single feature, the fitting function can be set as follows:
f(x) = a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0
where x is the feature among the word vectors, i.e., the short message feature.
In the iterative process, the coefficients a_n, a_{n-1}, and so on of the fitting function are calculated using the squeeze (clamping) theorem. If there are multiple features, the univariate n-th degree polynomial above can be extended to a multivariate n-th degree polynomial to obtain the fitting function.
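Once the coefficients are known, evaluating the single-feature fitting function is straightforward. The sketch below uses Horner's rule for the evaluation; it does not show how the coefficients are obtained, and the function name `fit_value` and the example coefficients are hypothetical.

```python
def fit_value(coeffs, x):
    """Evaluate a_n*x^n + ... + a_1*x + a_0 using Horner's rule.

    coeffs is ordered from the highest-degree coefficient a_n down to a_0.
    """
    result = 0.0
    for a in coeffs:
        result = result * x + a
    return result

# f(x) = 3x^2 + 2x + 1 evaluated at x = 2
print(fit_value([3.0, 2.0, 1.0], 2.0))  # 17.0
```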
It is understood that the preset anti-fraud model refers to a model obtained based on training of a neural network model.
Step S30, determining the accuracy of the preset anti-fraud model according to the fitting function;
specifically, after determining the coefficient of the fitting function, the fitting function is used as the main function of the preset anti-fraud model, then, the short message sample is input into the preset anti-fraud model to obtain an output result, and the accuracy of the preset anti-fraud model is determined according to the output result, specifically, as described in steps S31 to S34.
step S40, constructing an anti-fraud model according to the accuracy and the fitting function;
Specifically, the training data samples are input into the preset anti-fraud model in batches and the preset anti-fraud model is trained multiple times, so that a plurality of fitting functions are obtained. The accuracy obtained when each fitting function is used as the main function of the preset anti-fraud model is determined, the fitting function with the highest accuracy is identified, and the anti-fraud model is constructed by taking that fitting function as the main function of the preset anti-fraud model.
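Selecting the fitting function with the highest accuracy can be sketched as below. This is a simplified illustration: the 0.5 decision cut-off, the candidate functions and the sample data are assumptions made only for the example, and the names `accuracy` and `select_best` are hypothetical.

```python
def accuracy(fit, samples, cutoff=0.5):
    """Fraction of samples whose predicted label (fit(x) >= cutoff) is correct."""
    correct = sum(1 for x, is_fraud in samples if (fit(x) >= cutoff) == is_fraud)
    return correct / len(samples)

def select_best(candidates, samples):
    """Return the candidate fitting function with the highest accuracy."""
    return max(candidates, key=lambda f: accuracy(f, samples))

# Two hypothetical fitting functions obtained from different training batches.
candidates = [lambda x: x, lambda x: 1.0 - x]
samples = [(0.9, True), (0.1, False), (0.8, True)]  # (feature, is_fraud)

best = select_best(candidates, samples)
print(accuracy(best, samples))  # 1.0
```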
and step S50, performing fraud diagnosis on the short message to be identified according to the anti-fraud model.
Specifically, after training to obtain the anti-fraud model, the anti-fraud model is applied to a client, such as a mobile phone, and when the client receives a short message, the short message is automatically diagnosed through the anti-fraud model so as to judge whether the short message is a fraud short message, if so, prompt information is output to inform a user.
In this embodiment, the word vectors of the training data samples are determined and used for model training, a fitting function is determined in the training process, the accuracy of the model is determined according to the fitting function, an anti-fraud model is built according to the accuracy and the fitting function, and finally fraud diagnosis is performed on the short message to be identified according to the anti-fraud model. Using word vectors in place of word comparison captures more features than character-based comparison, so the comparison result is more comprehensive and accurate, and the accuracy with which the model distinguishes fraud short messages is improved.
Further, referring to fig. 2, fig. 2 is a second flowchart of a telecommunication anti-fraud method according to an embodiment of the present application, and between step S10 and step S20, includes:
step S11, determining a comparison result of the word vector of the training data sample and the word vector of the fraud short message sample;
step S12, determining a threshold value of the preset anti-fraud model according to the comparison result;
and step S13, updating the preset anti-fraud model according to the threshold value.
In this embodiment, the model is trained on the training data samples and the parameters of the preset anti-fraud model are adjusted. Specifically, the training data samples in vector format are used as the input of a linear classifier, and hierarchical softmax is used to calculate the probability that a training data sample belongs to each category. A Huffman binary tree is constructed over all categories according to their frequency, and each node performs a binary classification, for example by logistic regression (LR), a classification model that decides whether to descend into the left or the right subtree. This reduces the complexity of the model from linear to logarithmic in the number of categories, which reduces the number of decisions and improves performance.
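The frequency-based Huffman tree that underlies hierarchical softmax can be illustrated with a short sketch. It only computes how many binary decisions (tree depth) each category needs, not the logistic regression at each node; the category names and frequencies are invented for the example, and `huffman_depths` is a hypothetical helper.

```python
import heapq
from itertools import count

def huffman_depths(freqs):
    """Return the depth of each category in a Huffman tree built from freqs.

    The depth equals the number of binary decisions needed to reach the category.
    """
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(f, next(tie), {c: 0}) for c, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)  # two least frequent subtrees
        f2, _, d2 = heapq.heappop(heap)
        merged = {c: d + 1 for c, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

# Hypothetical category frequencies.
depths = huffman_depths({"fraud": 5, "advert": 2, "normal": 50, "notice": 1})
print(depths["normal"], depths["fraud"], depths["advert"], depths["notice"])  # 1 2 3 3
```

Frequent categories end up near the root, so the most common messages are classified with the fewest binary decisions.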
During model training the parameters of the model need to be adjusted. It is understood that adjusting the parameters means fitting the function to the changes in the data, i.e., adjusting the coefficients of the function. For example, if the coefficient being adjusted is a, the values that the program automatically tries during parameter adjustment differ depending on the framework used, such as a convolutional neural network or a deep neural network.
Further, the word vectors of the training data samples are compared with the word vectors of the fraud short message samples to obtain a comparison result, the stability of the anti-fraud model is judged according to the comparison result, a threshold of the preset anti-fraud model is established, and the preset anti-fraud model is updated according to the threshold. It will be appreciated that stability is mainly used to determine whether the fit of the model is correct and to prevent overfitting, which would trap the function in a locally optimal solution. The threshold expresses how strict the short message detection is. For example, if after the preset anti-fraud model is set up the error between the fraud short messages and the model results is found to be 5%, the threshold can be set to 5%, replacing the previous threshold and thereby updating it.
According to the embodiment, the threshold value of the preset anti-fraud model is determined through the comparison result of the word vectors between the training data sample and the fraud message sample, then the preset anti-fraud model is updated based on the threshold value, the judgment of the stability of the model can be carried out through the threshold value, the overfitting is prevented, and meanwhile, the judgment accuracy of the fraud message can be improved by updating the threshold value of the model.
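The threshold update in the 5% example can be sketched as follows. The sketch assumes that the "error" is simply the fraction of fraud short message samples on which the model result disagrees with the label; that reading, the function name `updated_threshold` and the toy data are assumptions.

```python
def updated_threshold(predictions, labels):
    """Return the observed error rate, to be used as the new threshold."""
    errors = sum(1 for p, y in zip(predictions, labels) if p != y)
    return errors / len(labels)

# 20 known fraud samples, of which the model misjudges one.
labels = [True] * 20
predictions = [True] * 19 + [False]

threshold = updated_threshold(predictions, labels)  # replaces the previous value
print(threshold)  # 0.05
```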
Further, referring to fig. 3, fig. 3 is a third flow chart of a telecommunication anti-fraud method according to an embodiment of the present application, and specific descriptions of step S31 to step S34 are as follows:
step S31, determining an output result according to the verification data sample, the fitting function and the preset anti-fraud model;
step S32, determining the number of short messages judged correctly according to the output result;
step S33, determining a preset index value according to the total number of the verification data samples and the number of short messages judged correctly;
and step S34, determining the accuracy according to the preset index value.
Specifically, after a fitting function is determined, it is used as the main function of the preset anti-fraud model, and the verification data samples are then input into the preset anti-fraud model one by one while the output results of the model are collected. From the output results, the number of short messages judged wrongly, the number of short messages judged correctly and the total number of verification data samples are counted. A preset index value is determined from the total number of verification data samples and the number of short messages judged correctly, and the accuracy of the model is then determined according to the preset index value. It will be understood that the preset index value is an index that reflects the accuracy of the model, such as the recall rate and the comprehensive evaluation index (F-Measure); the larger the recall rate and the F-Measure, the higher the model accuracy. The recall rate and the F value are calculated as follows:
recall = number of correct messages / total number of messages judged;
F value = [(number of correct messages / number of selected messages) x 2] / [(number of correct messages / number of selected messages) + (number of correct messages / total number of correct messages)].
Here, the number of correct messages is the number of short messages judged correctly. Because the verification data samples are input into the preset anti-fraud model batch by batch, the number of selected messages is the number of short messages judged by the model in a given batch.
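As a rough illustration of such index values, the sketch below computes precision, recall and F-measure from true-positive, false-positive and false-negative counts. It uses the conventional definitions, which is an assumption here and may not coincide term for term with the formulas given above; the function name `evaluate` and the sample labels are hypothetical.

```python
def evaluate(predicted, actual):
    """Return (precision, recall, f_measure) for boolean fraud labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

predicted = [True, True, False, False, True]
actual = [True, False, True, False, True]
precision, recall, f_measure = evaluate(predicted, actual)
print(round(precision, 3), round(recall, 3), round(f_measure, 3))  # 0.667 0.667 0.667
```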
In this embodiment, the output result is determined according to the verification data samples, the fitting function and the preset anti-fraud model, the number of short messages judged correctly is determined according to the output result, the preset index value is determined according to the total number of the verification data samples and the number of short messages judged correctly, and the accuracy is determined according to the preset index value. The accuracy of the model can thus be judged from the preset index value, which improves the accuracy with which the preset anti-fraud model judges fraud short messages.
Further, referring to fig. 4, fig. 4 is a fourth flowchart of a telecommunication anti-fraud method according to an embodiment of the present application, and the method further includes:
step S60, constructing a fraud short message sample model;
and step S70, updating a fraud short message database according to the fraud short message sample model.
After the anti-fraud model is obtained through training, it can classify the short messages to be identified into two types, fraud short messages and non-fraud short messages. Further, the fraud short message database needs to be optimized through reinforcement learning so that the fraud samples in the database can be updated automatically.
It should be noted that the present embodiment uses a Markov decision process (Markov Decision Process, MDP) to perform model reinforcement learning, wherein a Markov decision process is formed by a quadruple M = (S, A, P_sa, R), wherein:
S: represents the set of states, with s ∈ S; s_i denotes the state at step i.
A: represents the set of actions, with a ∈ A; a_i denotes the action at step i.
P_sa: represents the state transition probability. P_sa is the probability distribution over the states to which a transition occurs after action a ∈ A is taken in the current state s ∈ S; for example, the probability of transitioning to s' after taking action a in state s is expressed as p(s'|s, a).
R: is the reward function (return function). Some reward functions depend only on the state and can be simplified to a function of the state alone. The reward may be written as r(s'|s, a) if the pair (s, a) transitions to the next state s', and as r(s, a) if the corresponding next state s' is unique.
The dynamic process of the MDP is as follows: an agent starts in an initial state s_0 and selects an action a_0 from A to execute; after execution, the agent transitions randomly to the next state s_1 according to the probability P_sa, then executes an action a_1 and transitions to s_2, then executes a_2, and so on.
A fraud short message sample model is built using a reinforcement learning method. Specifically, each judgment of a short message by the model is taken as the Action, the result as the State, and the update of the fraud short message database as the Reward; the model parameters used in previous judgments are stored as the Environment. A Markov decision model is thus built, i.e., a cycle in which taking an Action changes the State and obtains a Reward through interaction with the Environment, abbreviated as M = <S, A, E, R>. In solving the Markov decision process, the Bellman optimality equation is set up and the model parameters are solved by dynamic programming; once the model parameters are determined, the fraud short message sample model is built on them to improve model accuracy.
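Solving the Bellman optimality equation by dynamic programming can be illustrated with value iteration on a toy MDP. The sketch below is not the model of the application: the states, actions, rewards and the discount factor `gamma` are all assumptions chosen to keep the example small, and `value_iteration` is a hypothetical helper.

```python
def value_iteration(P, gamma=0.9, tol=1e-8):
    """Solve V(s) = max_a sum_s' p(s'|s,a) * (r + gamma * V(s')) by iteration.

    P[s][a] is a list of (probability, next_state, reward) outcomes.
    """
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s, actions in P.items():
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:  # values stopped changing: fixed point reached
            return V

# Toy MDP: flagging an undecided fraud message earns +1, passing it earns -1.
toy_mdp = {
    "undecided": {
        "flag": [(1.0, "done", 1.0)],
        "pass": [(1.0, "done", -1.0)],
    },
    "done": {"stay": [(1.0, "done", 0.0)]},
}
values = value_iteration(toy_mdp)
print(values)  # {'undecided': 1.0, 'done': 0.0}
```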
Further, the diagnosis result is scored according to a preset evaluation function, the fraud short messages are determined according to the scoring result, and the fraud short messages are stored in the fraud short message database. For example, with a score as the evaluation function, each judgment the model makes on a short message is scored: points are added for a correct judgment and deducted for a wrong one. After each round of judgment the judgment standard with the highest score is obtained, and the short messages judged to be fraud are stored in the fraud short message database to enrich its negative samples. Meanwhile, the word vector comparison results between the training data samples and the fraud short message samples are obtained, and the fraud short messages screened out are updated into the fraud short message database to serve as negative samples for the next comparison.
It can be understood that fraud short messages take many forms, for example messages written with homophones, pictograms or even metaphors, and these forms keep changing, so the anti-fraud model also needs reinforcement learning to improve the accuracy of distinguishing fraud short messages. For example, suppose that no fraud short message written in metaphors exists at present and the model is trained on samples containing homophones and pictograms. The resulting anti-fraud model identifies fraud short messages in homophone and pictogram form well, but identifies fraud short messages written in metaphors with lower accuracy. A reinforcement learning method is then needed so that the anti-fraud model can learn short messages in other fraud forms. For example, reinforcement learning is performed based on the Markov decision process (MDP): a reward, such as a score of 1, is given for a short message that the anti-fraud model judges correctly, a negative reward, such as a score of -1, is given for a short message that it judges wrongly, and fraud short messages are added to the fraud short message database after being confirmed. The anti-fraud model can thus be updated and optimized automatically without being retrained, which improves the efficiency of judging fraud short messages.
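The reward scheme in the example above (1 for a correct judgment, -1 for a wrong one, confirmed fraud messages added to the database) can be sketched as follows. The list-based database, the tuple layout and the name `apply_feedback` are hypothetical simplifications.

```python
def apply_feedback(judgments, fraud_db):
    """Accumulate rewards and store confirmed fraud messages.

    judgments: iterable of (message, predicted_fraud, confirmed_fraud).
    Returns the total reward; fraud_db is extended in place.
    """
    total = 0
    for message, predicted, confirmed in judgments:
        total += 1 if predicted == confirmed else -1  # +1 correct, -1 wrong
        if confirmed and message not in fraud_db:
            fraud_db.append(message)  # enrich the fraud sample database
    return total

fraud_db = []
judgments = [
    ("you have won a prize", True, True),      # correct: +1
    ("meeting moved to 3 pm", False, False),   # correct: +1
    ("your account is frozen", False, True),   # missed fraud: -1
]
print(apply_feedback(judgments, fraud_db))  # 1
print(fraud_db)  # ['you have won a prize', 'your account is frozen']
```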
In this embodiment, a fraud short message sample model is built and the fraud short message database is updated according to it, so that the database is updated in real time. The added reinforcement learning capability also allows the model to be upgraded and optimized promptly and autonomously using data collected during later use, so that model accuracy is continuously maintained.
The following describes a telecommunication anti-fraud device provided by an embodiment of the present application, and the telecommunication anti-fraud device described below and the telecommunication anti-fraud method described above may be referred to correspondingly.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a telecommunication anti-fraud device according to an embodiment of the present application, where the telecommunication anti-fraud device includes a first determining module 501, a second determining module 502, a third determining module 503, a constructing module 504, and a diagnosing module 505, where:
the first determining module 501 is configured to determine a word vector of a training data sample, where the training data sample is a short message to be identified obtained through labeling;
a second determining module 502, configured to determine a fitting function according to the word vector and a preset anti-fraud model;
a third determining module 503, configured to determine an accuracy of the preset anti-fraud model according to the fitting function;
a construction module 504, configured to construct an anti-fraud model according to the accuracy and the fitting function;
and the diagnosis module 505 is configured to perform fraud diagnosis on the short message to be identified according to the anti-fraud model.
In one embodiment, the second determining module 502 is specifically configured to:
determining the characteristics and the characteristic quantity of the word vector;
and determining the fitting function according to the characteristics, the characteristic quantity and the preset anti-fraud model.
In one embodiment, the third determining module 503 is specifically configured to:
determining an output result according to the verification data sample, the fitting function and the preset anti-fraud model;
determining the number of short messages judged correctly according to the output result;
determining a preset index value according to the total number of the verification data samples and the number of short messages judged correctly;
and determining the accuracy according to the preset index value.
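The accuracy steps above can be illustrated with a short sketch. It assumes that the preset index value is the simple ratio of correctly judged short messages to the total number of verification data samples; the text does not define the index, so this choice and the function name `accuracy` are assumptions.

```python
def accuracy(predictions, labels):
    """Fraction of verification samples whose prediction matches the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one verification sample is required")
    # Number of short messages judged correctly.
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    # Assumed index value: correct count divided by total count.
    return correct / len(labels)


# 1 marks a fraud short message, 0 a normal one (hypothetical encoding).
model_output = [1, 0, 1, 1]
true_labels = [1, 0, 0, 1]
acc = accuracy(model_output, true_labels)
print(acc)  # 0.75
```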
In one embodiment, the first determining module 501 is specifically configured to:
determining a comparison result of the word vector of the training data sample and the word vector of the fraud short message sample;
determining a threshold value of the preset anti-fraud model according to the comparison result;
and updating the preset anti-fraud model according to the threshold value.
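One possible way to carry out the comparison and threshold steps above is sketched here. The text does not say how word vectors are compared or how the threshold is derived, so the use of cosine similarity, the midpoint rule in `choose_threshold`, and all function names are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def max_similarity(vector, fraud_vectors):
    """Highest similarity between one sample and any fraud sample vector."""
    return max(cosine_similarity(vector, f) for f in fraud_vectors)


def choose_threshold(training_vectors, labels, fraud_vectors):
    """Midpoint between mean fraud similarity and mean normal similarity.

    labels uses 1 for fraud and 0 for normal (hypothetical encoding).
    """
    fraud_scores = [max_similarity(v, fraud_vectors)
                    for v, y in zip(training_vectors, labels) if y == 1]
    normal_scores = [max_similarity(v, fraud_vectors)
                     for v, y in zip(training_vectors, labels) if y == 0]
    if not fraud_scores or not normal_scores:
        raise ValueError("both fraud and normal training samples are required")
    mean_fraud = sum(fraud_scores) / len(fraud_scores)
    mean_normal = sum(normal_scores) / len(normal_scores)
    return (mean_fraud + mean_normal) / 2


fraud_samples = [[1.0, 0.0]]
training = [[1.0, 0.0], [0.0, 1.0]]
training_labels = [1, 0]
threshold = choose_threshold(training, training_labels, fraud_samples)
print(threshold)
```

In such a scheme, a new short message whose maximum similarity to the fraud samples exceeds the threshold would be treated as suspicious by the preset model.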
In one embodiment, the construction module 504 is specifically configured to:
determining a fitting function with highest accuracy;
and constructing the anti-fraud model according to the fitting function with the highest accuracy.
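Selecting the fitting function with the highest accuracy can be sketched as a simple search over candidates. The candidate functions below are hypothetical threshold rules on the first vector component, chosen only to keep the example small; the function name `select_best` and the data are likewise assumptions.

```python
def select_best(candidates, validation_vectors, validation_labels):
    """Return (name, accuracy) of the candidate with the highest accuracy."""
    best_name, best_accuracy = None, -1.0
    for name, fitting_function in candidates.items():
        correct = sum(
            1 for x, y in zip(validation_vectors, validation_labels)
            if fitting_function(x) == y
        )
        acc = correct / len(validation_labels)
        if acc > best_accuracy:
            best_name, best_accuracy = name, acc
    return best_name, best_accuracy


# Hypothetical candidates: classify as fraud (1) when the first
# component of the vector exceeds a cutoff.
candidates = {
    "cutoff_0.5": lambda x: int(x[0] > 0.5),
    "cutoff_0.8": lambda x: int(x[0] > 0.8),
}
vectors = [[0.9], [0.2], [0.6], [0.1]]
labels = [1, 0, 1, 0]
best_name, best_accuracy = select_best(candidates, vectors, labels)
print(best_name, best_accuracy)
```

The anti-fraud model would then be built from the selected candidate only, so every candidate has to be evaluated on the same verification data for the comparison to be meaningful.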
In one embodiment, the method further comprises:
constructing a fraud short message sample model;
and updating a fraud short message database according to the fraud short message sample model.
In one embodiment, the diagnostic module 505 is specifically configured to:
scoring the diagnosis result according to a preset evaluation function;
determining fraud short messages according to the scoring result;
and storing the fraud short message to the fraud short message database.
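The scoring and storing steps above might look like the following sketch. It assumes that the diagnosis result is a fraud probability, that the preset evaluation function maps it to a 0 to 100 score, and that the fraud short message database is an SQLite table; the names `evaluate`, `store_fraud_messages`, `fraud_sms`, and the cutoff of 80 are all hypothetical.

```python
import sqlite3


def evaluate(fraud_probability):
    """Hypothetical evaluation function: map a probability to a 0-100 score."""
    return round(100 * fraud_probability)


def store_fraud_messages(diagnoses, connection, cutoff=80):
    """Score each (text, probability) pair and store those at or above cutoff."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS fraud_sms "
        "(text TEXT PRIMARY KEY, score INTEGER)"
    )
    stored = []
    for text, probability in diagnoses:
        score = evaluate(probability)
        if score >= cutoff:
            # Parameterized query, so message text is never executed as SQL.
            connection.execute(
                "INSERT OR REPLACE INTO fraud_sms (text, score) VALUES (?, ?)",
                (text, score),
            )
            stored.append(text)
    connection.commit()
    return stored


conn = sqlite3.connect(":memory:")
stored = store_fraud_messages(
    [("Claim your prize now", 0.93), ("See you at 6", 0.04)], conn)
row_count = conn.execute("SELECT COUNT(*) FROM fraud_sms").fetchone()[0]
print(stored, row_count)
```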
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application, as shown in fig. 6, the electronic device may include: processor 610, communication interface (Communication Interface) 620, memory 630, and communication bus 640, wherein processor 610, communication interface 620, and memory 630 communicate with each other via communication bus 640. Processor 610 may invoke computer programs in memory 630 to perform the steps of the telecommunication anti-fraud method, including, for example:
determining word vectors of training data samples, wherein the training data samples are short messages to be identified, which are obtained through labeling;
determining a fitting function according to the word vector and a preset anti-fraud model;
determining the accuracy of the preset anti-fraud model according to the fitting function;
constructing an anti-fraud model according to the accuracy and the fitting function;
and performing fraud diagnosis on the short message to be identified according to the anti-fraud model.
Further, the logic instructions in the memory 630 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, embodiments of the present application further provide a non-transitory computer readable storage medium, on which a computer program is stored, the computer program being capable of performing the steps of the telecommunication anti-fraud method provided in the above embodiments, for example, including:
determining word vectors of training data samples, wherein the training data samples are short messages to be identified, which are obtained through labeling;
determining a fitting function according to the word vector and a preset anti-fraud model;
determining the accuracy of the preset anti-fraud model according to the fitting function;
constructing an anti-fraud model according to the accuracy and the fitting function;
and performing fraud diagnosis on the short message to be identified according to the anti-fraud model.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present application without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A telecommunications anti-fraud method, comprising:
determining word vectors of training data samples, wherein the training data samples are short messages to be identified, which are obtained through labeling;
determining a fitting function according to the word vector and a preset anti-fraud model;
determining the accuracy of the preset anti-fraud model according to the fitting function;
constructing an anti-fraud model according to the accuracy and the fitting function;
and performing fraud diagnosis on the short message to be identified according to the anti-fraud model.
2. The telecommunications anti-fraud method of claim 1, wherein said determining a fitting function from the word vector and a preset anti-fraud model comprises:
determining the characteristics and the characteristic quantity of the word vector;
and determining the fitting function according to the characteristics, the characteristic quantity and the preset anti-fraud model.
3. The telecommunications anti-fraud method of claim 1, wherein the determining an accuracy of the preset anti-fraud model according to the fitting function includes:
determining an output result according to the verification data sample, the fitting function and the preset anti-fraud model;
determining the number of short messages judged correctly according to the output result;
determining a preset index value according to the total number of the verification data samples and the number of short messages judged correctly;
and determining the accuracy according to the preset index value.
4. The telecommunications anti-fraud method of claim 1, further comprising, between said determining word vectors of training data samples and said determining a fitting function according to said word vector and a preset anti-fraud model:
determining a comparison result of the word vector of the training data sample and the word vector of the fraud short message sample;
determining a threshold value of the preset anti-fraud model according to the comparison result;
and updating the preset anti-fraud model according to the threshold value.
5. The telecommunications anti-fraud method of claim 1, wherein said constructing an anti-fraud model from said accuracy and said fitting function comprises:
determining a fitting function with highest accuracy;
and constructing the anti-fraud model according to the fitting function with the highest accuracy.
6. The telecommunications anti-fraud method of claim 1, wherein said method further comprises:
constructing a fraud short message sample model;
and updating a fraud short message database according to the fraud short message sample model.
7. The telecommunication anti-fraud method of claim 6, further comprising, after said performing fraud diagnosis on the short message to be identified according to the anti-fraud model:
scoring the diagnosis result according to a preset evaluation function;
determining fraud short messages according to the scoring result;
and storing the fraud short message to the fraud short message database.
8. A telecommunications anti-fraud device, comprising:
the first determining module is used for determining word vectors of training data samples, wherein the training data samples are short messages to be identified, which are obtained through labeling processing;
the second determining module is used for determining a fitting function according to the word vector and a preset anti-fraud model;
the third determining module is used for determining the accuracy of the preset anti-fraud model according to the fitting function;
the construction module is used for constructing an anti-fraud model according to the accuracy and the fitting function;
and the diagnosis module is used for carrying out fraud diagnosis on the short message to be identified according to the anti-fraud model.
9. An electronic device comprising a processor and a memory storing a computer program, characterized in that the processor implements the steps of the telecommunication anti-fraud method of any of claims 1 to 7 when executing the computer program.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the telecommunication anti-fraud method of any of claims 1 to 7.
CN202210238944.7A 2022-03-11 2022-03-11 Telecommunication anti-fraud method, device, electronic equipment and storage medium Pending CN116796254A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210238944.7A CN116796254A (en) 2022-03-11 2022-03-11 Telecommunication anti-fraud method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210238944.7A CN116796254A (en) 2022-03-11 2022-03-11 Telecommunication anti-fraud method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116796254A true CN116796254A (en) 2023-09-22

Family

ID=88046516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210238944.7A Pending CN116796254A (en) 2022-03-11 2022-03-11 Telecommunication anti-fraud method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116796254A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination