CN110839242A - Abnormal number identification method and device


Info

Publication number
CN110839242A
Authority
CN
China
Prior art keywords
picture
preset
feature
similar
newly added
Prior art date
Legal status
Granted
Application number
CN201810940504.XA
Other languages
Chinese (zh)
Other versions
CN110839242B (en)
Inventor
涂锋
崔志顺
徐睿
余刚
陈辉
张晓川
顾学伟
王建宏
刘钰柏
黄志豪
刘忱
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Guangdong Co Ltd
Priority to CN201810940504.XA
Publication of CN110839242A
Application granted
Publication of CN110839242B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 12/00: Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W 12/12: Detection or prevention of fraud
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/29: Graphical models, e.g. Bayesian networks

Abstract

The embodiment of the invention provides an abnormal number identification method and device. The method comprises the following steps: acquiring a newly added picture of a first newly added number in a preset period, inputting the newly added picture into a preset feature extraction model, and extracting a target feature of the newly added picture, wherein the preset feature extraction model corresponds to the feature type of the target feature; acquiring a similar picture of the newly added picture and determining the similar number to which the similar picture belongs, wherein the similar picture is a picture that meets, with respect to the target feature, the preset requirement corresponding to the feature type; and when the number of similar numbers reaches a preset threshold corresponding to the feature type, identifying the first newly added number and the similar numbers as abnormal numbers. The embodiment of the invention thus automatically finds abnormal numbers suspected of having been opened without the user's knowledge or under a borrowed identity, without manual examination.

Description

Abnormal number identification method and device
Technical Field
The embodiment of the invention relates to the technical field of mobile communication, in particular to an abnormal number identification method and device.
Background
With the development of mobile communication technology, Long Term Evolution (LTE) has captured a large share of the user market thanks to its superior characteristics. The rapid development of 4G services has intensified competition among operators, and the count of newly added 4G users has become one of the operators' key assessment indicators. However, in business promotion through the social channels affiliated with an operator, some illegal agents, in order to meet business targets, take multiple photos of a user and open numbers for that user without the user's knowledge, or purchase real-name information and photos in remote areas to open accounts, which artificially inflates the newly added user count. Such numbers are not used after the account is opened; worse, some of them become important resources in black and gray market industry chains, where they are used for card farming or even for illegal acts such as fraud. These practices increase operator complaints and make anti-fraud tracing more difficult, with serious consequences.
In the prior art, identification of abnormal users focuses mainly on the behavioral features of users' daily usage, while the photos of users behind newly added numbers are audited manually. The existing system only audits the photo of a single real-name number; it cannot audit all photos jointly to find similar photos and thereby uncover abnormal numbers suspected of having been opened without the user's knowledge or under a borrowed identity, leaving a loophole for lawbreakers to exploit. Moreover, since a large quantity of users is added every day, comparing and analyzing the real-name photos manually is an enormous task: full auditing by purely manual work requires a large amount of labor cost and yields low accuracy.
Disclosure of Invention
The embodiment of the invention provides an abnormal number identification method and device, which are used for solving the problem in the prior art that, when the user photos of newly added numbers are examined manually, it is difficult to find abnormal numbers suspected of having been opened without the user's knowledge or under a borrowed identity.
In one aspect, an embodiment of the present invention provides an abnormal number identification method, where the method includes:
acquiring a newly added picture of a first newly added number in a preset period, inputting the newly added picture into a preset feature extraction model, and extracting a target feature of the newly added picture; the preset feature extraction model corresponds to the feature type of the target feature;
acquiring a similar picture of the newly added picture, and determining a similar number to which the similar picture belongs; the similar picture is a picture which meets the preset requirement corresponding to the feature type with the target feature;
and when the number of the similar numbers reaches a preset threshold value corresponding to the characteristic type, identifying the first newly added number and the similar numbers as abnormal numbers.
On the other hand, an embodiment of the present invention provides an abnormal number identification apparatus, where the apparatus includes:
the feature extraction module is used for acquiring a newly-added picture of a first newly-added number in a preset period, inputting the newly-added picture into a preset feature extraction model and extracting a target feature of the newly-added picture; the preset feature extraction model corresponds to the feature type of the target feature;
the number acquisition module is used for acquiring the similar pictures of the newly added pictures and determining the similar numbers of the similar pictures; the similar picture is a picture which meets the preset requirement corresponding to the feature type with the target feature;
and the number identification module is used for identifying the first newly added number and the similar number as abnormal numbers when the number of the similar numbers reaches a preset threshold value corresponding to the characteristic type.
On the other hand, the embodiment of the present invention further provides an electronic device, which includes a memory, a processor, a bus, and a computer program that is stored in the memory and can be executed on the processor, and when the processor executes the computer program, the steps in the above-mentioned abnormal number identification method are implemented.
In still another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps in the above-mentioned abnormal number identification method.
According to the abnormal number identification method and device provided by the embodiment of the invention, the newly added picture of the first newly added number in the preset period is acquired, the newly added picture is input into the preset feature extraction model, and the target feature of the newly added picture is extracted; a similar picture that meets, with respect to the target feature, the preset requirement corresponding to the feature type is acquired, and the similar number to which the similar picture belongs is determined. Multiple similar numbers of the same user can be identified through facial features, and multiple similar numbers whose pictures were taken (or whose network access was handled) at the same place can be identified through background features and/or panoramic features. When the number of similar numbers reaches the preset threshold corresponding to the feature type, the first newly added number and the similar numbers are identified as abnormal numbers. The method thus automatically finds abnormal numbers suspected of having been opened without the user's knowledge or under a borrowed identity, without manual examination, and prevents the abnormal numbers from being used in illegal activities such as card farming. The preset feature extraction model is a deep-learning convolutional neural network; features are extracted through convolution calculation and the similarity of the convolutional features is calculated comprehensively, giving high accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of an abnormal number identification method according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of creating a preset feature extraction model according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a first exemplary process of creating a predetermined feature extraction model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a convolutional neural network of a first example of an embodiment of the present invention;
fig. 5 is a schematic process diagram of abnormal number identification according to a second example of the embodiment of the present invention;
fig. 6 is a schematic structural diagram of an abnormal number identification apparatus according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments. In the following description, specific details such as specific configurations and components are provided only to help the full understanding of the embodiments of the present invention. Thus, it will be apparent to those skilled in the art that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the invention. In addition, descriptions of well-known functions and constructions are omitted for clarity and conciseness.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In various embodiments of the present invention, it should be understood that the sequence numbers of the following processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In the embodiments provided herein, it should be understood that "B corresponding to A" means that B is associated with A and B can be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
Fig. 1 is a flowchart illustrating an abnormal number identification method according to an embodiment of the present invention.
As shown in fig. 1, the abnormal number identification method provided in the embodiment of the present invention specifically includes the following steps:
step 101, acquiring a newly added picture of a first newly added number in a preset period, inputting the newly added picture into a preset feature extraction model, and extracting a target feature of the newly added picture; wherein the preset feature extraction model corresponds to a feature type of the target feature.
The user photos of the newly added numbers in a preset period are respectively input into a preset feature extraction model, and the features of each newly added picture, i.e. the target features, are extracted in order to identify abnormal users.
Specifically, the types of target features include: facial features, background features, and/or panoramic features.
The facial features are face features: feature data, helpful for face classification, obtained from the shape description of the facial organs and the distance characteristics between them; the feature components generally include the Euclidean distance, curvature and angle between feature points. A human face is composed of parts such as the eyes, nose, mouth and chin; the geometric description of these parts and of the structural relationships among them can be used as important features for recognizing a face, and the facial features can be used to determine whether two pictures show the same user or different users.
The background feature is the feature of the scenery within a range of a preset number of pixels around the main subject of the picture, and the panoramic feature is the feature of the scenery outside that range; both are used to identify whether newly added pictures were taken at the same place.
Optionally, in an embodiment of the present invention, the preset feature extraction model is a deep-learning Convolutional Neural Network (CNN). The basic structure of the CNN includes two kinds of layers: one is the feature extraction layer, in which the input of each neuron is connected to a local receptive field of the previous layer and extracts the feature of that local receptive field; once the local feature is extracted, its positional relationship to the other features is determined. The other is the feature mapping layer: each computation layer of the network is composed of a plurality of feature maps, each feature map is a plane, and all neurons on the plane share equal weights.
The preset feature extraction models correspond to feature types of the target features, that is, each feature type corresponds to at least one preset feature extraction model.
102, acquiring a similar picture of the newly added picture, and determining a similar number to which the similar picture belongs; the similar picture is a picture which meets the preset requirement corresponding to the feature type with the target feature.
Specifically, the preset requirements are requirements on similarity, and each type of target feature corresponds to its own preset requirement. When the target feature of the newly added picture has been extracted, similar pictures meeting the similarity requirement are searched for according to the type of the target feature and the corresponding preset requirement; for example, for facial features, the database is searched for pictures whose subject's facial features are similar to the target features, while for background features and/or panoramic features, pictures whose scenery is similar to that in the newly added picture are searched for.
Specifically, in the process of searching for the similar picture, similarity calculation is performed on the features of the picture to be identified and the extracted target features to determine whether the picture to be identified is the similar picture of the newly added picture.
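As an illustration only (the patent text does not prescribe a specific distance measure here), the similarity calculation between a picture to be identified and the extracted target feature can be sketched in Python roughly as follows, using a squared Euclidean distance in the style of the dist calculations in the later examples:

import numpy as np

def find_similar_pictures(target_feature, stored_features, threshold):
    # stored_features: (n_pictures, feature_dim) matrix of features of the pictures
    # to be identified; target_feature: (feature_dim,) vector of the newly added picture.
    dists = np.sum((stored_features - target_feature) ** 2, axis=1)
    # indices of pictures whose distance meets the preset similarity requirement
    return np.where(dists < threshold)[0]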
And after the similar picture is found, determining the number to which the similar picture belongs, wherein the number is the similar number of the first newly added number. For the facial features, the object to which the similar picture is directed is a user, the similar number is a number suspected to be the same user, it should be noted that the user mentioned in the embodiment of the present invention refers to a natural person, and the picture mentioned refers to a photo; for background features and/or panoramic features, the object targeted by similar pictures is a place, and pictures of similar numbers are suspected to be taken at the same place.
It should be understood that, in the embodiment of the present invention, in the field of mobile communications a number refers to the subscriber number of a Subscriber Identity Module (SIM) card; in other fields, a number may also refer to a user account, such as an account for logging on to a platform or an application.
Step 103, when the number of the similar numbers reaches a preset threshold corresponding to the feature type, identifying the first newly added number and the similar numbers as abnormal numbers.
When the number of similar numbers reaches the preset threshold, the first newly added number and the similar numbers are identified as abnormal numbers, i.e. numbers suspected of having been opened without the user's knowledge or under a borrowed identity.
For example, for facial features, the regulations of a certain operator usually set the threshold on the number of similar numbers to 4; if the total count of numbers belonging to a certain user exceeds 5, the user is determined to be an abnormal user and all of that user's numbers are abnormal numbers. For the background feature and/or the panoramic feature, the preset threshold is determined from historical experience data for the place concerned; for example, for a certain place A, if the number of newly added similar numbers in the current preset period exceeds the historical experience data, the place is identified as abnormal and the numbers whose pictures were taken at that place (including the first newly added number and the similar numbers) are all abnormal numbers.
Optionally, after the abnormal number is identified, an abnormal warning for the abnormal number may also be output.
In the embodiment of the invention, the newly added picture of the first newly added number in the preset period is acquired and input into the preset feature extraction model to extract the target feature of the newly added picture; a similar picture that meets, with respect to the target feature, the preset requirement corresponding to the feature type is acquired, and the similar number to which the similar picture belongs is determined. Multiple similar numbers of the same user can be identified through facial features, and multiple similar numbers whose pictures were taken (or whose network access was handled) at the same place can be identified through background features and/or panoramic features. When the number of similar numbers reaches the preset threshold corresponding to the feature type, the first newly added number and the similar numbers are identified as abnormal numbers, so that abnormal numbers suspected of having been opened without the user's knowledge or under a borrowed identity are found automatically, without manual examination, and are prevented from being used in illegal activities such as card farming. The preset feature extraction model is a deep-learning convolutional neural network; features are extracted through convolution calculation and the similarity of the convolutional features is calculated comprehensively, giving high accuracy. The embodiment of the invention thus solves the prior-art problem that abnormal numbers suspected of having been opened without the user's knowledge or under a borrowed identity are difficult to find when the user photos of newly added numbers are examined manually.
Further, in the embodiment of the present invention, in the process of identifying an abnormal number, the feature type required in a single identification process may be configured, and the configured feature type includes at least one of a facial feature, a background feature, and a panoramic feature, which is not limited herein; the method can be implemented by corresponding configured CNNs, for example, if the feature types configured in the recognition process at a certain time include a panoramic feature and a facial feature, the preset feature extraction models corresponding to the panoramic feature and the facial feature are configured respectively, and the recognition results of the panoramic feature and the facial feature are comprehensively considered in step 103 to obtain a final recognition conclusion; for another example, if the feature type is configured to include only facial features in a certain recognition process, only the preset feature extraction model corresponding to the facial features is configured, and the recognition result of the facial features is used as the final recognition conclusion in step 103.
Optionally, in this embodiment of the present invention, the step of obtaining a similar picture of the newly added picture includes:
when the feature type is a facial feature, acquiring a first similar picture which meets a first preset similarity requirement with the target feature; and
determining a user identification number of a user to which the first newly added number belongs, and acquiring a second similar picture of the number included in the user identification number;
and/or
And when the feature type is a background feature or a panoramic feature, acquiring a similar picture meeting a second preset similarity requirement with the target feature.
When the type of the target feature is a facial feature, the similar pictures comprise a first similar picture and a second similar picture: the first similar picture meeting the first preset similarity requirement is searched for according to the target feature of the newly added picture, the user identification number of the user to whom the first newly added number belongs is determined, and the second similar pictures of the numbers included under that user identification number are acquired. The user identification number is the unique identification number of the user, for example an identity card number; through the user identification number, the second similar pictures of all numbers registered under that identification number are further found.
And when the characteristic type is a background characteristic or a panoramic characteristic, acquiring a similar picture meeting a second preset similarity requirement with the target characteristic of the scenery in the newly added picture, wherein the similar picture is a picture shot at the same place.
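A minimal sketch of the two retrieval branches described above is given below; the array layout, the lookup of pictures by user identification number and the thresholds are assumptions introduced only for illustration, not the patent's interfaces.

import numpy as np

def get_similar_pictures(feature_type, target_feature, stored_features, threshold,
                         pictures_by_user_id=None, user_id=None):
    # first similar pictures: features close enough to the target feature
    dists = np.sum((stored_features - target_feature) ** 2, axis=1)
    first_similar = list(np.where(dists < threshold)[0])
    second_similar = []
    if feature_type == "face" and pictures_by_user_id is not None:
        # second similar pictures: pictures of all numbers registered under the
        # same user identification number (e.g. an identity card number)
        second_similar = list(pictures_by_user_id.get(user_id, []))
    return first_similar, second_similar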
Optionally, in this embodiment of the present invention, when the feature type is a background feature or a panoramic feature, the similar number is a new number in the preset period;
the step of identifying the first newly added number and the similar number as an abnormal number when the number of the similar numbers reaches a preset threshold corresponding to the feature type includes:
determining a place corresponding to the target feature;
if the place is a newly added place, when the number of the similar numbers reaches a first preset threshold value, identifying the first newly added number and the similar numbers as abnormal numbers; and/or
If the location is a non-newly added location, acquiring a location identification number of the location and a second preset threshold corresponding to the location identification number; the second preset threshold is a preset upper limit value of the newly added number in a single preset period of the location identification number;
and when the number of the similar numbers reaches the second preset threshold value, identifying the first newly added number and the similar numbers as abnormal numbers.
That is, when the feature type is a background feature or a panoramic feature, the search range for similar numbers is the newly added numbers in the current preset period. First, the place corresponding to the target feature is determined, and whether the place is a newly added place is judged according to the target feature, i.e. whether the scenery in the newly added picture has appeared in pictures from before the current period. If it has not, no feature similar to the target feature can be matched in the picture library from before the current period, and the place in the newly added picture is a newly added place.
If the location is a newly added location, when the number of the similar numbers reaches a first preset threshold value, the first newly added number and the similar numbers can be identified as abnormal numbers, and the first preset threshold value is a preset fixed value.
If the location is not a newly added location, the location identification number of the location is first acquired, and the pre-recorded upper limit of newly added numbers in a single preset period for that location, i.e. the second preset threshold, is determined according to the location identification number; the second preset threshold is historical experience data. When the number of similar numbers reaches the second preset threshold, the first newly added number and the similar numbers are identified as abnormal numbers.
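The threshold decision for background/panoramic features can be summarized by the following sketch; the fixed first preset threshold and the per-place upper limits (second preset threshold) are assumed to come from configuration and historical records.

def is_abnormal_at_place(similar_count, is_new_place, first_threshold, place_limits, place_id=None):
    if is_new_place:
        # newly added place: compare against the fixed first preset threshold
        return similar_count >= first_threshold
    # known place: compare against its recorded per-period upper limit
    second_threshold = place_limits[place_id]
    return similar_count >= second_threshold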
Optionally, in the embodiment of the present invention, before the step of obtaining the newly added picture of the first newly added number in the preset period, the method further includes:
creating a preset feature extraction model;
before acquiring a newly added picture of a first newly added number in a preset period, a preset feature extraction model needs to be established in advance; the preset feature extraction models at least comprise three models which are respectively in one-to-one correspondence with the three feature types.
Specifically, referring to fig. 2, the step of creating the preset feature extraction model includes:
step 201, obtaining a first preset number of history pictures.
The historical picture is a picture of the historical added number in a preset number of periods before the current period.
Optionally, a numerically coded directory may be created for each user appearing in the historical pictures, with all of that user's pictures stored in that directory.
Step 202, performing data preprocessing on the historical pictures, and dividing the historical pictures into a training picture set and a test picture set according to a preset proportion.
First, data preprocessing is performed on the historical pictures to obtain input data meeting the requirements of the convolutional neural network, and the historical pictures are grouped into a training picture set and a test picture set; the training picture set is used to train the preset convolutional neural network to obtain the preset feature extraction model, and the test picture set is used to test the accuracy of the preset feature extraction model.
And 203, circularly inputting the pictures in the training picture set to a preset convolutional neural network to obtain an output result and reversely optimizing the preset convolutional neural network.
And circularly inputting the pictures in the training picture set to a preset convolutional neural network, comparing an output result with a known result, and reversely optimizing the preset convolutional neural network according to a comparison result to finally obtain the convolutional neural network meeting the requirement.
The circulation in the step refers to that for each picture unit which is input to the preset convolutional neural network once, after an output result is obtained, the preset convolutional neural network is reversely optimized to obtain a new preset convolutional neural network; and inputting the next picture unit into the new preset convolutional neural network, and continuously performing reverse optimization after an output result is obtained.
Optionally, the preset convolutional neural network may include a plurality of different convolutional neural network models, such as VGG, Inception or ResNet, to suit different application situations; the configured convolutional network is initially trained on the training picture set in an early stage until it reaches the optimal effect, and after training is completed the newly added picture data are input for feature calculation.
And 204, when the number of times of circulation reaches a preset training turn, testing the current convolutional neural network through the test picture set.
And when the number of times of the circulation reaches a preset training round, inputting the test picture set into the current convolutional neural network for testing, and if the test result meets a preset accuracy requirement, passing the test.
And step 205, when the test is passed, determining that the current convolutional neural network is a preset feature extraction model.
After the test is passed, the current convolutional neural network is a final preset feature extraction model.
Optionally, in this embodiment of the present invention, the step of performing data preprocessing on the history picture includes:
the method comprises the following steps that firstly, historical pictures are formed into a second preset number of homologous picture pairs and a third preset number of heterologous picture pairs; the image pair of the same source is two images from the same object, the image pair of the different source is two images from different objects, and the object is the object aimed at by the target feature.
The historical pictures are formed into a plurality of homologous picture pairs and heterologous picture pairs, wherein the homologous picture pairs are two pictures from the same object, and the heterologous picture pairs are two pictures from different objects; for example, taking facial features as an example, the homologous image pair is an image from the same user, and the heterologous image pair is an image from different users; taking the background feature and/or the panoramic feature as an example, the pair of homologous pictures are pictures from the same place, and the pair of heterologous pictures are pictures from different places.
And secondly, respectively converting the homologous picture pair and the heterologous picture pair into floating point matrix data.
The pictures are converted into floating-point matrix data according to a preset data conversion algorithm, and each picture corresponds to one floating-point matrix and is convenient to input into a preset feature extraction model.
And thirdly, respectively carrying out normalization processing on the floating-point matrix data to obtain sample matrix data.
The floating-point matrix data are respectively subjected to normalization processing to simplify calculation, reduce magnitude values and finally obtain sample matrix data, so that the data preprocessing process is completed.
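As one possible reading of the preprocessing steps (conversion to a floating point matrix followed by normalization), a sketch using Pillow and NumPy is shown below; the libraries and the target size are illustrative assumptions.

import numpy as np
from PIL import Image

def picture_to_sample_matrix(path, size=(299, 299)):
    img = Image.open(path).convert("RGB").resize(size)   # crop/scale to the model input size
    matrix = np.asarray(img, dtype=np.float32)           # floating point matrix data
    return matrix / 255.0                                # normalization -> sample matrix data

def pair_to_samples(path_a, path_b):
    return picture_to_sample_matrix(path_a), picture_to_sample_matrix(path_b)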
Alternatively, in the embodiment of the present invention, when the feature type is a facial feature,
before the step of converting the pair of homologous pictures and the pair of heterologous pictures into floating-point matrix data, the method further includes:
and respectively cutting and/or centering the homologous image pair and the heterologous image pair.
The homologous picture pairs and heterologous picture pairs are processed according to the model parameter configuration requirements, e.g. photo size cropping, panorama centering, face cropping and/or background cropping, and the photos are finally converted into floating point matrix data.
Optionally, in an embodiment of the present invention, the step 203 includes:
respectively sending the sample matrix data corresponding to the two pictures in the homologous picture pair or the heterologous picture pair to a preset convolutional neural network to obtain two output results;
calculating a distance parameter between the two output results, wherein the distance parameter is similarity;
and if the distance parameter does not meet the requirement of third preset similarity, reversely adjusting the weight in the preset convolutional neural network.
Taking a pair of homologous pictures as an example, if the sample matrix data corresponding to the two pictures are respectively a1 and a2, respectively inputting a1 and a2 to a preset convolutional neural network to obtain output results B1 and B2; and calculating the similarity between B1 and B2, namely a distance parameter, and if the distance parameter is smaller than a similarity threshold value of a homologous photo pair specified in a third preset similarity requirement, reversely adjusting the weight in the preset convolutional neural network for optimization.
Taking a heterogeneous image pair as an example, if the sample matrix data corresponding to the two images are respectively C1 and C2, respectively inputting C1 and C2 to a preset convolution neural network to obtain output results D1 and D2; and calculating the similarity between D1 and D2, namely a distance parameter, and if the distance parameter is greater than a similarity threshold value of a heterogeneous image pair specified in a third preset similarity requirement, reversely adjusting the weight in the preset convolutional neural network for optimization.
And circularly executing the process, and testing the accuracy of the current convolutional neural network through the test picture set when the circulating times reach the preset training turns.
As a first example, referring to fig. 3, a process of creating a preset feature extraction model in an embodiment of the present invention is described by taking a preset feature extraction model for training facial features as an example.
(1) Training data, comprising:
the method comprises the steps of obtaining pictures (on-site panoramic pictures) provided when a plurality of new users transact number network access within a preset number of periods, numbering the new users according to 0-N numbers, creating a file directory corresponding to each new user, and storing the pictures of the same user into a corresponding directory address, wherein the directory address is used as a user identifier of the user in training data.
(2) The data preprocessing specifically comprises the following steps:
a) Read the user picture paths from the per-user picture storage directories by number, store them in a dict dictionary data type, and name the variable train_imgpaths, for example:
{"0": ["path0_1", "path0_2", …]},
where "0" is a user number and "path0_1", "path0_2" are pictures of that user;
b) Loop over the dictionary train_imgpaths and judge the number of pictures of each user; if the number is greater than 1, form a homologous picture pair, for example:
{X1: "path0_1", X2: "path0_2", Y: 1},
meaning that X1 and X2 are from the same user;
and simultaneously form a heterologous picture pair, i.e.:
{X1: "path0_1", X2: "path1_1", Y: 0},
meaning that X1 and X2 are from different users;
if the number is equal to 1, another user's picture is randomly selected to form a picture pair, and so on, where Y=1 indicates that the pair is homologous and Y=0 that it is heterologous.
c) After the picture pairs are formed, the picture data are read according to the picture paths, the pictures are shape-cropped, centered and normalized, and the data are finally stored in three variables, namely data_X1, data_X2 and data_Y.
d) The data are divided into two sets, a training picture set and a test picture set, according to the ratio of training data to test data, e.g. 7:3.
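A hedged sketch of steps b) and d) above, forming homologous and heterologous pairs from train_imgpaths and splitting them 7:3, follows; the random pairing details are assumptions for illustration.

import random

def build_pairs(train_imgpaths):
    pairs, users = [], list(train_imgpaths)
    for user, paths in train_imgpaths.items():
        if len(paths) > 1:
            pairs.append({"X1": paths[0], "X2": paths[1], "Y": 1})   # homologous pair
        others = [u for u in users if u != user]
        if others:
            other = random.choice(others)
            pairs.append({"X1": paths[0], "X2": train_imgpaths[other][0], "Y": 0})  # heterologous pair
    return pairs

def split_train_test(pairs, ratio=0.7):
    random.shuffle(pairs)
    cut = int(len(pairs) * ratio)
    return pairs[:cut], pairs[cut:]   # training picture set, test picture set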
(3) Initializing the convolutional neural network, specifically:
a) Start the convolutional neural network adapter, read the network configuration parameters, and load the corresponding network structure as usenet; for example, if the convolutional network configuration is InceptionV4 or mycnnnet, the corresponding network structure model is loaded,
where InceptionV4 is an open-source network model and mycnnnet is a self-defined network model;
in addition, open-source network models such as VGG, ResNet and Inception v1-v3 are integrated into the convolutional neural network adapter by default, and a self-defined model such as mycnnnet can be selected according to the application scene, so that the application can be conveniently extended.
b) Defining two inputs input _ X1, input _ X2 and multiplexing two usenets, namely convolutional neural network 1 and convolutional neural network 2 in fig. 3, which are the same network usenet, and sharing network parameters, namely: usenet1 (convolutional neural network 1), usenet2 (convolutional neural network 2).
Y is defined to indicate whether X1 and X2 are homologous: Y = 1 indicates a homologous pair, Y = 0 indicates a heterologous pair, and Y is known.
The two inputs, input _ X1 and input _ X2, are respectively input into usenet1 and usenet2, the outputs of the two networks are defined as output1 and output2, and output1 and output2 are the results of extracting features from the two convolutional networks, usenet1 and usenet 2.
c) Define a loss function loss and calculate the distance parameter dist between output1 and output2, i.e. the similarity of pictures X1 and X2 (the loss formula itself is given as an image in the original filing).
When Y is 1, training makes the loss, and hence the distance, smaller and smaller, meaning that the two pictures are similar;
when Y is 0, a larger loss corresponds to a larger distance between X1 and X2, indicating that the two pictures are different.
d) Define the gradient optimizer as Adam with a learning rate of 0.001; Adam is used to adjust the weights in the convolutional neural network through back-propagation. The gradient optimizer and its learning rate can be set as required.
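The loss formula itself is only published as an image; the sketch below therefore uses the standard contrastive loss, which matches the behavior described above (Y = 1 pulls output1 and output2 together, Y = 0 pushes them apart), together with shared network weights and the Adam optimizer. PyTorch, the placeholder network and the margin value are assumptions for illustration only.

import torch
import torch.nn as nn

class ContrastiveLoss(nn.Module):
    # assumed loss: small distance for homologous pairs (Y = 1),
    # distance pushed beyond a margin for heterologous pairs (Y = 0)
    def __init__(self, margin=1.0):
        super().__init__()
        self.margin = margin

    def forward(self, output1, output2, y):
        # y: float tensor of 0/1 labels, shape (batch,)
        dist = torch.norm(output1 - output2, dim=1)   # distance parameter dist
        loss = y * dist.pow(2) + (1 - y) * torch.clamp(self.margin - dist, min=0).pow(2)
        return loss.mean()

usenet = nn.Sequential(nn.Flatten(), nn.Linear(299 * 299 * 3, 256))  # placeholder for the real CNN
criterion = ContrastiveLoss()
optimizer = torch.optim.Adam(usenet.parameters(), lr=0.001)          # gradient optimizer Adam

def train_step(input_x1, input_x2, y):
    optimizer.zero_grad()
    out1, out2 = usenet(input_x1), usenet(input_x2)   # usenet1/usenet2 share the same parameters
    loss = criterion(out1, out2, y)
    loss.backward()                                   # reverse adjustment of the weights
    optimizer.step()
    return loss.item()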
(4) Training starts, specifically:
a) Set the storage path for model training results; if a previous training result exists, decide whether to load it directly;
b) take batches of training data from the training picture set according to the configured single-batch input size batch_size and feed them to input_X1 and input_X2;
c) after the configured number of training rounds over the training data is finished, input the test picture set for testing and display the test result.
(5) Model storage
And if the test is passed, storing a model training file according to a set storage path, and determining the current convolutional neural network as a preset feature extraction model.
Specifically, the customized mycnnnet structure in this example is shown in Fig. 4:
(1) input: the input data are preprocessed picture data, i.e. 299 × 299 × 3 floating point matrix data.
(2) conv1, conv2, …, conv6 are the convolutional layers, 6 in total; the convolution and pooling kernel stride of each layer is 1, the convolution kernel size of each layer is shown by the number inside the corresponding box in Fig. 4, and each pooling layer uses max pooling (maxpool).
(3) output: after the 6 convolutional layers, a 1 × 256 matrix is output, i.e. the feature data of the input picture, which is used to calculate the similarity between pictures.
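A rough PyTorch sketch of a mycnnnet-style extractor is given below: 6 stride-1 convolution blocks each followed by max pooling, a 299 × 299 × 3 input and a 256-dimensional output, as described; the channel counts and kernel sizes are assumptions and do not reproduce the exact values in Fig. 4.

import torch
import torch.nn as nn

class MyCnnNet(nn.Module):
    def __init__(self, out_dim=256):
        super().__init__()
        channels = [3, 32, 64, 96, 128, 192, 256]      # assumed channel progression
        blocks = []
        for i in range(6):                             # conv1 ... conv6
            blocks += [nn.Conv2d(channels[i], channels[i + 1], kernel_size=3, stride=1, padding=1),
                       nn.ReLU(inplace=True),
                       nn.MaxPool2d(kernel_size=2)]    # max pooling after each convolution
        self.features = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveAvgPool2d(1)            # collapse the remaining spatial dims
        self.fc = nn.Linear(channels[-1], out_dim)     # 256-dimensional feature output

    def forward(self, x):                              # x: (batch, 3, 299, 299)
        x = self.pool(self.features(x)).flatten(1)
        return self.fc(x)

# feats = MyCnnNet()(torch.randn(1, 3, 299, 299))      # -> shape (1, 256)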
As a second example, referring to fig. 5, the process of performing anomaly number recognition using the preset feature extraction model includes:
(1) reading of newly added pictures
The method specifically comprises: designating a user real-name picture storage directory each day and reading the newly added pictures in that directory within the preset period. The picture paths can be stored in the img_paths variable as an array; the picture files are named [num]_[timestamp].jpg,
where [num] may be an ID unique to the user, such as a mobile phone number, and [timestamp] is a timestamp.
(2) The data preprocessing specifically comprises the following steps:
a) Read each picture path in img_paths, read the picture data through the path, center and crop the picture according to the configured network input size so that it is suitable for convolutional network input, and store it in the panoramic picture array variable person_data.
b) Read each picture's data and perform face recognition, obtain the face data in the picture, crop and scale it according to the configured size, and store it in the face picture array variable face_data; if there are several faces in a picture, the largest one is selected.
Also acquire the background part of the data, crop and scale it according to the configured size, and store it in the background picture array variable bg_data.
Each array element of person_data, face_data and bg_data corresponds to the same user, i.e.
person_data[0] / face_data[0]
correspond to the panoramic picture and the face picture of the same user, where [0] is the unique index ID of the user.
The user ID is also obtained from the picture file name and stored in the array variable user_nums, whose array index corresponds to the same user as person_data and the other arrays.
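Steps (1) and (2) can be sketched as follows; the storage directory, the resize size and the use of OpenCV's Haar cascade as the face detector are assumptions, since the patent does not name a specific detector.

import glob, os
import cv2

img_paths = sorted(glob.glob("/data/realname_pics/*.jpg"))   # assumed storage directory
person_data, face_data, user_nums = [], [], []
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

for path in img_paths:
    user_nums.append(os.path.basename(path).split("_")[0])   # [num] part of the file name = user ID
    img = cv2.imread(path)
    person_data.append(cv2.resize(img, (299, 299)))          # panoramic picture
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, 1.1, 5)
    if len(faces) > 0:
        x, y, w, h = max(faces, key=lambda f: f[2] * f[3])   # keep the largest face
        face_data.append(cv2.resize(img[y:y + h, x:x + w], (299, 299)))
    else:
        face_data.append(None)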
3) Initializing a preset feature extraction model, specifically:
a) Start the panoramic convolutional neural network model according to the configuration and load the trained model parameters; it is defined as person_model.
Start the face convolutional neural network model and load the trained model parameters; it is defined as face_model.
Start the background convolutional neural network model and load the trained model parameters; it is defined as bg_model.
b) Input the panoramic picture data person_data into the panoramic convolutional neural network person_model to calculate the picture feature data person_features;
input the face picture data face_data into the face convolutional neural network face_model to calculate the face feature data face_features;
input the background picture data bg_data into the background convolutional neural network bg_model to calculate the picture feature data bg_features.
Similarly, person_features[0, 1, 2 … n], face_features[0, 1, 2 … n] and bg_features[0, 1, 2 … n] correspond to the panoramic, facial and background features of the same user.
Each feature is output as 512-dimensional data, i.e. person_features/face_features/bg_features are N × 512 data matrices.
4) And calculating the panoramic similarity, specifically:
and (3) circulating a panoramic feature array person _ features, acquiring feature data each time, and performing distance calculation on all features, wherein each feature data variable is defined as:
person_feats_one=person_feats[0],
calculate its distance to all features:
dist_one=(person_feats_one-person_feats)2
at this time, dist _ one is an n-dimensional array, that is, the distance between person [0] and all persons, the dist _ one is sorted from small to large according to the distance, and the dist and the user index id of the top10 of the rank are output, which is defined as:
[person_dist_top10,person_top10_index]。
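The top-10 ranking above can be written compactly with NumPy, mirroring the dist_one computation (squared differences summed per feature and sorted in ascending order); this is an illustrative sketch, not the patent's code.

import numpy as np

def top10_similar(person_features, index):
    person_feats_one = person_features[index]
    dist_one = np.sum((person_feats_one - person_features) ** 2, axis=1)
    person_top10_index = np.argsort(dist_one)[:10]        # user index ids of the top 10
    person_dist_top10 = dist_one[person_top10_index]
    return person_dist_top10, person_top10_index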
5) calculating the face similarity, specifically:
and circulating the face feature array face _ features, acquiring feature data each time, performing distance calculation on the feature data and all the features, and defining each feature data variable as follows:
face_feats_one=face_feats[0],
calculate its distance to all features:
facedist_one=(face_feats_one-face_feats)2
at this time, the face _ one is an n-dimensional array, that is, the distance between the face [0] and all the faces, the face _ one is sorted from small to large according to the distance, and the face and the user index id of the top10 of the rank are output, which is defined as:
[face_dist_top10,face_top10_index]。
(6) calculating the background similarity, specifically:
the method specifically comprises the following steps: circulating the scene feature arrays bg _ features of the network-accessing pictures of each channel, wherein the number of the scene feature arrays bg _ features is n, obtaining one feature data each time, carrying out distance calculation on the feature data and all the features, and defining each feature data variable as:
bg_feats_one=bg_feats[0],
calculate its distance to all features:
bgdist_one=(bg_feats_one-bg_feats)2
at this time, bg _ one is an n-dimensional array, that is, the distance between bg [0] and all bg, and the distance between each background feature and all other features forms a distance matrix bg _ dist _ mat, which is n × n.
And clustering the matrix according to a set threshold value by using the DBSCAN to obtain the number of the similar networking background classification and the number under each category, and outputting abnormal channels and numbers according to the classification number threshold value and the number threshold value.
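A hedged sketch of the background clustering step with scikit-learn's DBSCAN over the precomputed n × n distance matrix bg_dist_mat is shown below; the eps and min_samples values are assumptions standing in for the patent's "set threshold".

import numpy as np
from sklearn.cluster import DBSCAN

def cluster_backgrounds(bg_features, eps=0.5, min_samples=2):
    diff = bg_features[:, None, :] - bg_features[None, :, :]
    bg_dist_mat = np.sum(diff ** 2, axis=-1)                       # n x n distance matrix
    labels = DBSCAN(eps=eps, min_samples=min_samples, metric="precomputed").fit_predict(bg_dist_mat)
    # number of similar-background categories and the member count under each category
    counts = {int(label): int(np.sum(labels == label)) for label in set(labels) if label != -1}
    return labels, counts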
(7) The calculation of the suspected abnormal user specifically comprises the following steps:
a) Loop over the top-10 face distances face_dist_top10; if a distance is less than the specified threshold of 0.5, obtain the ID in face_top10_index and judge whether that ID is also among the panorama top 10 in person_dist_top10; if it is, identify the number as a suspected abnormal number.
b) Early-warn on suspected abnormal numbers: an early-warning threshold can be set as the number of accounts opened by a single user in the same period, and when the set number is reached, the numbers are output for early warning.
Early-warn on suspected abnormal channels: set a threshold on the count of abnormal numbers per channel, accumulate the account-opening channel information of the abnormal numbers, and when the accumulated count for a certain channel reaches the early-warning value, output the channel information for early warning.
(8) Outputting an abnormal number, specifically:
the abnormal user number has two output modes, one is output as a file book and stored as a file; and the other is picture output, namely pictures before the panorama 10 and the human face 10 can be output, and suspected users are marked in the pictures.
In the embodiment of the invention, the newly added picture of the first newly added number in the preset period is acquired and input into the preset feature extraction model to extract the target feature of the newly added picture; a similar picture that meets, with respect to the target feature, the preset requirement corresponding to the feature type is acquired, and the similar number to which the similar picture belongs is determined. Multiple similar numbers of the same user can be identified through facial features, and multiple similar numbers whose pictures were taken (or whose network access was handled) at the same place can be identified through background features and/or panoramic features. When the number of similar numbers reaches the preset threshold corresponding to the feature type, the first newly added number and the similar numbers are identified as abnormal numbers, so that abnormal numbers suspected of having been opened without the user's knowledge or under a borrowed identity are found automatically, without manual examination, and are prevented from being used in illegal activities such as card farming. The preset feature extraction model is a deep-learning convolutional neural network; features are extracted through convolution calculation and the similarity of the convolutional features is calculated comprehensively, giving high accuracy.
The abnormal number identification method provided by the embodiment of the present invention is described above, and the abnormal number identification apparatus provided by the embodiment of the present invention will be described below with reference to the accompanying drawings.
Referring to fig. 6, an embodiment of the present invention provides an abnormal number identification apparatus, where the apparatus includes:
the feature extraction module 601 is configured to acquire a newly added picture of a first newly added number in a preset period, input the newly added picture to a preset feature extraction model, and extract a target feature of the newly added picture; wherein the preset feature extraction model corresponds to a feature type of the target feature.
A number obtaining module 602, configured to obtain a similar picture of the newly added picture, and determine a similar number to which the similar picture belongs; the similar picture is a picture which meets the preset requirement corresponding to the feature type with the target feature.
A number identification module 603, configured to identify the first newly added number and the similar number as an abnormal number when the number of the similar numbers reaches a preset threshold corresponding to the feature type.
Optionally, in this embodiment of the present invention, the number obtaining module 602 includes:
the first obtaining sub-module is used for obtaining a first similar picture which meets a first preset similarity requirement with the target feature when the feature type is a facial feature;
determining a user identification number of a user to which the first newly added number belongs, and acquiring a second similar picture of the number included in the user identification number; and/or
And the second obtaining sub-module is used for obtaining a similar picture which meets a second preset similarity requirement with the target feature when the feature type is a background feature or a panoramic feature.
Optionally, in this embodiment of the present invention, when the feature type is a background feature or a panoramic feature, the similar number is a new number in the preset period;
the number recognition module 603 includes:
the place identification submodule is used for determining a place corresponding to the target feature;
if the place is a newly added place, when the number of the similar numbers reaches a first preset threshold value, identifying the first newly added number and the similar numbers as abnormal numbers; and/or
If the location is a non-newly added location, acquiring a location identification number of the location and a second preset threshold corresponding to the location identification number; the second preset threshold is a preset upper limit value of the newly added number in a single preset period of the location identification number;
and when the number of the similar numbers reaches the second preset threshold value, identifying the first newly added number and the similar numbers as abnormal numbers.
Optionally, in an embodiment of the present invention, the apparatus further includes:
the model creating module is used for creating a preset feature extraction model; the model creation module includes:
the picture acquisition sub-module is used for acquiring a first preset number of historical pictures;
the classification submodule is used for carrying out data preprocessing on the historical pictures and dividing the historical pictures into a training picture set and a test picture set according to a preset proportion;
the optimization submodule is used for circularly inputting the pictures in the training picture set into a preset convolutional neural network to obtain an output result and reversely optimizing the preset convolutional neural network;
the test sub-module is used for testing the current convolutional neural network through the test picture set when the circulating times reach a preset training turn;
and the determining submodule is used for determining the current convolutional neural network as a preset feature extraction model when the test is passed.
Optionally, in this embodiment of the present invention, the classification sub-module is configured to:
forming the historical pictures into a second preset number of homologous picture pairs and a third preset number of heterologous picture pairs; the image pair of the same source is two images from the same object, the image pair of the different source is two images from different objects, and the object is the object aimed at by the target feature;
respectively converting the homologous picture pair and the heterologous picture pair into floating point matrix data;
and respectively carrying out normalization processing on the floating-point matrix data to obtain sample matrix data.
Alternatively, in the embodiment of the present invention, when the feature type is a facial feature,
before the converting the pair of homologous pictures and the pair of heterologous pictures into floating point matrix data, the method further includes:
and respectively cutting and/or centering the homologous image pair and the heterologous image pair.
Optionally, in this embodiment of the present invention, the optimization submodule is configured to:
respectively sending the sample matrix data corresponding to the two pictures in the homologous picture pair or the heterologous picture pair to a preset convolutional neural network to obtain two output results;
calculating a distance parameter between the two output results, wherein the distance parameter is similarity;
and if the distance parameter does not meet the requirement of third preset similarity, reversely adjusting the weight in the preset convolutional neural network.
In the above embodiment of the present invention, the feature extraction module 601 acquires the newly added picture of the first newly added number in the preset period and inputs it into the preset feature extraction model to extract the target feature of the newly added picture; the number obtaining module 602 acquires a similar picture that meets, with respect to the target feature, the preset requirement corresponding to the feature type, and determines the similar number to which the similar picture belongs. Multiple similar numbers of the same user can be identified through facial features, and multiple similar numbers whose pictures were taken (or whose network access was handled) at the same place can be identified through background features and/or panoramic features. When the number of similar numbers reaches the preset threshold corresponding to the feature type, the number identification module 603 identifies the first newly added number and the similar numbers as abnormal numbers, so that abnormal numbers suspected of having been opened without the user's knowledge or under a borrowed identity are found automatically, without manual examination, and are prevented from being used in illegal activities such as card farming. The preset feature extraction model is a deep-learning convolutional neural network; features are extracted through convolution calculation and the similarity of the convolutional features is calculated comprehensively, giving high accuracy.
Fig. 7 is a schematic structural diagram of an electronic device according to yet another embodiment of the present invention.
Referring to fig. 7, an embodiment of the present invention provides an electronic device, which includes a memory (memory) 71, a processor (processor) 72, a bus 73, and a computer program stored in the memory 71 and executable on the processor 72. The memory 71 and the processor 72 communicate with each other through the bus 73.
The processor 72 is configured to call the program instructions in the memory 71 to implement the method as provided in the above-described embodiment of the present invention when the program is executed.
In another embodiment, when executing the program, the processor implements the following method:
acquiring a newly added picture of a first newly added number in a preset period, inputting the newly added picture into a preset feature extraction model, and extracting a target feature of the newly added picture; the preset feature extraction model corresponds to the feature type of the target feature;
acquiring a similar picture of the newly added picture, and determining a similar number to which the similar picture belongs; the similar picture is a picture whose similarity to the target feature meets the preset requirement corresponding to the feature type;
and when the number of the similar numbers reaches a preset threshold corresponding to the feature type, identifying the first newly added number and the similar numbers as abnormal numbers.
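To make the control flow concrete, a minimal sketch of this method follows; the feature index, the cosine similarity measure, and both thresholds are illustrative assumptions:

import numpy as np

def cosine_similarity(a, b):
    a, b = np.ravel(a), np.ravel(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def identify_abnormal_numbers(new_number, new_picture, feature_model,
                              feature_index, similarity_threshold, count_threshold):
    # feature_model is the preset feature extraction model for one feature type
    # (assumed to return a feature vector); feature_index maps an existing
    # number to the feature extracted from its picture
    target_feature = feature_model(new_picture)                # extract the target feature
    similar_numbers = {number for number, feature in feature_index.items()
                       if cosine_similarity(target_feature, feature) >= similarity_threshold}
    if len(similar_numbers) >= count_threshold:                # preset threshold for this feature type
        return {new_number} | similar_numbers                  # all identified as abnormal
    return set()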
The electronic device provided in the embodiment of the present invention may be configured to execute a program corresponding to the method in the foregoing method embodiment, and details of this implementation are not described again.
According to the electronic device provided by the embodiment of the present invention, a newly added picture of a first newly added number in a preset period is acquired, the newly added picture is input into a preset feature extraction model, and a target feature of the newly added picture is extracted; a similar picture whose similarity to the target feature meets the preset requirement corresponding to the feature type is acquired, and a similar number to which the similar picture belongs is determined. A plurality of similar numbers belonging to the same user can be identified through facial features, and a plurality of similar numbers whose pictures were taken (or whose network-access registration was handled) at the same place can be identified through background features and/or panoramic features. When the number of the similar numbers reaches the preset threshold corresponding to the feature type, the first newly added number and the similar numbers are identified as abnormal numbers. Abnormal numbers of suspected unknown accounts or name-borrowing accounts are thus found automatically, without manual examination, and are prevented from being used in illegal behaviors such as card maintenance. The preset feature extraction model is a deep-learning convolutional neural network: features are extracted through convolution calculation and the similarity of the convolution features is computed comprehensively, so the accuracy is high.
A further embodiment of the invention provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps in the method as provided in the above-described embodiments of the invention.
In another embodiment, the program, when executed by a processor, implements a method comprising:
acquiring a newly added picture of a first newly added number in a preset period, inputting the newly added picture into a preset feature extraction model, and extracting a target feature of the newly added picture; the preset feature extraction model corresponds to the feature type of the target feature;
acquiring a similar picture of the newly added picture, and determining a similar number to which the similar picture belongs; the similar picture is a picture whose similarity to the target feature meets the preset requirement corresponding to the feature type;
and when the number of the similar numbers reaches a preset threshold corresponding to the feature type, identifying the first newly added number and the similar numbers as abnormal numbers.
In the non-transitory computer-readable storage medium provided in the embodiment of the present invention, when the program is executed by the processor, the method in the above-described method embodiment is implemented, and details of this implementation are not described again.
According to the non-transitory computer-readable storage medium provided by the embodiment of the present invention, a newly added picture of a first newly added number in a preset period is acquired, the newly added picture is input into a preset feature extraction model, and a target feature of the newly added picture is extracted; a similar picture whose similarity to the target feature meets the preset requirement corresponding to the feature type is acquired, and a similar number to which the similar picture belongs is determined. A plurality of similar numbers belonging to the same user can be identified through facial features, and a plurality of similar numbers whose pictures were taken (or whose network-access registration was handled) at the same place can be identified through background features and/or panoramic features. When the number of the similar numbers reaches the preset threshold corresponding to the feature type, the first newly added number and the similar numbers are identified as abnormal numbers. Abnormal numbers of suspected unknown accounts or name-borrowing accounts are thus found automatically, without manual examination, and are prevented from being used in illegal behaviors such as card maintenance. The preset feature extraction model is a deep-learning convolutional neural network: features are extracted through convolution calculation and the similarity of the convolution features is computed comprehensively, so the accuracy is high.
Yet another embodiment of the present invention discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the program comprising program instructions which, when executed by a computer, enable the computer to perform the methods provided by the above-mentioned method embodiments, for example comprising:
acquiring a newly added picture of a first newly added number in a preset period, inputting the newly added picture into a preset feature extraction model, and extracting a target feature of the newly added picture; the preset feature extraction model corresponds to the feature type of the target feature;
acquiring a similar picture of the newly added picture, and determining a similar number to which the similar picture belongs; the similar picture is a picture whose similarity to the target feature meets the preset requirement corresponding to the feature type;
and when the number of the similar numbers reaches a preset threshold corresponding to the feature type, identifying the first newly added number and the similar numbers as abnormal numbers.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An abnormal number identification method, characterized in that the method comprises:
acquiring a newly added picture of a first newly added number in a preset period, inputting the newly added picture into a preset feature extraction model, and extracting a target feature of the newly added picture; the preset feature extraction model corresponds to the feature type of the target feature;
acquiring a similar picture of the newly added picture, and determining a similar number to which the similar picture belongs; the similar picture is a picture whose similarity to the target feature meets the preset requirement corresponding to the feature type;
and when the number of the similar numbers reaches a preset threshold corresponding to the feature type, identifying the first newly added number and the similar numbers as abnormal numbers.
2. The method according to claim 1, wherein the step of obtaining the similar picture of the newly added picture comprises:
when the feature type is a facial feature, acquiring a first similar picture whose similarity to the target feature meets a first preset similarity requirement; determining a user identification number of the user to whom the first newly added number belongs, and acquiring a second similar picture of a number included under the user identification number;
and/or
when the feature type is a background feature or a panoramic feature, acquiring a similar picture whose similarity to the target feature meets a second preset similarity requirement.
3. The method according to claim 1, wherein when the feature type is a background feature or a panoramic feature, the similar number is a new number in the preset period;
the step of identifying the first newly added number and the similar number as an abnormal number when the number of the similar numbers reaches a preset threshold corresponding to the feature type includes:
determining a place corresponding to the target feature;
if the place is a newly added place, when the number of the similar numbers reaches a first preset threshold value, identifying the first newly added number and the similar numbers as abnormal numbers;
and/or
if the place is not a newly added place, acquiring a place identification number of the place and a second preset threshold corresponding to the place identification number; the second preset threshold is a preset upper limit on the number of numbers newly added for the place identification number within a single preset period;
and when the number of the similar numbers reaches the second preset threshold value, identifying the first newly added number and the similar numbers as abnormal numbers.
4. The method according to claim 1, wherein before the step of obtaining the added picture of the first added number in the preset period, the method further comprises:
creating a preset feature extraction model;
the step of creating a preset feature extraction model includes:
acquiring a first preset number of historical pictures;
performing data preprocessing on the historical pictures, and dividing the historical pictures into a training picture set and a testing picture set according to a preset proportion;
iteratively inputting the pictures in the training picture set into a preset convolutional neural network to obtain output results, and optimizing the preset convolutional neural network through back-propagation;
when the number of iterations reaches a preset number of training rounds, testing the current convolutional neural network through the test picture set;
and when the test is passed, determining the current convolutional neural network to be the preset feature extraction model.
5. The method according to claim 4, wherein the step of pre-processing the historical picture comprises:
forming the historical pictures into a second preset number of homologous picture pairs and a third preset number of heterologous picture pairs; a homologous picture pair consists of two pictures of the same object, a heterologous picture pair consists of two pictures of different objects, and the object is the object to which the target feature pertains;
respectively converting the homologous picture pair and the heterologous picture pair into floating point matrix data;
and respectively carrying out normalization processing on the floating-point matrix data to obtain sample matrix data.
6. The method of claim 5, wherein when the feature type is a facial feature,
before the step of converting the pair of homologous pictures and the pair of heterologous pictures into floating-point matrix data, the method further includes:
and cropping and/or centering the homologous picture pairs and the heterologous picture pairs respectively.
7. The method according to claim 5, wherein the step of iteratively inputting the pictures in the training picture set into a preset convolutional neural network to obtain output results and optimizing the preset convolutional neural network through back-propagation comprises:
respectively sending the sample matrix data corresponding to the two pictures in the homologous picture pair or the heterologous picture pair to a preset convolutional neural network to obtain two output results;
calculating a distance parameter between the two output results, wherein the distance parameter represents their similarity;
and if the distance parameter does not meet a third preset similarity requirement, adjusting the weights of the preset convolutional neural network through back-propagation.
8. An abnormal number recognition apparatus, comprising:
the feature extraction module is used for acquiring a newly-added picture of a first newly-added number in a preset period, inputting the newly-added picture into a preset feature extraction model and extracting a target feature of the newly-added picture; the preset feature extraction model corresponds to the feature type of the target feature;
the number acquisition module is used for acquiring a similar picture of the newly added picture and determining a similar number to which the similar picture belongs; the similar picture is a picture whose similarity to the target feature meets the preset requirement corresponding to the feature type;
and the number identification module is used for identifying the first newly added number and the similar numbers as abnormal numbers when the number of the similar numbers reaches a preset threshold corresponding to the feature type.
9. An electronic device comprising a memory, a processor, a bus, and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the abnormal number identification method according to any one of claims 1 to 7 when executing the program.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, characterized in that: the program, when executed by a processor, implements the steps in the abnormal number identification method of any one of claims 1 to 7.
CN201810940504.XA 2018-08-17 2018-08-17 Abnormal number identification method and device Active CN110839242B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810940504.XA CN110839242B (en) 2018-08-17 2018-08-17 Abnormal number identification method and device

Publications (2)

Publication Number Publication Date
CN110839242A true CN110839242A (en) 2020-02-25
CN110839242B CN110839242B (en) 2023-07-04

Family

ID=69574356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810940504.XA Active CN110839242B (en) 2018-08-17 2018-08-17 Abnormal number identification method and device

Country Status (1)

Country Link
CN (1) CN110839242B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102081774A (en) * 2009-11-26 2011-06-01 ***通信集团广东有限公司 Card-raising identification method and system
US20170139911A1 (en) * 2014-06-19 2017-05-18 Zte Corporation Address book based picture matching method and terminal
CN107341509A (en) * 2017-06-29 2017-11-10 北京小米移动软件有限公司 The training method and device of convolutional neural networks
CN107609469A (en) * 2017-07-28 2018-01-19 北京建筑大学 Community network association user method for digging and system
CN107729727A (en) * 2016-08-11 2018-02-23 腾讯科技(深圳)有限公司 The real name identification method and device of a kind of account number

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113536848A (en) * 2020-04-17 2021-10-22 ***通信集团广东有限公司 Data processing method and device and electronic equipment
CN113536848B (en) * 2020-04-17 2024-03-19 ***通信集团广东有限公司 Data processing method and device and electronic equipment
CN113727351A (en) * 2020-05-12 2021-11-30 ***通信集团广东有限公司 Communication fraud identification method and device and electronic equipment
CN113727351B (en) * 2020-05-12 2024-03-19 ***通信集团广东有限公司 Communication fraud identification method and device and electronic equipment
CN111666434A (en) * 2020-05-26 2020-09-15 武汉大学 Streetscape picture retrieval method based on depth global features

Also Published As

Publication number Publication date
CN110839242B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
US11487995B2 (en) Method and apparatus for determining image quality
WO2021042828A1 (en) Neural network model compression method and apparatus, and storage medium and chip
CN110148120B (en) Intelligent disease identification method and system based on CNN and transfer learning
CN108921061B (en) Expression recognition method, device and equipment
CN110223292B (en) Image evaluation method, device and computer readable storage medium
US11429809B2 (en) Image processing method, image processing device, and storage medium
US20210334604A1 (en) Facial recognition method and apparatus
CN111814810A (en) Image recognition method and device, electronic equipment and storage medium
CN109871821B (en) Pedestrian re-identification method, device, equipment and storage medium of self-adaptive network
Bianco et al. Predicting image aesthetics with deep learning
CN110839242B (en) Abnormal number identification method and device
CN105469376A (en) Method and device for determining picture similarity
CN107169106A (en) Video retrieval method, device, storage medium and processor
CN109740679A (en) A kind of target identification method based on convolutional neural networks and naive Bayesian
CN110222718A (en) The method and device of image procossing
CN109815823B (en) Data processing method and related product
CN109344720B (en) Emotional state detection method based on self-adaptive feature selection
CN114139013A (en) Image searching method and device, electronic equipment and computer readable storage medium
CN113095333A (en) Unsupervised feature point detection method and unsupervised feature point detection device
CN110443181A (en) Face identification method and device
CN114299363A (en) Training method of image processing model, image classification method and device
CN115984930A (en) Micro expression recognition method and device and micro expression recognition model training method
CN116994021A (en) Image detection method, device, computer readable medium and electronic equipment
Gurrala et al. A new segmentation method for plant disease diagnosis
CN111797705A (en) Action recognition method based on character relation modeling

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant