CN109660676B - Abnormal object identification method, device and equipment - Google Patents

Abnormal object identification method, device and equipment

Info

Publication number
CN109660676B
Authority
CN
China
Prior art keywords
storage
behavior sequence
objects
data
remark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811184719.XA
Other languages
Chinese (zh)
Other versions
CN109660676A (en)
Inventor
彭求应
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd filed Critical Advanced New Technologies Co Ltd
Priority to CN201811184719.XA priority Critical patent/CN109660676B/en
Publication of CN109660676A publication Critical patent/CN109660676A/en
Application granted granted Critical
Publication of CN109660676B publication Critical patent/CN109660676B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/22 Arrangements for supervision, monitoring or testing
    • H04M3/2281 Call monitoring, e.g. for law enforcement purposes; Call tracing; Detection or prevention of malicious calls
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Technology Law (AREA)
  • Storage Device Security (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The embodiments of this specification provide a method, an apparatus, and a device for identifying abnormal objects. Static statistical data and a dynamic storage behavior sequence of each object are acquired according to storage information. The storage behavior sequence is encoded according to a preset encoding algorithm. The encoded storage behavior sequence is spliced with the statistical data to obtain spliced data of each object. The spliced data is input into a classifier to identify whether each object is an abnormal object.

Description

Abnormal object identification method, device and equipment
Technical Field
One or more embodiments of this specification relate to the field of risk control technologies, and in particular to a method, an apparatus, and a device for identifying abnormal objects.
Background
In recent years, with the rapid development of information technology, the underground (illicit) industry has kept updating its techniques, fraud methods have become more concealed, and the contest between attack and defense has intensified. Against this background, how to identify various abnormal objects so as to protect users has become a problem to be solved.
In the conventional technology, static statistics are generally compiled from behavior information about users' online behaviors (e.g., transaction behaviors), and abnormal users or abnormal behaviors are then identified from those statistics. However, such static data typically cannot accurately characterize an abnormal object.
Therefore, it is desirable to provide a more accurate and comprehensive identification scheme for abnormal objects.
Disclosure of Invention
One or more embodiments of the present specification describe a method, an apparatus, and a device for identifying an abnormal object, which may implement accurate identification of the abnormal object.
In a first aspect, a method for identifying an abnormal object is provided, including:
acquiring storage information generated when a plurality of users perform storage behaviors on the objects in their respective object sets;
acquiring static statistical data and a dynamic storage behavior sequence of each object according to the storage information;
coding the storage behavior sequence according to a preset coding algorithm;
splicing the coded storage behavior sequence with the statistical data to obtain spliced data of each object;
inputting the spliced data into a classifier to identify whether each object is an abnormal object.
In a second aspect, an apparatus for identifying an abnormal object is provided, including:
an acquiring unit, configured to acquire storage information generated when a plurality of users perform storage behaviors on the objects in their respective object sets;
the acquiring unit being further configured to acquire static statistical data and a dynamic storage behavior sequence of each object according to the storage information;
an encoding unit, configured to encode the storage behavior sequence acquired by the acquiring unit according to a preset encoding algorithm;
a splicing unit, configured to splice the storage behavior sequence encoded by the encoding unit with the statistical data to obtain spliced data of each object; and
an identifying unit, configured to input the spliced data obtained by the splicing unit into a classifier to identify whether each object is an abnormal object.
In a third aspect, a device for identifying an abnormal object is provided, including:
a memory;
one or more processors; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs when executed by the processors implement the steps of:
acquiring storage information generated when a plurality of users perform storage behaviors on the objects in their respective object sets;
acquiring static statistical data and a dynamic storage behavior sequence of each object according to the storage information;
coding the storage behavior sequence according to a preset coding algorithm;
splicing the coded storage behavior sequence with the statistical data to obtain spliced data of each object;
inputting the spliced data into a classifier to identify whether each object is an abnormal object.
According to the method, apparatus, and device for identifying abnormal objects provided by one or more embodiments of this specification, storage information generated when a plurality of users perform storage behaviors on the objects in their respective object sets is acquired. Static statistical data and a dynamic storage behavior sequence of each object are acquired according to the storage information. The storage behavior sequence is encoded according to a preset encoding algorithm. The encoded storage behavior sequence is spliced with the statistical data to obtain spliced data of each object. The spliced data is input into a classifier to identify whether each object is an abnormal object. The solution provided by this specification therefore identifies abnormal objects based on two kinds of data, static data and dynamic data, which can improve the accuracy of abnormal object identification.
Drawings
To illustrate the technical solutions of the embodiments of this specification more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of this specification, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a method for identifying an abnormal object provided in the present specification;
FIG. 2 is a flowchart of a method for identifying an abnormal object according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of a method of identifying fraudulent calls provided by the present specification;
FIG. 4 is a schematic diagram of an abnormal object recognition device according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of an abnormal object recognition device according to an embodiment of the present disclosure.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
Before introducing the method for identifying an abnormal object provided in one or more embodiments of the specification, the inventive concept of the method will be described.
First, identifying and combating telephone fraud remains both a priority and a difficulty in the field of risk control. This scheme therefore identifies abnormal telephone numbers. In addition, website addresses and apps, which share similar characteristics with phone numbers and which fraudsters can likewise use to carry out fraudulent activities, can also be identified.
Second, the conventional technology usually identifies abnormal objects based on static data. These static data can characterize certain historical behavior preferences of an abnormal object, or the statistical significance of its behaviors, and reveal the potential risk of those behaviors. However, they are typically statistics of individual behaviors. Counting a single behavior in isolation yields a noisy feature, because users have different motivations in different environments or scenarios. Identifying abnormal objects based solely on static data is therefore often inaccurate.
To improve the accuracy of abnormal object identification, this scheme combines the behaviors together so as to identify the motivation and risk corresponding to the combination of behaviors. A behavior sequence is better tied to context because it starts from the series of behaviors in which the user is involved. The scheme therefore uses the behavior sequence of an object as a feature when identifying abnormal objects.
A behavior sequence is a sequence in which the user's historical operation behaviors are arranged in chronological order. It contains the behavior events themselves and the order in which they occur within a given time window. For example, the behavior sequence over the past hour can be expressed as "A -> B -> C -> D", where A through D can represent the remark names under which different users stored a certain object. Note that although "B -> C -> A -> D" and "A -> B -> C -> D" contain the same behavior events, they are two completely different behavior patterns because the events occur in a different order.
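The point about ordering can be illustrated with a minimal sketch. The two sequences below are the ones from the example; the comparison itself is only an illustration and is not part of the patented method.

```python
from collections import Counter

# The two behavior sequences from the example, within the same time window.
seq_1 = ["A", "B", "C", "D"]
seq_2 = ["B", "C", "A", "D"]

# Viewed as unordered counts (what a purely statistical feature sees),
# the two sequences are indistinguishable.
same_events = Counter(seq_1) == Counter(seq_2)

# Viewed as ordered sequences, they are different behavior patterns.
same_pattern = seq_1 == seq_2

print(same_events, same_pattern)  # True False
```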
Finally, because this scheme also uses the behavior sequence as an input feature, a recurrent neural network (RNN) model, which is suited to input features with sequential characteristics, can be considered for classifying abnormal objects. In addition, the behavior sequence may contain the remark names of objects, which are text and cannot be processed directly as numerical input. The behavior sequence can therefore be encoded or vectorized, specifically by using a word vectorization algorithm (e.g., word2vec or cw2vec) or a text classification algorithm (fastText).
The above is the inventive concept of the solution provided in the present specification, and the solution provided in the present specification can be obtained based on the inventive concept. The solutions provided in this specification are further elaborated below:
FIG. 1 is a schematic diagram of an identification method of an abnormal object provided in this specification. In FIG. 1, storage information generated when N users perform storage behaviors on the objects in their respective object sets is first collected. The object may include, but is not limited to, a phone number, a website address, an app, and the like. Taking a phone number as the object, the corresponding object set may be an address book. The content of address books is a very important supplement to existing risk control data. The storage information may include, but is not limited to, the remark name and storage time of each object. Then, based on the collected storage information, static statistical data and a dynamic storage behavior sequence of each object are obtained. The storage behavior sequence is encoded according to a preset encoding algorithm. Finally, the encoded storage behavior sequence is spliced with the statistical data and input into a classifier to identify whether each object is an abnormal object.
FIG. 2 is a flowchart of an identification method for an abnormal object according to an embodiment of the present disclosure. The method may be executed by a device with processing capabilities. As shown in FIG. 2, the method may specifically include:
Step 202, obtaining storage information generated when a plurality of users perform storage behaviors on the objects in their respective object sets.
The object may include, but is not limited to, a phone number, a website address, an app, and the like. Taking a phone number as the object, the corresponding object set may be an address book. The storage information may include, but is not limited to, the remark name and storage time of each object. When the object is a telephone number, the storage information can be obtained from address books: the address books of a plurality of users are obtained first, and the storage information of each telephone number is then obtained from those address books. When the object is a website address, the corresponding storage information can be obtained from the browser.
It should be noted that, in addition to the storage information, this scheme may also obtain a device identifier (e.g., the International Mobile Equipment Identity (IMEI)) corresponding to the storage information. Furthermore, the correspondence between users, objects, and storage information may also be recorded.
Taking a phone number as the object, the correspondence can be as shown in Table 1.
TABLE 1
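[Table 1 appears as an image in the original publication and is not reproduced here; per the surrounding text, it records the correspondence between users, telephone numbers, and storage information (remark name and storage time).]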
It should be understood that the content of table 1 is only illustrative, and the correspondence provided by the embodiments of the present specification is not limited to the above. For example, table 1 may further include a device identifier, which is not limited in this specification.
It should be noted that step 202 may be performed periodically, so that by comparing the storage information of the same user in two consecutive periods it can be determined, for example, whether the user has deleted a certain object.
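As a minimal sketch of this period-to-period comparison, one user's stored objects in two consecutive periods can be compared with a set difference. The snapshot layout (a mapping from object to remark name) and the names `number_1`, `number_2`, and `deleted_objects` are assumptions made for illustration; the specification does not prescribe a data format.

```python
def deleted_objects(previous, current):
    """Return objects present in the previous snapshot but absent from the current one."""
    return set(previous) - set(current)


# Hypothetical snapshots of one user's address book: object -> remark name.
previous_snapshot = {"number_1": "cheater", "number_2": "Mom"}
current_snapshot = {"number_2": "Mom"}

removed = deleted_objects(previous_snapshot, current_snapshot)
print(removed)  # {'number_1'}
```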
Step 204, acquiring static statistical data and a dynamic storage behavior sequence of each object according to the storage information.
The static statistical data is obtained through statistics and data mining, and may include one or more of the following: the number of users who stored the object over the past several days, the number of days on which the object was stored over the past several days, whether the object was stored under a target name (e.g., "cheater") over the past several days, the number of times the object was deleted over the past several days, and so on.
Taking a phone number as the object, the static statistical data can be obtained from the contents of Table 1. The number of times a telephone number was deleted over the past several days can be obtained by comparing each user's address books from two consecutive periods.
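The static statistics listed above could be computed along the following lines. This is a sketch under stated assumptions: the record layout (`StorageRecord`), the feature names, the 30-day window, and the target-name set are all hypothetical, and the deletion count is assumed to be supplied by a separate period-to-period comparison.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class StorageRecord:
    """One row of the user/object/storage-information correspondence (hypothetical layout)."""
    user: str
    obj: str
    remark: str
    stored_on: date


def static_stats(records, obj, today, window_days, target_names, deletions=0):
    """Compute the static statistics of one object over the past `window_days` days."""
    start = today - timedelta(days=window_days)
    recent = [r for r in records if r.obj == obj and start <= r.stored_on <= today]
    return {
        "n_users": len({r.user for r in recent}),          # distinct users who stored it
        "n_days": len({r.stored_on for r in recent}),      # distinct days on which it was stored
        "stored_as_target": int(any(r.remark in target_names for r in recent)),
        "n_deletions": deletions,                          # supplied by snapshot comparison
    }


records = [
    StorageRecord("user_A", "number_1", "cheater", date(2017, 9, 1)),
    StorageRecord("user_B", "number_1", "fraudster", date(2017, 9, 10)),
    StorageRecord("user_C", "number_1", "fraudster", date(2017, 9, 22)),
    StorageRecord("user_A", "number_2", "Mom", date(2017, 9, 3)),
]

stats = static_stats(records, "number_1", date(2017, 9, 30), 30, {"cheater"})
print(stats)
```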
Because static statistical data alone cannot fully describe the characteristics of abnormal objects, the scheme also obtains a dynamic storage behavior sequence of each object, so as to improve the accuracy and coverage of abnormal object identification. The storage behavior sequence more accurately captures the pattern formed by a combination of behaviors of an abnormal object, and is more accurate than statistical features of individual behaviors.
In an implementation, obtaining the dynamic storage behavior sequence of each object according to the storage information may include:
For each of the objects, selecting the remark names and storage times of that object from the remark names and storage times of all the objects; sorting the remark names of the object according to their storage times; and generating the storage behavior sequence of the object from the sorted remark names.
For example, assume that the phone number 186 xxx is stored by three different users in their respective address books: user A stored the phone number on September 1, 2017, under the remark name "cheater"; user B stored it on September 10, 2017, under the remark name "fraudster"; and user C stored it on September 22, 2017, under the remark name "fraudster". Sorted by storage time, the remark names are: cheater, fraudster, fraudster. From this ordering, the storage behavior sequence of the phone number can be generated: cheater -> fraudster -> fraudster.
Therefore, the storage behavior sequence in the scheme reflects the sequence information of the object stored by the user and the content information, so that the characteristics of the object can be more accurately described.
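The select, sort, and generate steps can be sketched as follows, using the three records of users A, B, and C from the example. The tuple layout and the placeholder identifiers `number_1` and `number_2` are assumptions for illustration only.

```python
from datetime import date

# Hypothetical storage records: (user, object, remark name, storage time).
records = [
    ("user_B", "number_1", "fraudster", date(2017, 9, 10)),
    ("user_C", "number_1", "fraudster", date(2017, 9, 22)),
    ("user_A", "number_1", "cheater", date(2017, 9, 1)),
    ("user_A", "number_2", "Mom", date(2017, 9, 3)),
]


def storage_behavior_sequence(records, obj):
    """Select one object's remark names and order them by storage time."""
    selected = [(stored_on, remark) for _, o, remark, stored_on in records if o == obj]
    selected.sort(key=lambda pair: pair[0])  # chronological order
    return [remark for _, remark in selected]


sequence = storage_behavior_sequence(records, "number_1")
print(" -> ".join(sequence))  # cheater -> fraudster -> fraudster
```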
It will be appreciated that the above example generates the storage behavior sequence of an object from the user dimension. In practical applications, the storage behavior sequence may also be generated from other dimensions (e.g., the device dimension), which is not limited in this specification. When the storage behavior sequence of the object is generated from the device dimension, the device identifier may be obtained together with the storage information; the sequence is then generated in the same manner as described above, which is not repeated here.
Step 206, encoding the storage behavior sequence according to a preset encoding algorithm.
The preset encoding algorithm may include, but is not limited to, a word vectorization algorithm (e.g., word2vec or cw2vec), a text classification algorithm (fastText), and the like.
Taking the storage behavior sequence generated above as an example, the remark names in the sequence are essentially character strings, so each string can be converted into a corresponding vector by the preset encoding algorithm. The vectors are then concatenated to obtain the encoded storage behavior sequence.
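The convert-then-concatenate step might look like the sketch below. Note the loud assumption: `toy_embed` is a hash-based stand-in, not word2vec, cw2vec, or fastText; a real system would look up vectors from a trained embedding model. The embedding size, the fixed sequence length, and the zero-padding are also assumptions that the specification does not state.

```python
import hashlib

DIM = 4  # hypothetical embedding size (must not exceed 32 for this toy embedding)


def toy_embed(remark, dim=DIM):
    """Map a string to a fixed-length vector in [0, 1).

    Stand-in for a trained embedding; it carries no semantic meaning.
    """
    digest = hashlib.sha256(remark.encode("utf-8")).digest()
    return [digest[i] / 256 for i in range(dim)]


def encode_sequence(sequence, max_len=3, dim=DIM):
    """Embed each remark name and concatenate, truncating or zero-padding to max_len items."""
    vectors = [toy_embed(remark, dim) for remark in sequence[:max_len]]
    vectors += [[0.0] * dim] * (max_len - len(vectors))
    return [x for vector in vectors for x in vector]


encoded = encode_sequence(["cheater", "fraudster", "fraudster"])
print(len(encoded))  # 12
```

Identical remark names map to identical vectors, so the second and third slices of `encoded` are equal in this example.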
Step 208, splicing the encoded storage behavior sequence with the statistical data to obtain spliced data of each object.
Since static statistics are usually some numbers, such as the number of times of storage and the number of times of deletion, they can be directly input into the classifier. Therefore, the coded storage behavior sequence can be directly spliced with the statistical data, and spliced data can be obtained. It is understood that the spliced data is a multidimensional vector.
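Splicing amounts to concatenating the two vectors. In the sketch below, the feature names, their order, and the example values are hypothetical; what matters is only that the order of the static statistics stays fixed across objects.

```python
# Hypothetical fixed order of the static statistics within the spliced vector.
STAT_ORDER = ["n_users", "n_days", "stored_as_target", "n_deletions"]


def splice(encoded_sequence, stats, stat_order=STAT_ORDER):
    """Concatenate the encoded sequence with the numeric static statistics."""
    return list(encoded_sequence) + [float(stats[name]) for name in stat_order]


encoded_sequence = [0.1, 0.2, 0.3, 0.4]  # placeholder for an encoded storage behavior sequence
stats = {"n_users": 3, "n_days": 3, "stored_as_target": 1, "n_deletions": 0}

spliced = splice(encoded_sequence, stats)
print(spliced)  # [0.1, 0.2, 0.3, 0.4, 3.0, 3.0, 1.0, 0.0]
```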
Step 210, inputting the spliced data into a classifier to identify whether each object is an abnormal object.
In one implementation, the classifier may be an RNN model, a Long Short-Term Memory (LSTM) network model, or the like. Specifically, after the spliced data is input into the classifier, the classifier may output the probability that each object is an abnormal object and the probability that it is not. Based on these two probabilities, it can be identified whether each object is an abnormal object.
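Only the final decision step is sketched here. The two raw scores are hypothetical placeholders for what a trained RNN or LSTM would output for one object, and the softmax normalization and the 0.5 threshold are assumptions; the specification states only that two probabilities are output and used for the decision.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 (numerically stable form)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decide(scores, threshold=0.5):
    """scores = [abnormal score, normal score]; returns both probabilities and the decision."""
    p_abnormal, p_normal = softmax(scores)
    return p_abnormal, p_normal, p_abnormal > threshold


# Hypothetical classifier output for one object.
p_abnormal, p_normal, is_abnormal = decide([2.0, 0.0])
print(round(p_abnormal, 3), round(p_normal, 3), is_abnormal)
```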
In summary, this scheme provides a method for identifying abnormal objects that incorporates the storage behavior sequence of the object. The storage behavior sequence directly reflects the fraudster's modus operandi and helps strategy analysts analyze fraud techniques more conveniently, improving their efficiency. In addition, the scheme takes the whole storage behavior sequence (including the order of behaviors) as the object of study to characterize the behavior of abnormal objects. This enriches the fraud behavior features available to the risk control system and provides more effective information for feature characterization. In particular, introducing the storage behavior sequence of telephone numbers as a feature in telephone fraud scenarios can significantly improve the accuracy and coverage of fraud phone identification.
In the method for identifying an abnormal object provided in one or more embodiments of this specification, two kinds of data are obtained for each object: static data and dynamic data. The dynamic and static data are fused and processed into features, which accurately captures the pattern formed by a combination of behaviors of an abnormal object and is more accurate than statistical features of individual behaviors.
The identification process is described below taking a telephone number as the object. Because an address book usually contains various kinds of information, the identification of fraud phones is described based on the content of address books.
Fig. 3 is a flow chart of a method for identifying fraudulent calls provided by the present specification. As shown in fig. 3, the method may include the steps of:
Step 302, obtaining the address books of a plurality of users.
The address book may include information such as a telephone number, a remark name, and storage time of the contact. The remark name and the storage time may be collectively referred to as storage information corresponding to the telephone number.
After the address books of a plurality of users are acquired, the correspondence shown in Table 1 may be established. In addition, Table 1 may also include device identifiers and the like, which is not limited in this specification.
It should be noted that step 302 may be performed periodically, so that by comparing the address books of the same user in two consecutive periods it can be determined whether the user has deleted a certain telephone number.
Step 304, acquiring static statistical data and a dynamic storage behavior sequence of each telephone number according to the content of the address books.
The static statistical data is obtained through statistics and data mining, and may include one or more of the following: the number of users who stored the phone number over the past several days, the number of days on which the phone number was stored over the past several days, the number of times the phone number was deleted over the past several days, whether the phone number was stored as "cheater" over the past several days, and so on. Specifically, the static statistical data may be acquired from the contents of Table 1. The number of times a telephone number was deleted over the past several days can be obtained by comparing each user's address books from two consecutive periods.
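Counting deletions per telephone number across all users could be sketched as follows. The snapshot layout (user mapped to a set of stored numbers) and the choice to skip users who have no snapshot in the later period are assumptions made for illustration.

```python
from collections import Counter


def deletion_counts(previous_books, current_books):
    """Count, per phone number, how many users removed it between two periods."""
    counts = Counter()
    for user, previous_numbers in previous_books.items():
        if user not in current_books:
            continue  # no later snapshot for this user: cannot infer deletions (assumption)
        counts.update(previous_numbers - current_books[user])
    return counts


# Hypothetical address-book snapshots: user -> set of stored phone numbers.
previous_books = {"user_A": {"number_1", "number_2"}, "user_B": {"number_1"}, "user_C": {"number_1"}}
current_books = {"user_A": {"number_2"}, "user_B": set()}

counts = deletion_counts(previous_books, current_books)
print(counts["number_1"], counts["number_2"])  # 2 0
```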
This embodiment also obtains a dynamic storage behavior sequence for each telephone number, because static statistical data alone does not make full use of the address book content. The storage behavior sequence is better tied to context: starting from the series of behaviors in which the user is involved, it considers the motivation and risk corresponding to the combination of behaviors as a whole. Like the statistical data, the storage behavior sequence can capture historical behavior preferences of a fraud phone or the statistical significance of its behaviors, and reveal the potential risk of those behaviors. The difference is that the storage behavior sequence more accurately captures the pattern formed by a combination of behaviors associated with fraud phones, and is more accurate than statistical features of individual behaviors.
In an implementation, obtaining the dynamic storage behavior sequence of a phone number according to the content of the address books may include:
Selecting the remark names and storage times of the telephone number from the address books of the plurality of users; sorting the remark names of the telephone number according to their storage times; and generating the storage behavior sequence of the telephone number from the sorted remark names.
It can be understood that, with reference to the above obtaining method, a dynamic storage behavior sequence of each phone number in the address list of multiple users can be obtained.
For example, assume that the phone number 186 xxx is stored by three different users in their respective address books: user A stored the phone number on September 1, 2017, under the remark name "cheater"; user B stored it on September 10, 2017, under the remark name "fraudster"; and user C stored it on September 22, 2017, under the remark name "fraudster". Sorted by storage time, the remark names are: cheater, fraudster, fraudster. From this ordering, the storage behavior sequence of the phone number can be generated: cheater -> fraudster -> fraudster.
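Building a sequence for every phone number across several users' address books can be sketched by grouping entries by number and sorting each group by storage time. The address-book layout, the placeholder numbers, and the extra entries for `number_2` are hypothetical; ties in storage time are broken alphabetically by remark name here, which is an arbitrary choice.

```python
from collections import defaultdict
from datetime import date

# Hypothetical address books: user -> list of (phone number, remark name, storage time).
address_books = {
    "user_A": [("number_1", "cheater", date(2017, 9, 1)), ("number_2", "Mom", date(2017, 9, 3))],
    "user_B": [("number_1", "fraudster", date(2017, 9, 10))],
    "user_C": [("number_1", "fraudster", date(2017, 9, 22)), ("number_2", "Aunt", date(2017, 9, 2))],
}


def sequences_by_number(address_books):
    """Group entries by phone number, then order each group's remark names by storage time."""
    grouped = defaultdict(list)
    for entries in address_books.values():
        for number, remark, stored_on in entries:
            grouped[number].append((stored_on, remark))
    return {number: [remark for _, remark in sorted(pairs)] for number, pairs in grouped.items()}


sequences = sequences_by_number(address_books)
print(sequences["number_1"])  # ['cheater', 'fraudster', 'fraudster']
```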
It will be appreciated that in the present scenario a sequence of stored actions for a telephone number is generated from the user's dimensions. Of course, in practical applications, the storage behavior sequence may also be generated from other dimensions (e.g., dimensions of the device), which is not limited in this specification. When the storage behavior sequence of the phone number is generated from the dimension of the device, the address book may be acquired and the device identifier may be acquired at the same time, and the generation manner of the storage behavior sequence of the dimension may be as described above, which is not repeated herein.
Step 306, encoding the storage behavior sequence according to a preset encoding algorithm.
The preset encoding algorithm may include, but is not limited to, a word vectorization algorithm (e.g., word2vec or cw2vec), a text classification algorithm (fastText), and the like.
Taking the storage behavior sequence generated above as an example, the remark names in the sequence are essentially character strings, so each string can be converted into a corresponding vector by the preset encoding algorithm. The vectors are then concatenated to obtain the encoded storage behavior sequence.
Step 308, splicing the encoded storage behavior sequence with the statistical data to obtain spliced data of each telephone number.
Since static statistics are usually some numbers, such as the number of times of storage and the number of times of deletion, they can be directly input into the classifier. Therefore, the coded storage behavior sequence can be directly spliced with the statistical data, and spliced data can be obtained. It is understood that the spliced data is a multidimensional vector.
Step 310, the spliced data is input into a classifier to identify whether each phone number is a fraud phone.
In one implementation, the classifier may be an RNN model or a Long Short-Term Memory network (LSTM) model or the like. Specifically, after the spliced data is input into the classifier, the probability that each telephone number is a fraudulent telephone and the probability that it is not a fraudulent telephone can be output. Based on the two probabilities, it can be identified whether the respective telephone numbers are fraudulent calls.
Embodiments of this specification propose a method for identifying fraud phones that uses, as a feature, the sequence of behaviors by which users store telephone numbers. Its main characteristic is that the whole behavior sequence of storing a telephone number (including the order of behaviors) is taken as the object of study to characterize the behavior of fraudsters. This method can significantly improve the accuracy and coverage of fraud phone identification.
In correspondence to the above method for identifying an abnormal object, an embodiment of the present specification further provides an apparatus for identifying an abnormal object, as shown in fig. 4, the apparatus may include:
an obtaining unit 402, configured to obtain storage information generated when a plurality of users perform a storage action on each object in the respective object sets.
The object herein may include any one of: phone number, website address, and app, etc.
The obtaining unit 402 is further configured to obtain static statistical data and a dynamic storage behavior sequence of each object according to the storage information.
The static statistical data may include one or more of the following: the number of users who stored the object over the past several days, the number of days on which the object was stored over the past several days, whether the object was stored under a target name over the past several days, and the number of times the object was deleted over the past several days.
The storage information may include the remark names and storage times of the respective objects.
The obtaining unit 402 may specifically be configured to:
For each of the objects, selecting the remark names and storage times of that object from the remark names and storage times of all the objects.
And sorting the remark names of the objects according to the storage time of the objects.
And generating a storage behavior sequence of the object according to the sequenced remark names.
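The three steps above (screening, sorting by storage time, and generating the sequence) can be sketched in Python as follows. The tuple layout `(object, remark_name, storage_time)` and the sample data are assumptions for illustration only.

```python
def storage_behavior_sequence(storage_info, obj):
    """Build the storage behavior sequence for one object.

    `storage_info` is a hypothetical list of
    (object, remark_name, storage_time) tuples pooled from all users.
    """
    # Step 1: screen out the entries that belong to this object.
    own = [(remark, t) for (o, remark, t) in storage_info if o == obj]
    # Step 2: sort the remark names by storage time, earliest first.
    own.sort(key=lambda pair: pair[1])
    # Step 3: the ordered remark names form the behavior sequence.
    return [remark for remark, _ in own]


storage_info = [
    ("555-0100", "courier", 3),
    ("555-0199", "Mom", 1),
    ("555-0100", "bank staff", 1),
    ("555-0100", "fraud", 5),
]
sequence = storage_behavior_sequence(storage_info, "555-0100")
```

Because the sequence keeps the order in which users stored the object, a drift in remark names over time (for example from "bank staff" to "fraud") remains visible to the downstream classifier.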
an encoding unit 404, configured to encode the storage behavior sequence obtained by the obtaining unit 402 according to a preset encoding algorithm.
The preset encoding algorithm may include any one of: word vectorization algorithms, text classification algorithms, and the like.
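As a simplified stand-in for a word vectorization algorithm, the following Python sketch maps each remark name to an integer index and pads or truncates the sequence to a fixed length. The reserved indices, the fixed length of 4, and the function names are assumptions for the example; a real word vectorization step would typically map each index further to a dense vector, for example through an embedding layer.

```python
PAD, UNK = 0, 1  # reserved indices: padding and unknown remark names


def build_vocabulary(sequences):
    """Assign each distinct remark name an integer index, in first-seen order."""
    vocab = {}
    for seq in sequences:
        for name in seq:
            vocab.setdefault(name, len(vocab) + 2)  # 0 and 1 are reserved
    return vocab


def encode_sequence(seq, vocab, max_len=4):
    """Encode a behavior sequence as a fixed-length list of indices."""
    # Keep only the most recent `max_len` entries of the sequence.
    ids = [vocab.get(name, UNK) for name in seq][-max_len:]
    # Right-pad short sequences so every object has the same length.
    return ids + [PAD] * (max_len - len(ids))


vocab = build_vocabulary([["bank staff", "courier", "fraud"], ["Mom"]])
encoded = encode_sequence(["bank staff", "courier", "fraud"], vocab)
```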
a splicing unit 406, configured to splice the storage behavior sequence encoded by the encoding unit 404 with the statistical data to obtain spliced data of each object.
an identifying unit 408, configured to input the spliced data obtained by the splicing unit 406 into a classifier to identify whether each object is an abnormal object.
The functions of the functional modules of the apparatus in the above embodiments of the present specification may be implemented through the steps of the above method embodiments; therefore, the specific working process of the apparatus provided in one embodiment of the present specification is not repeated here.
In the apparatus for identifying an abnormal object provided in one embodiment of the present specification, the obtaining unit 402 obtains storage information generated when a plurality of users perform storage behaviors on the objects in their respective object sets. The obtaining unit 402 obtains static statistical data and a dynamic storage behavior sequence of each object according to the storage information. The encoding unit 404 encodes the storage behavior sequence according to a preset encoding algorithm. The splicing unit 406 splices the encoded storage behavior sequence with the statistical data to obtain spliced data of each object. The identifying unit 408 inputs the spliced data into a classifier to identify whether each object is an abnormal object. This can improve the accuracy of identifying abnormal objects.
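The splicing and identification steps can be sketched in Python as follows. Splicing is shown as plain concatenation of the two feature vectors. The classifier here is a placeholder linear model with a logistic output that merely stands in for the RNN or LSTM classifier; its weights, bias, and threshold are made-up values for illustration, not trained parameters.

```python
import math


def splice(encoded_sequence, statistics):
    """Concatenate the encoded sequence and the static statistics."""
    return ([float(x) for x in encoded_sequence]
            + [float(x) for x in statistics])


def classify(features, weights, bias=0.0, threshold=0.5):
    """Placeholder classifier: linear score followed by a logistic function.

    `weights`, `bias` and `threshold` are hypothetical values.
    """
    score = sum(w * x for w, x in zip(weights, features)) + bias
    p_abnormal = 1.0 / (1.0 + math.exp(-score))
    return p_abnormal >= threshold, p_abnormal


encoded_sequence = [2, 3, 4, 0]   # assumed output of the encoding step
statistics = [2, 2, True, 1]      # users, days, target-name flag, deletions
spliced = splice(encoded_sequence, statistics)

# Hypothetical weights that look only at the last two statistics.
weights = [0.0] * 6 + [1.0, 1.0]
abnormal, p_abnormal = classify(spliced, weights, bias=-1.0)
```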
Corresponding to the above method for identifying an abnormal object, an embodiment of the present specification further provides a device for identifying an abnormal object. As shown in Fig. 5, the device may include a memory 502, one or more processors 504, and one or more programs, wherein the one or more programs are stored in the memory 502 and configured to be executed by the one or more processors 504, and the programs, when executed by the processors 504, implement the following steps:
acquiring storage information generated when a plurality of users perform storage behaviors on the objects in their respective object sets;
acquiring static statistical data and a dynamic storage behavior sequence of each object according to the storage information;
encoding the storage behavior sequence according to a preset encoding algorithm;
splicing the encoded storage behavior sequence with the statistical data to obtain spliced data of each object; and
inputting the spliced data into a classifier to identify whether each object is an abnormal object.
The device for identifying an abnormal object provided in one embodiment of the present specification can improve the accuracy of identifying abnormal objects.
The embodiments in the present specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant details, reference may be made to the corresponding description of the method embodiment.
The steps of a method or algorithm described in connection with the disclosure herein may be embodied in hardware or may be embodied in software instructions executed by a processor. The software instructions may consist of corresponding software modules that may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC. Additionally, the ASIC may reside in a server. Of course, the processor and the storage medium may also reside as discrete components in a server.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The specific embodiments above further describe in detail the objects, technical solutions, and advantages of the present specification. It should be understood that the above are only specific embodiments of the present specification and are not intended to limit its scope of protection; any modification, equivalent substitution, improvement, and the like made on the basis of the technical solutions of the present specification shall fall within the scope of protection of the present specification.

Claims (9)

1. A method for identifying an abnormal object, comprising:
acquiring storage information generated when a plurality of users execute storage behaviors aiming at each object in each object set;
acquiring static statistical data and dynamic storage behavior sequences of each object according to the storage information; the static statistical data is obtained by means of statistics and data mining and is used for characterizing the object;
coding the storage behavior sequence according to a preset coding algorithm;
splicing the coded storage behavior sequence with the statistical data to obtain spliced data of each object;
inputting the spliced data into a classifier to identify whether each object is an abnormal object;
the storage information comprises remark names and storage times of the objects;
the acquiring the dynamic storage behavior sequence of each object according to the storage information includes:
for each object in the objects, screening out the remark names and storage times of the object from the remark names and storage times of the objects;
sorting the remark names of the object according to the storage times of the object;
and generating a storage behavior sequence of the object according to the sorted remark names.
2. The method of claim 1, the static statistics comprising one or more of: the number of users who stored the object in the past several days, the number of days on which the object was stored in the past several days, whether the object was stored under a target name in the past several days, and the number of times the object was deleted in the past several days.
3. The method of claim 1, the preset encoding algorithm comprising any one of: word vectorization algorithms and text classification algorithms.
4. The method according to any one of claims 1-3, the object comprising any one of: phone number, website address, and app.
5. An apparatus for identifying an abnormal object, comprising:
an obtaining unit, configured to obtain storage information generated when a plurality of users perform storage behaviors on each object in their respective object sets;
the obtaining unit being further configured to obtain static statistical data and a dynamic storage behavior sequence of each object according to the storage information, wherein the static statistical data is obtained by means of statistics and data mining and is used for characterizing the object;
an encoding unit, configured to encode the storage behavior sequence obtained by the obtaining unit according to a preset encoding algorithm;
a splicing unit, configured to splice the storage behavior sequence encoded by the encoding unit with the statistical data to obtain spliced data of each object;
an identifying unit, configured to input the spliced data obtained by the splicing unit into a classifier to identify whether each object is an abnormal object;
the storage information comprises remark names and storage times of the objects;
the obtaining unit is specifically configured to:
for each object in the objects, screen out the remark names and storage times of the object from the remark names and storage times of the objects;
sort the remark names of the object according to the storage times of the object;
and generate a storage behavior sequence of the object according to the sorted remark names.
6. The apparatus of claim 5, the static statistics comprising one or more of: the number of users who stored the object in the past several days, the number of days on which the object was stored in the past several days, whether the object was stored under a target name in the past several days, and the number of times the object was deleted in the past several days.
7. The apparatus of claim 5, the preset encoding algorithm comprising any one of: word vectorization algorithms and text classification algorithms.
8. The apparatus according to any one of claims 5-7, the object comprising any one of: phone number, website address, and app.
9. An identification device of an abnormal object, comprising:
a memory;
one or more processors; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs when executed by the processors implement the steps of:
acquiring storage information generated when a plurality of users execute storage behaviors aiming at each object in each object set;
acquiring static statistical data and dynamic storage behavior sequences of each object according to the storage information; the static statistical data is obtained by means of statistics and data mining and is used for characterizing the object;
coding the storage behavior sequence according to a preset coding algorithm;
splicing the coded storage behavior sequence with the statistical data to obtain spliced data of each object;
inputting the spliced data into a classifier to identify whether each object is an abnormal object;
the storage information comprises remark names and storage times of the objects;
the acquiring the dynamic storage behavior sequence of each object according to the storage information includes:
for each object in the objects, screening out the remark names and storage times of the object from the remark names and storage times of the objects;
sorting the remark names of the object according to the storage times of the object;
and generating a storage behavior sequence of the object according to the sorted remark names.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811184719.XA CN109660676B (en) 2018-10-11 2018-10-11 Abnormal object identification method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811184719.XA CN109660676B (en) 2018-10-11 2018-10-11 Abnormal object identification method, device and equipment

Publications (2)

Publication Number Publication Date
CN109660676A CN109660676A (en) 2019-04-19
CN109660676B true CN109660676B (en) 2021-03-19

Family

ID=66110010

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811184719.XA Active CN109660676B (en) 2018-10-11 2018-10-11 Abnormal object identification method, device and equipment

Country Status (1)

Country Link
CN (1) CN109660676B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112399013B (en) * 2019-08-15 2021-12-03 中国电信股份有限公司 Abnormal telephone traffic identification method and device
CN113449523B (en) * 2021-06-29 2024-05-24 京东科技控股股份有限公司 Method and device for determining abnormal address text, electronic equipment and storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
CN102497479A (en) * 2011-12-16 2012-06-13 深圳市金立通信设备有限公司 Method for smart phone to judge Trojan programs according to application software behaviors
CN105744035A (en) * 2014-12-08 2016-07-06 北京奇虎科技有限公司 Mobile communication terminal harassing communication interception method and mobile communication terminal
CN107040494A (en) * 2015-07-29 2017-08-11 深圳市腾讯计算机***有限公司 User account exception prevention method and system
CN107886344A (en) * 2016-09-30 2018-04-06 北京金山安全软件有限公司 Convolutional neural network-based cheating advertisement page identification method and device

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20170206462A1 (en) * 2016-01-14 2017-07-20 International Business Machines Corporation Method and apparatus for detecting abnormal contention on a computer system

Also Published As

Publication number Publication date
CN109660676A (en) 2019-04-19

Similar Documents

Publication Publication Date Title
CN110198310B (en) Network behavior anti-cheating method and device and storage medium
CN110442712B (en) Risk determination method, risk determination device, server and text examination system
CN110263916B (en) Data processing method and device, storage medium and electronic device
CN113383362B (en) User identification method and related product
CN108287823B (en) Message data processing method and device, computer equipment and storage medium
CN110955874A (en) Identity authentication method, identity authentication device, computer equipment and storage medium
CN110674144A (en) User portrait generation method and device, computer equipment and storage medium
CN111339436A (en) Data identification method, device, equipment and readable storage medium
CN111090807A (en) Knowledge graph-based user identification method and device
CN114693192A (en) Wind control decision method and device, computer equipment and storage medium
CN111783415B (en) Template configuration method and device
CN109660676B (en) Abnormal object identification method, device and equipment
CN108846292B (en) Desensitization rule generation method and device
CN112528166A (en) User relationship analysis method and device, computer equipment and storage medium
CN114186760A (en) Analysis method and system for stable operation of enterprise and readable storage medium
CN113538070A (en) User life value cycle detection method and device and computer equipment
CN112347457A (en) Abnormal account detection method and device, computer equipment and storage medium
CN111105064A (en) Method and device for determining suspected information of fraud event
CN113010785A (en) User recommendation method and device
CN112100604B (en) Terminal equipment information processing method and device
CN117252429A (en) Risk user identification method and device, storage medium and electronic equipment
CN111339317A (en) User registration identification method and device, computer equipment and storage medium
CN113051601A (en) Sensitive data identification method, device, equipment and medium
CN110727576A (en) Web page testing method, device, equipment and storage medium
CN115393100A (en) Resource recommendation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201010

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201010

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

GR01 Patent grant