CN113592036A - Flow cheating behavior identification method and device, storage medium and electronic equipment - Google Patents
- Publication number
- CN113592036A (application CN202110981015.0A)
- Authority
- CN
- China
- Prior art keywords
- user
- time period
- users
- behavior
- click
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/552—Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Computer Hardware Design (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The embodiment of the invention provides a method and a device for identifying flow cheating behaviors, a storage medium and electronic equipment. The method comprises the following steps: acquiring user click data of a webpage accessed by a user; extracting user click data for accessing a first link within a first time period from the user click data; extracting user click data of each user from user click data accessing the first link within the first time period; extracting the click behavior characteristics of each user from the user click data of each user; and determining whether the users have clustering behavior according to the clicking behavior characteristics of each user, and if so, adding the users with the clustering behavior into the flow cheating group partner set. The embodiment of the invention realizes the identification of the cheating behavior of the group flow.
Description
Technical Field
The invention relates to the technical field of internet access, in particular to a method and a device for identifying flow cheating behaviors, a readable storage medium and electronic equipment.
Background
In the current era of information explosion, traffic is of crucial value on the internet. Fake traffic is common, and group traffic cheating behavior, in which traffic is generated by large-scale manual or machine-simulated access, now occurs.
Current methods for identifying traffic cheating behavior all identify a single cheating device. They have the following disadvantages:
first, they cannot identify group traffic cheating behavior;
second, they identify cheating according to empirical data and rules, which introduces a certain error, and they can only identify traffic cheating behavior over a short time; when traffic cheating behavior that recurs periodically (by month, quarter, year and so on) is to be identified, they cannot identify it within an arbitrarily selected period.
Disclosure of Invention
The embodiment of the invention provides a method and a device for identifying a flow cheating behavior, a readable storage medium and electronic equipment, so as to identify a group flow cheating behavior.
The technical scheme of the embodiment of the invention is realized as follows:
a traffic cheating behavior identification method comprises the following steps:
acquiring user click data of a webpage accessed by a user;
extracting user click data for accessing a first link within a first time period from the user click data;
extracting user click data of each user from user click data accessing the first link within the first time period;
extracting the click behavior characteristics of each user from the user click data of each user;
and determining whether the users have clustering behavior according to the clicking behavior characteristics of each user, and if so, adding the users with the clustering behavior into the flow cheating group partner set.
The determining whether the clustering behavior exists among the users comprises:
and calculating the click behavior similarity between every two users according to the click behavior characteristics of each user, and if the click behavior similarity is larger than a preset similarity threshold, determining that the two users corresponding to the click behavior similarity have a clustering behavior.
The user click data includes: user identification information, a webpage link identification clicked by a user and user click time;
the extracting the click behavior feature of each user from the user click data of each user comprises: dividing the first time period into at least one sub-time period;
for each sub-time period, acquiring user click data of each user accessing the first link in the sub-time period;
for each sub-time period, all users accessing the first link in the sub-time period are completely paired pairwise;
for each user pair accessing the first link in each sub-time period, according to the user click data of the two users in the user pair in the current sub-time period, counting the number of days in the current sub-time period on which both users accessed the first link as the first type of days, and counting the number of days in the current sub-time period on which exactly one of the two users accessed the first link as the second type of days;
and taking the first type of days and the second type of days as the click behavior characteristics of two users in the user pair in the current sub-time period.
The determining whether the clustering behavior exists among the users comprises the following steps:
for each user pair accessing the first link in each sub-time period, calculating the sum of the first type days and the second type days corresponding to the user pair, and dividing the first type days by the sum to obtain the initial value of the click behavior similarity between the two users in the user pair; dividing the first type of days by the total days of the current sub-time period to obtain a weight; multiplying the weight by the initial value of the click behavior similarity to obtain the click behavior similarity between two users in the user pair, and if the click behavior similarity is greater than a preset similarity threshold, determining that a clustering behavior exists between the two users in the user pair;
the joining of the users with the clustering behavior into the flow cheating group partner set comprises the following steps:
adding the two users in the user pair into the traffic cheating group set of the current sub-time period.
After the two users in the user pair are added into the traffic cheating group set of the current sub-time period, the method further comprises the following steps:
when the traffic cheating group sets of all the sub-time periods in the first time period have been obtained, selecting the users who appear in every traffic cheating group set;
deleting these users from each traffic cheating group set to obtain an updated traffic cheating group set for each sub-time period;
and for each of these users, calculating the click behavior similarity between the user and each traffic cheating group set, selecting the traffic cheating group set with the maximum click behavior similarity, and adding the user into the selected traffic cheating group set.
The calculating the similarity of the click behaviors of the user and each flow cheating group partner set comprises the following steps:
and for each flow cheating group set, respectively calculating the click behavior similarity of the user and each user in the flow cheating group set, and selecting the minimum click behavior similarity as the click behavior similarity of the user and the flow cheating group set.
The dividing the first time period into at least one sub-time period comprises:
dividing the first time period into at least one sub-time period according to the initial length of a preset sub-time period or the length and the adjustment step length of the current sub-time period;
and after the two users in the user pair are added into the traffic cheating group set of the current sub-time period, the method further comprises the following steps:
after the traffic cheating group sets of all the current sub-time periods are obtained, judging whether the length of the current sub-time period has reached a preset maximum sub-time-period length;
if not, returning to the action of dividing the first time period into at least one sub-time period according to the length of the current sub-time period and the adjustment step length;
and if so, selecting, from the traffic cheating group sets obtained in each round of detection, the optimal at least one traffic cheating group set as the final detection result, on the principle that the more members the detected traffic cheating group sets contain, the better the detection result.
The dividing the first time period into at least one sub-time period comprises:
dividing the first time period into a plurality of sub-time periods such that two adjacent sub-time periods overlap by a preset second time length.
After the extracting the user click data of each user from the user click data accessing the first link in the first time period and before the extracting the click behavior feature of each user from the user click data of each user, further comprising:
for each user, according to the user click data of the user, counting the maximum time length of the user for continuously accessing the first link in the first time period, if the maximum time length is less than the preset first time length, determining that the user is not a flow cheating group member, and deleting the user click data of the user from the user click data of each user accessing the first link in the first time period.
After the users with the clustering behavior are added into the traffic cheating group set, the method further includes:
for each user with the clustering behavior, judging whether the user meets the following conditions, and if so, deleting the user from the traffic cheating group set:
the dwell time of the user on the webpage of the first link is larger than a preset time threshold, or/and the access depth of the user to the first link is larger than a preset depth threshold, or/and the number of types of items purchased by the user on the first link is larger than a preset number.
A traffic cheating behavior recognition device, the device comprising:
the click behavior feature extraction module is used for acquiring user click data of a webpage accessed by a user; extracting user click data for accessing a first link within a first time period from the user click data; extracting user click data of each user from user click data accessing the first link within the first time period; extracting the click behavior characteristics of each user from the user click data of each user;
and the identification module is used for determining whether the clustering behavior exists among the users according to the clicking behavior characteristics of each user, and if so, adding the users with the clustering behavior into the flow cheating clustering set.
A non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the steps of the traffic cheating behavior identification method as recited in any one of the above.
An electronic device comprising a non-transitory computer readable storage medium as described above, and a processor having access to the non-transitory computer readable storage medium.
In the embodiment of the invention, the user click data accessing the first link in the first time period is extracted from the user click data, the click behavior characteristic of each user is extracted from the user click data of each user, whether the clustering behavior exists among the users is determined according to the click behavior characteristic of each user, and if the clustering behavior exists, the users with the clustering behavior are added into the flow cheating group partner set, so that the identification of the group flow cheating behavior is realized.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a traffic cheating behavior identification method according to an embodiment of the present invention;
Fig. 2 is an application example of the present invention;
Fig. 3 is a flowchart of a traffic cheating behavior identification method according to another embodiment of the present invention;
Fig. 4 is an application example of the sliding time window of the present invention;
Fig. 5 is a schematic structural diagram of a traffic cheating behavior identification device according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements explicitly listed, but may include other steps or elements not explicitly listed or inherent to such process, method, article, or apparatus.
The technical solution of the present invention will be described in detail with specific examples. Several of the following embodiments may be combined with each other and some details of the same or similar concepts or processes may not be repeated in some embodiments.
The embodiment of the invention provides a flow cheating behavior identification method, which comprises the steps of acquiring user click data of a user access webpage; extracting user click data for accessing a first link within a first time period from the user click data; extracting user click data of each user from user click data accessing the first link within the first time period; extracting the click behavior characteristics of each user from the user click data of each user; and determining whether the users have clustering behavior according to the clicking behavior characteristics of each user, and if so, adding the users with the clustering behavior into the flow cheating group partner set. The embodiment of the invention realizes the identification of the cheating behavior of the group flow.
Fig. 1 is a flowchart of a method for identifying a traffic cheating action according to an embodiment of the present invention, which includes the following steps:
Step 101: acquiring user click data of a user accessing a webpage.
In practical applications, the user click data can be acquired from a webmaster grouping statistics table, a wake-up UUID (Universally Unique Identifier) distribution table, a device browsing information table and the like.
The user click data at least includes: user identification information, the identification of the webpage link clicked by the user, and the user click time, wherein the webpage link identification is, for example, a URL (Uniform Resource Locator).
The user identification information is as follows: user ID, user equipment ID, browser ID, login account number, or any combination thereof.
The user click data may also include user transaction information, such as the type of item purchased by the user, the user's GMV (Gross Merchandise Volume), or a combination thereof.
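As an illustration only, the fields listed above could be held in a record such as the following. The class name `ClickRecord`, the field names, and the example URL are assumptions made for this sketch; the patent does not prescribe a data layout.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ClickRecord:
    """One user click (hypothetical layout; field names are assumptions)."""
    user_id: str                     # user identification information (user ID, device ID, ...)
    url: str                         # identification of the webpage link clicked by the user
    clicked_at: datetime             # user click time
    item_type: Optional[str] = None  # optional transaction info: type of item purchased
    gmv: Optional[float] = None      # optional transaction info: Gross Merchandise Volume


record = ClickRecord("user-a", "https://example.com/promo", datetime(2021, 8, 1, 9, 30))
```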
Step 102: extracting, from the acquired user click data, the user click data for accessing the first link within a first time period.
The first time period may be selected as desired.
Step 103: user click data for each user is extracted from user click data for accessing the first link within the first time period.
Step 104: extracting the click behavior characteristics of each user from the user click data of each user.
Step 105: determining whether the users have clustering behavior according to the click behavior characteristics of each user, and if so, adding the users with the clustering behavior into the traffic cheating group set.
In this embodiment, the user click data for accessing the first link within the first time period is extracted from the user click data, the click behavior characteristics of each user are extracted from the user click data of each user, whether clustering behavior exists among the users is determined according to the click behavior characteristics of each user, and if so, the users with the clustering behavior are added into the traffic cheating group set, so that the identification of group traffic cheating behavior is realized.
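Steps 102 and 103 can be sketched as a filter followed by a group-by. This is a minimal sketch, not the patented implementation: the function name `clicks_by_user`, the tuple layout of a record, and the half-open interval `[start, end)` are all assumptions.

```python
from collections import defaultdict
from datetime import datetime


def clicks_by_user(records, first_link, start, end):
    """Keep clicks on first_link inside [start, end) and group them by user.

    records: iterable of (user_id, url, clicked_at) tuples (assumed layout).
    """
    grouped = defaultdict(list)
    for user_id, url, clicked_at in records:
        # Step 102: only clicks on the first link within the first time period.
        if url == first_link and start <= clicked_at < end:
            # Step 103: split the remaining clicks per user.
            grouped[user_id].append(clicked_at)
    return dict(grouped)


records = [
    ("a", "https://example.com/x", datetime(2021, 8, 1, 9)),
    ("a", "https://example.com/y", datetime(2021, 8, 1, 10)),  # other link, dropped
    ("b", "https://example.com/x", datetime(2021, 8, 2, 9)),
    ("b", "https://example.com/x", datetime(2021, 9, 2, 9)),   # outside the period, dropped
]
result = clicks_by_user(
    records, "https://example.com/x", datetime(2021, 8, 1), datetime(2021, 9, 1)
)
```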
In an alternative embodiment, the step 105 of determining whether there is a clustering behavior among the users comprises: and calculating the click behavior similarity between every two users according to the click behavior characteristics of each user, and if the click behavior similarity is larger than a preset similarity threshold, determining that the two users corresponding to the click behavior similarity have a clustering behavior.
In the embodiment, the judgment of the clustering behavior between the users is realized by calculating the similarity of the click behavior between every two users.
In an alternative embodiment, in step 104, extracting the click behavior feature of each user from the user click data of each user includes: dividing the first time period into at least one sub-time period; for each sub-time period, acquiring user click data of each user accessing the first link in the sub-time period; for each sub-time period, all users accessing the first link in the sub-time period are completely paired pairwise; for each user pair accessing the first link in each sub-time period, according to the user click data of the two users in the user pair in the current sub-time period, counting the number of days in the current sub-time period on which both users accessed the first link as the first type of days, and counting the number of days in the current sub-time period on which exactly one of the two users accessed the first link as the second type of days; and taking the first type of days and the second type of days as the click behavior characteristics of the two users in the user pair in the current sub-time period.
The term "completely paired pairwise" here means that each user accessing the first link during the sub-time period is paired with each of the other users. For example, if 4 users, a, b, c and d, access the first link in the sub-time period, the pairing result is: ab, ac, ad, bc, bd, cd. When m users access the first link in a sub-time period, the number of user pairs is 1 + 2 + 3 + ... + (m-1) = m(m-1)/2.
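The complete pairwise pairing described above corresponds to taking all 2-element combinations of the users. The following sketch uses the standard-library `itertools.combinations`; the function name `all_user_pairs` is an assumption for illustration.

```python
from itertools import combinations


def all_user_pairs(users):
    """Pair every user with every other user exactly once."""
    return list(combinations(sorted(users), 2))


pairs = all_user_pairs({"a", "b", "c", "d"})
# pairs: ab, ac, ad, bc, bd, cd, i.e. 4 * 3 / 2 = 6 pairs
```

Because the number of pairs grows as m(m-1)/2, the pairing step dominates the cost when many users access the first link in a sub-time period.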
In the above embodiment, for each user pair, the number of days in the current sub-time period on which both users accessed the first link is counted as the first type of days, and the number of days on which exactly one of the two users accessed the first link is counted as the second type of days; the first type of days and the second type of days are taken as the click behavior characteristics of the two users in the user pair in the current sub-time period, thereby realizing the extraction of the click behavior characteristics of the users.
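If each user's access days within a sub-time period are held as a set of day indices (an assumption of this sketch), the two counts reduce to a set intersection and a symmetric difference. The function name `day_counts` is hypothetical.

```python
def day_counts(days_u, days_v):
    """Return (first type of days, second type of days) for one user pair.

    days_u, days_v: sets of day indices in the current sub-time period on
    which each user accessed the first link (assumed representation).
    """
    first_type = len(days_u & days_v)   # days on which both users accessed the link
    second_type = len(days_u ^ days_v)  # days on which exactly one user accessed it
    return first_type, second_type


counts = day_counts({1, 2, 3}, {2, 3, 4})  # (2, 2): shared days 2 and 3; single days 1 and 4
```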
Consider that, when there are many days on which the two users do not both access the first link, an unsuitable click behavior similarity calculation can still yield a high similarity for the two users, which is inconsistent with the actual situation. For this situation, the embodiment of the invention provides the following click behavior similarity calculation method. In an alternative embodiment, the step 105 of determining whether there is a clustering behavior among the users comprises: for each user pair accessing the first link in each sub-time period, calculating the sum of the first type of days and the second type of days corresponding to the user pair, and dividing the first type of days by the sum to obtain the initial value of the click behavior similarity between the two users in the user pair; dividing the first type of days by the total number of days in the current sub-time period to obtain a weight; multiplying the weight by the initial value of the click behavior similarity to obtain the click behavior similarity between the two users in the user pair; and if the click behavior similarity is greater than a preset similarity threshold, determining that a clustering behavior exists between the two users in the user pair;
and, in step 105, adding the users with the clustering behavior into the traffic cheating group set includes: adding the two users in the user pair into the traffic cheating group set of the current sub-time period.
In this embodiment, the quotient obtained by dividing the first type of days by the total number of days in the current sub-time period is used as a weight. This reduces the cases in which the click behavior similarity of two users is high even though there are many days on which the two users in the user pair do not both access the first link, and finally reduces the misjudgment of group traffic cheating behavior.
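Written out, the similarity described above is (first type of days / (first type of days + second type of days)) × (first type of days / total days). The sketch below implements that product; the function name and the choice to return 0.0 when the denominators are zero are assumptions, since the text does not cover that case.

```python
def click_similarity(first_type_days, second_type_days, total_days):
    """Weighted click behavior similarity for one user pair in one sub-time period."""
    active_days = first_type_days + second_type_days
    if active_days == 0 or total_days <= 0:
        return 0.0  # assumption: undefined cases are treated as no similarity
    initial = first_type_days / active_days  # initial value of the similarity
    weight = first_type_days / total_days    # share of the sub-time period spent together
    return weight * initial


print(click_similarity(3, 1, 4))  # 0.5625
print(click_similarity(1, 0, 4))  # 0.25
```

The second call shows the effect of the weight: two users who both accessed the link on a single day of a 4-day sub-time period have an initial value of 1.0, but the weight of 0.25 pulls the final similarity down.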
For example, 4 users, a, b, c and d, access the first link in a sub-time period, so the user pairs are ab, ac, ad, bc, bd and cd. If the click behavior similarities of the user pairs ab and bd are greater than the preset similarity threshold, users a, b and d are added into the traffic cheating group set of that sub-time period.
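The example above can be reproduced with a short sketch. The similarity values and the threshold of 0.5 are invented for illustration only; the function name `cheating_group_set` is likewise an assumption.

```python
def cheating_group_set(pair_similarity, threshold):
    """Collect both users of every pair whose similarity exceeds the threshold."""
    members = set()
    for (u, v), sim in pair_similarity.items():
        if sim > threshold:
            members.update((u, v))
    return members


# Hypothetical similarities for the six pairs of users a, b, c and d.
sims = {
    ("a", "b"): 0.8, ("a", "c"): 0.1, ("a", "d"): 0.2,
    ("b", "c"): 0.0, ("b", "d"): 0.7, ("c", "d"): 0.3,
}
group = cheating_group_set(sims, threshold=0.5)  # {"a", "b", "d"}
```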
In an optional embodiment, after the two users in the user pair are added into the traffic cheating group set of the current sub-time period, the method further includes: when the traffic cheating group sets of all the sub-time periods in the first time period have been obtained, selecting the users who appear in every traffic cheating group set; deleting these users from each traffic cheating group set to obtain an updated traffic cheating group set for each sub-time period; and for each of these users, calculating the click behavior similarity between the user and each traffic cheating group set, selecting the traffic cheating group set with the maximum click behavior similarity, and adding the user into the selected traffic cheating group set.
In this embodiment, each user who appears in all the traffic cheating group sets is finally assigned to a single traffic cheating group set according to the click behavior similarity between the user and each traffic cheating group set, which further improves the identification accuracy of the traffic cheating group sets and facilitates their management.
In an alternative embodiment, calculating the similarity of the click behavior of the user and each traffic cheating group set comprises: and for each flow cheating group set, respectively calculating the click behavior similarity of the user and each user in the flow cheating group set, and selecting the minimum click behavior similarity as the click behavior similarity of the user and the flow cheating group set.
The above embodiment provides a specific scheme of how to determine the similarity of the click behaviors of the user and the traffic cheating group.
Fig. 2 is an application example of the present invention. In this example, the first time period is divided into 4 sub-time periods, each of which is 4 days, and the situation that the users a to m access the first link on each day of the first time period is shown in fig. 2, wherein a value of 1 indicates that the user accesses the first link on the current day, and a value of 0 indicates that the user does not access the first link on the current day.
Respectively calculating the click behavior similarity between every two users in each sub-time period, and adding the users with the click behavior similarity larger than a preset similarity threshold value into the flow cheating group set of the corresponding sub-time period to obtain:
traffic cheating group set for sub-time period 1: X1 = {a, b, f, h, k};
traffic cheating group set for sub-time period 2: X2 = {a, b, d, e, h};
traffic cheating group set for sub-time period 3: X3 = {a, b, i, l};
traffic cheating group set for sub-time period 4: X4 = {a, b, g, j, l}.
Then the users who appear in every traffic cheating group set are: Φ = X1 ∩ X2 ∩ X3 ∩ X4 = {a, b};
and the sets are updated to X1 = X1 − Φ = {f, h, k}, X2 = X2 − Φ = {d, e, h}, X3 = X3 − Φ = {i, l}, and X4 = X4 − Φ = {g, j, l}.
For a, the click behavior similarities between a and X1, X2, X3 and X4 are calculated respectively, and the Xn (n = 1, 2, 3 or 4) with the maximum click behavior similarity is selected as the traffic cheating group set of a. For example, to calculate the click behavior similarity between a and X1, the click behavior similarities between a and each of f, h and k in X1 are calculated, and the smallest of them is taken as the click behavior similarity between a and X1.
The same method is applied to b to select one of X1, X2, X3 and X4 as the traffic cheating group set of b.
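The reassignment in this example can be sketched as follows. Several points are assumptions of the sketch rather than statements of the patent: the function names, the pairwise similarity being supplied by the caller, comparing each common user only against the sets as they stand after Φ is removed, and scoring an empty set as 0.0. The similarity values are invented and are not taken from Fig. 2.

```python
def reassign_common_users(group_sets, similarity):
    """Move users found in every set into the single best-matching set.

    group_sets: dict mapping a set name to its members.
    similarity: function (user, other) -> float, supplied by the caller.
    """
    common = set.intersection(*group_sets.values())  # the users in every set
    stripped = {name: members - common for name, members in group_sets.items()}
    result = {name: set(members) for name, members in stripped.items()}
    for user in sorted(common):
        def set_similarity(name):
            # Similarity to a set is the minimum similarity to any of its members.
            return min((similarity(user, other) for other in stripped[name]), default=0.0)
        best = max(stripped, key=set_similarity)  # set with the maximum similarity
        result[best].add(user)
    return result


group_sets = {
    "X1": {"a", "b", "f", "h", "k"},
    "X2": {"a", "b", "d", "e", "h"},
    "X3": {"a", "b", "i", "l"},
    "X4": {"a", "b", "g", "j", "l"},
}
# Hypothetical pairwise similarities; unlisted pairs default to 0.1.
made_up = {("a", "f"): 0.6, ("a", "h"): 0.5, ("a", "k"): 0.7,
           ("a", "d"): 0.9, ("b", "i"): 0.8, ("b", "l"): 0.6}


def similarity(u, v):
    return made_up.get((u, v), 0.1)


final_sets = reassign_common_users(group_sets, similarity)
```

With these invented values, a is assigned to X1 (minimum similarity 0.5) and b to X3 (minimum similarity 0.6). Taking the minimum means a user joins a set only if it resembles every member of that set.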
Consider that: the length of the sub-time period may affect the identification accuracy of the group flow cheating behavior, and for the situation, the embodiment of the invention provides the following optimization scheme:
in an alternative embodiment, dividing the first time period into at least one sub-time period comprises: dividing the first time period into at least one sub-time period according to the initial length of a preset sub-time period or the length and the adjustment step length of the current sub-time period;
and after the two users in the user pair are added into the traffic cheating group set of the current sub-time period, the method further comprises the following steps: after the traffic cheating group sets of all the current sub-time periods are obtained, judging whether the length of the current sub-time period has reached a preset maximum sub-time-period length; if not, returning to the action of dividing the first time period into at least one sub-time period according to the length of the current sub-time period and the adjustment step length; and if so, selecting, from the traffic cheating group sets obtained in each round of detection, the optimal at least one traffic cheating group set as the final detection result, on the principle that the more members the detected traffic cheating group sets contain, the better the detection result.
In this embodiment, multiple groups of traffic cheating group sets are calculated by changing the length of the sub-time period (one group of traffic cheating group sets is calculated each time the first time period is divided into sub-time periods). On the principle that the more members the detected traffic cheating group sets contain, the better the detection result, the optimal group is selected from them as the final detection result, which improves the identification accuracy of group traffic cheating behavior.
In an alternative embodiment, dividing the first time period into at least one sub-time period comprises: dividing the first time period into a plurality of sub-time periods such that every two adjacent sub-time periods overlap by a preset second duration.
Considering that a user who accesses the first link continuously only for a short time is unlikely to be a member of a traffic cheating group, and in order to reduce the workload of identifying group traffic cheating behavior and speed up the identification, the embodiment of the invention provides the following optimization scheme:
in an optional embodiment, after step 103 and before step 104, further comprising: for each user, according to the user click data of the user, counting the maximum time length of the user for continuously accessing the first link in the first time period, if the maximum time length is less than the preset first time length, determining that the user is not a flow cheating group member, and deleting the user click data of the user from the user click data of each user accessing the first link in the first time period.
In an optional embodiment, after joining the users with the clustering behavior into the traffic cheating group set in step 105, the method further includes: for each user with the clustering behavior, respectively judging whether the user meets the following conditions, and if so, deleting the user from the traffic cheating cluster set: the dwell time of the user on the webpage of the first link is larger than a preset time threshold, or/and the access depth of the user to the first link is larger than a preset depth threshold, or/and the number of types of items purchased by the user on the first link is larger than a preset number.
According to the embodiment, whether the user belongs to the traffic cheating group member or not is further confirmed by analyzing the stay time of the user on the webpage of the first link, or/and the access depth of the user on the first link, or/and the type of the article purchased by the user on the first link, misjudgment on the traffic cheating behavior is avoided, and the accuracy of traffic cheating behavior identification is improved.
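The post-filter described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `UserStats` record, the function name and the threshold parameters are hypothetical, and the three conditions are combined with a logical OR, which is only one of the "or/and" combinations the text allows.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    # Hypothetical per-user record; field names are assumptions.
    dwell_seconds: float  # dwell time on the web page of the first link
    visit_depth: int      # access depth to the first link
    item_types: int       # number of item types purchased via the first link


def remove_genuine_users(group, stats, max_dwell, max_depth, max_types):
    """Return the group without users who exceed any of the thresholds."""
    kept = set()
    for user in group:
        s = stats[user]
        looks_genuine = (
            s.dwell_seconds > max_dwell
            or s.visit_depth > max_depth
            or s.item_types > max_types
        )
        if not looks_genuine:
            kept.add(user)
    return kept
```

A user who stays long, browses deeply or buys several kinds of items is treated as a genuine visitor and removed from the set; replacing `or` with `and` gives the stricter combination.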
Fig. 3 is a flowchart of a method for identifying a traffic cheating action according to another embodiment of the present invention, which includes the following specific steps:
Step 301: acquiring user click data of users accessing web pages.
In practical applications, the user click data may be acquired from a webmaster grouping statistics table, a wake-up UUID distribution table, a device browsing information table and the like.
The user click data at least includes: user identification information, an identifier of the web page link clicked by the user, and the user click time, where the web page link identifier is, for example, a URL (Uniform Resource Locator).
The user identification information is, for example: a user ID, a user equipment ID, a browser ID, a login account, or any combination thereof.
The user click data may also include user transaction information, such as the types of items purchased by the user, the user's GMV (Gross Merchandise Volume), or a combination thereof.
In practical applications, if some information in the user click data is missing, the missing values can be filled in by methods such as the mean of neighboring values, inference based on a Bayesian formalism, or decision tree induction. In addition, noise filtering may be performed on each piece of information in the user click data, for example: filtering by user click time, or filtering out records whose webmaster ID is null.
Step 302: selecting a first time period, extracting user click data accessing the first link in the first time period from the user click data acquired in step 301, and setting an initial window length, an initial sliding step length, a window length adjustment step length, a sliding step length adjustment step length, a maximum window length and a maximum sliding step length of the sliding time window.
In practical applications, for example in a traffic activation scenario, all users to be activated may be assigned to multiple traffic activation executors. Each traffic activation executor activates users by placing a link that interests them on one or more web pages to attract them to click on it. In this scenario, the first link may be a link placed by a particular traffic activation executor.
In practical applications, a user who accesses the first link continuously only for a short time is unlikely to be a member of a traffic cheating group. Therefore, in step 302, after the user click data for accessing the first link in the first time period is extracted, the user click data of users whose maximum duration of continuously accessing the first link is less than the first duration is filtered out.
For example: if the maximum duration for which a user continuously accesses the first link does not exceed 30 days, the user is considered not to be a member of a traffic cheating group, and the user's click data is filtered out and does not participate in the subsequent flow.
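The filtering by maximum continuous access duration can be sketched as below. The sketch assumes that "continuously accessing" means accessing the first link on consecutive calendar days; the source does not define this precisely, and the function names are hypothetical.

```python
from datetime import date, timedelta


def longest_streak(days):
    """Length, in days, of the longest run of consecutive calendar days."""
    best = run = 0
    prev = None
    for d in sorted(set(days)):
        if prev is not None and d - prev == timedelta(days=1):
            run += 1
        else:
            run = 1
        best = max(best, run)
        prev = d
    return best


def drop_short_streak_users(access_days, min_days):
    """Keep only users whose longest streak is at least min_days.

    access_days maps a user ID to the dates on which that user
    accessed the first link within the first time period.
    """
    return {
        user: days
        for user, days in access_days.items()
        if longest_streak(days) >= min_days
    }
```

Users removed here never reach the pairing stage, which shrinks the number of user pairs that must be compared later.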
Step 303: sliding the sliding time window over the first time period in sequence according to the current window length and the current sliding step length.
In practical applications, if no user clicks on the first link during the first q days (q is a positive integer) of the first time period, the sliding time window may start from day q+1. For example, as shown in fig. 4, assuming that the first time period is the year 2020 and no user clicks on the first link in the first 58 days of 2020, the sliding time windows start from day 59; in fig. 4, the window length of each time window is 120 days and the sliding step length is 10 days.
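A minimal sketch of generating the sliding time windows is given below. Days are numbered from 1 within the first time period, and the sketch assumes that a window extending past the end of the period is discarded; the source does not say how the final partial window is handled, and the function name is hypothetical.

```python
def sliding_windows(first_day, last_day, window_len, step):
    """Yield (start, end) day numbers, both inclusive, for each window."""
    start = first_day
    while start + window_len - 1 <= last_day:
        yield (start, start + window_len - 1)
        start += step


# Example matching the text: the year 2020 has 366 days, the first click
# happens on day 59, the window length is 120 days and the step is 10 days.
windows = list(sliding_windows(59, 366, 120, 10))
print(windows[0], windows[-1], len(windows))
```

Under these assumptions the first window covers days 59 to 178 and each later window starts 10 days after the previous one.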
Step 304: for each time window, selecting all user click data accessing the first link within the time window from all user click data accessing the first link within the first time period.
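The selection in step 304 can be sketched as follows, assuming each click record is a `(user_id, url, click_date)` tuple; that record layout and the function name are illustrative assumptions. The result maps each user to the set of days in the window on which the user accessed the first link, which is the form needed for the later day counting.

```python
from collections import defaultdict
from datetime import date


def access_days_in_window(clicks, link, start, end):
    """Collect, per user, the days within [start, end] with a click on link."""
    days = defaultdict(set)
    for user, url, day in clicks:
        if url == link and start <= day <= end:
            days[user].add(day)
    return dict(days)
```

Repeated clicks by the same user on the same day collapse into one entry, because only the number of access days matters for the similarity.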
Step 305: pairing all users who access the first link within the time window completely in twos; and for each user pair, calculating the click behavior similarity of the two users according to the number of days on which both users in the pair access the first link within the time window and the number of days on which only one of them accesses the first link.
Here, complete pairwise pairing of the users accessing the first link within the time window means that any user a is paired with every other user. For example, if 4 users a, b, c and d access the first link in the time window, the pairing result is: ab, ac, ad, bc, bd, cd. When m users access the first link within a time window, the number of resulting user pairs is 1 + 2 + 3 + ... + (m-1) = m(m-1)/2.
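The complete pairwise pairing can be sketched with the standard library; the function name is hypothetical. For m users it produces m(m-1)/2 pairs, as stated above.

```python
from itertools import combinations


def all_pairs(users):
    """Return every unordered pair of distinct users, in sorted order."""
    return list(combinations(sorted(users), 2))


pairs = all_pairs(["a", "b", "c", "d"])
print(pairs)
```

For the four users a, b, c and d this yields the six pairs ab, ac, ad, bc, bd and cd.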
The click behavior similarity is calculated as follows:

$$\delta = \frac{\sum_{k=1}^{N} \alpha_k}{N} \cdot \frac{\sum_{k=1}^{N} \alpha_k}{\sum_{k=1}^{N} \alpha_k + \sum_{k=1}^{N} \beta_k} \qquad (1)$$

where $\delta$ is the click behavior similarity; $N$ is the number of days in each time window; $k$ denotes the k-th day in the current time window;
$\alpha_k$ is defined as follows: if both users in the current user pair access the first link on the k-th day, $\alpha_k = 1$; otherwise, $\alpha_k = 0$;
$\beta_k$ is defined as follows: if on the k-th day the first link is accessed by one and only one user in the current user pair (i.e., either user a or user b accesses it), $\beta_k = 1$; otherwise, $\beta_k = 0$.
That is, $\sum_{k=1}^{N} \alpha_k$ is the total number of days in the current time window on which both users in the current user pair access the first link;
$\sum_{k=1}^{N} \beta_k$ is the total number of days in the current time window on which exactly one of the two users in the current user pair accesses the first link.
For example: N is 120 days, the number of days on which users a and b in the user pair both access the first link is 80, the number of days on which neither a nor b accesses the first link is 10, and the number of days on which only one of them accesses the first link (i.e., either user a or user b) is 30. The click behavior similarity of users a and b is then:

$$\delta = \frac{80}{120} \cdot \frac{80}{80 + 30} = \frac{16}{33} \approx 0.485$$

In addition, the weight $\frac{\sum_{k=1}^{N} \alpha_k}{N}$ in formula (1) serves to lower the click behavior similarity of a user pair when there are a large number of days on which neither of the two users accesses the first link.
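The similarity calculation can be sketched in code as follows. The sketch follows the description given elsewhere in this document (the co-access days divided by the sum of co-access and single-access days, multiplied by a weight equal to the co-access days divided by the window length); the function name and the representation of access days as sets of day numbers are assumptions.

```python
def click_similarity(days_a, days_b, window_days):
    """Click behavior similarity of two users within one time window.

    days_a, days_b: sets of day numbers on which each user accessed the link.
    window_days: total number of days N in the window.
    """
    both = len(days_a & days_b)       # days on which both users accessed
    only_one = len(days_a ^ days_b)   # days on which exactly one accessed
    if both == 0:
        return 0.0
    initial = both / (both + only_one)
    weight = both / window_days
    return weight * initial


# Worked example from the text: N = 120, 80 shared days, 30 single days.
a = set(range(0, 100))
b = set(range(0, 80)) | set(range(100, 110))
print(round(click_similarity(a, b, 120), 3))
```

Without the weight, two users who both visited on a single day and on no other day would score 1.0; the weight scales that down to 1/N.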
Step 306: according to the click behavior similarity of each user pair in the time window, selecting every user pair whose click behavior similarity is greater than a preset similarity threshold, and adding all users in the selected user pairs into the traffic cheating group set of the time window.
For example: 4 users a, b, c and d access the first link in the time window, so the user pairs are ab, ac, ad, bc, bd and cd. If the click behavior similarities of the user pairs ab and bd are greater than the preset similarity threshold, users a, b and d are added into the traffic cheating group set of the time window.
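Step 306 can be sketched as a simple threshold pass over the pair similarities. The dictionary layout and the function name are illustrative assumptions, and the similarity values in the example are made up to reproduce the case described above.

```python
def cheating_group(similarities, threshold):
    """Collect all users that belong to a pair scoring above the threshold.

    similarities maps a (user, user) pair to its click behavior similarity.
    """
    group = set()
    for (u, v), sim in similarities.items():
        if sim > threshold:
            group.update((u, v))
    return group


sims = {
    ("a", "b"): 0.62, ("a", "c"): 0.10, ("a", "d"): 0.20,
    ("b", "c"): 0.05, ("b", "d"): 0.71, ("c", "d"): 0.15,
}
print(sorted(cheating_group(sims, threshold=0.5)))
```

Only the pairs ab and bd exceed the threshold here, so the resulting set contains a, b and d.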
Step 307: according to the traffic cheating group sets of the time windows in the first time period, selecting the users who appear in the traffic cheating group set of each time window.
Step 308: deleting the users who appear in the traffic cheating group set of each time window from the traffic cheating group set of each time window, to obtain an updated traffic cheating group set for each time window.
Step 309: for each user who appears in the traffic cheating group set of each time window, calculating the click behavior similarity between the user and the traffic cheating group set of each time window, selecting the maximum click behavior similarity, and adding the user into the traffic cheating group set corresponding to the maximum click behavior similarity.
The click behavior similarity between the user and the traffic cheating group set of a time window is calculated as follows: for the traffic cheating group set of each time window, the click behavior similarity between the user and each user in the set is calculated, and the minimum of these values is taken as the click behavior similarity between the user and the set.
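The assignment rule in step 309 (the similarity to a set is the minimum over its members, and the user joins the set with the maximum such value) can be sketched as below. The `sim` callable, the set names and the treatment of an empty set as similarity 0.0 are assumptions made for illustration.

```python
def set_similarity(user, group, sim):
    """Similarity of a user to a set: the minimum over the set's members."""
    return min((sim(user, member) for member in group), default=0.0)


def assign_user(user, groups, sim):
    """Add the user to the set with the highest set similarity.

    groups maps a set name to a set of users; sim(u, v) is a hypothetical
    callable returning the click behavior similarity of two users.
    Returns the name of the chosen set.
    """
    best = max(groups, key=lambda name: set_similarity(user, groups[name], sim))
    groups[best].add(user)
    return best
```

Taking the minimum means a user joins a set only if the user resembles every member of it, not just one of them. If several users are assigned one after another, each assignment changes the sets seen by the next one; whether that ordering effect is intended is not specified in the source.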
Step 310: judging whether the window length and the sliding step length of each current time window respectively reach the maximum window length and the maximum sliding step length, if so, executing a step 312; otherwise, step 311 is performed.
Step 311: and adjusting the window length of the sliding time window or/and the sliding step length according to the preset window length adjustment step length or/and the sliding step length adjustment step length, and returning to the step 303.
Step 312: according to the principle that the more the detected flow cheating group members are in each time window, the better the detection result is, an optimal group of flow cheating group set is selected from the detected multiple groups of flow cheating group sets, and the selected optimal group of flow cheating group set is used as a final detection result.
Wherein, a group of flow cheating group set is calculated according to each group of window length and sliding step length.
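The loop formed by steps 310 to 312 can be sketched as follows. Several choices here are assumptions, since the source leaves them open: the window length and sliding step are increased together in each round, both are clamped at their maxima, and "more members" is measured as the number of distinct users across all windows. The `detect` callable stands in for steps 303 to 309 and is hypothetical.

```python
def search_window_parameters(detect, init_len, max_len, len_step,
                             init_slide, max_slide, slide_step):
    """Try successive (window length, sliding step) settings; keep the best.

    detect(window_len, slide) returns a list of user sets, one per window.
    Returns ((window_len, slide), groups) for the best setting found.
    """
    if len_step <= 0 or slide_step <= 0:
        raise ValueError("adjustment steps must be positive")
    window_len, slide = init_len, init_slide
    best_params, best_groups, best_count = None, None, -1
    while True:
        groups = detect(window_len, slide)
        # Count distinct users over all windows (an assumed reading).
        count = len(set().union(*groups)) if groups else 0
        if count > best_count:
            best_params, best_groups, best_count = (window_len, slide), groups, count
        if window_len >= max_len and slide >= max_slide:
            break
        window_len = min(window_len + len_step, max_len)
        slide = min(slide + slide_step, max_slide)
    return best_params, best_groups
```

Because both parameters grow by a positive amount each round and are clamped at their maxima, the loop always terminates.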
In practical applications, the embodiment of the invention can be executed periodically, with the traffic cheating group sets updated according to the latest user click data. Because the identification is based on real-time click behavior, the identification accuracy is improved; both periodic and random group traffic cheating behavior can be identified, so the application range is wide.
Fig. 5 is a schematic structural diagram of a device for identifying a flow cheating behavior according to an embodiment of the present invention, where the device mainly includes:
the click behavior feature extraction module 51 is configured to obtain user click data of a user accessing a webpage; extracting user click data for accessing the first link within a first time period from the user click data; extracting user click data of each user from user click data accessing the first link in a first time period; and extracting the click behavior characteristics of each user from the user click data of each user.
The identifying module 52 is configured to determine whether a clustering behavior exists among the users according to the click behavior feature of each user extracted by the click behavior feature extraction module 51, and if so, join the users who have the clustering behavior into the traffic cheating group partner set.
In an alternative embodiment, the identification module 52 determining whether there is a clustering behavior between users includes: and calculating the click behavior similarity between every two users according to the click behavior characteristics of each user, and if the click behavior similarity is larger than a preset similarity threshold, determining that the two users corresponding to the click behavior similarity have a clustering behavior.
In an optional embodiment, the user click data obtained by the click behavior feature extraction module 51 includes: user identification information, a webpage link identification clicked by a user and user click time;
the click behavior feature extraction module 51 extracts click behavior features of each user from the user click data of each user, including: dividing the first time period into at least one sub-time period; for each sub-time period, acquiring user click data of each user accessing the first link in the sub-time period; for each sub-time period, all users accessing the first link in the sub-time period are completely paired pairwise; for each user pair accessing the first link in each sub-time period, according to the user click data of two users in the user pair in the current sub-time period, counting the number of days for the two users in the user pair to simultaneously access the first link in the current sub-time period, regarding the number of days as a first class of days, counting the number of days for the two users in the user pair to access the first link and only one user in the current sub-time period, and setting the number of days as a second class of days; and taking the first type of days and the second type of days as the click behavior characteristics of two users in the user pair in the current sub-time period.
In an alternative embodiment, the identification module 52 determines whether there is a clustering behavior between users, including: for each user pair accessing the first link in each sub-time period, calculating the sum of the first type days and the second type days corresponding to the user pair, and dividing the first type days by the sum to obtain the initial value of the click behavior similarity between the two users in the user pair; dividing the first type of days by the total days of the current sub-time period to obtain a weight; multiplying the weight by the initial value of the click behavior similarity to obtain the click behavior similarity between two users in the user pair, and if the click behavior similarity is greater than a preset similarity threshold, determining that a clustering behavior exists between the two users in the user pair;
the identification module 52 adding the users with clustering behavior into the traffic cheating group set comprises: adding the two users in the user pair into the traffic cheating group set of the current sub-time period.
In an optional embodiment, after the identifying module 52 joins two users in the pair of users to the traffic cheating group aggregation in the current sub-time period, the method further includes: when the flow cheating group sets of all the sub-time periods in the first time period are obtained, selecting users who appear in each flow cheating group set; deleting the users appearing in each flow cheating group set from each flow cheating group set respectively to obtain an updated flow cheating group set in each sub-time period; and for each user appearing in each flow cheating group set, calculating the click behavior similarity of the user and each flow cheating group set, selecting the flow cheating group set with the maximum click behavior similarity, and adding the user into the selected flow cheating group set.
In an alternative embodiment, the identifying module 52 calculates the similarity of the click-through behavior of the user with each traffic cheating group, including: and for each flow cheating group set, respectively calculating the click behavior similarity of the user and each user in the flow cheating group set, and selecting the minimum click behavior similarity as the click behavior similarity of the user and the flow cheating group set.
In an alternative embodiment, the click behavior feature extraction module 51 divides the first time period into at least one sub-time period, which includes: dividing the first time period into at least one sub-time period according to the initial length of a preset sub-time period or the length and the adjustment step length of the current sub-time period;
moreover, after the identification module 52 adds the two users in the user pair into the traffic cheating group set of the current sub-time period, the module is further configured to: after the traffic cheating group sets of all current sub-time periods are obtained, judge whether the length of the current sub-time period has reached a preset maximum sub-time-period length; if not, notify the click behavior feature extraction module 51 to return to the action of dividing the first time period into at least one sub-time period according to the length of the current sub-time period and the adjustment step length; if so, select, from the traffic cheating group sets detected in each round, the optimal at least one traffic cheating group set as the final detection result, according to the principle that the more members the detected traffic cheating group sets contain, the better the detection result.
In an alternative embodiment, the click behavior feature extraction module 51 dividing the first time period into at least one sub-time period comprises: dividing the first time period into a plurality of sub-time periods such that every two adjacent sub-time periods overlap by a preset second duration.
In an optional embodiment, after the extracting the user click data of each user from the user click data accessing the first link in the first time period and before extracting the click behavior feature of each user from the user click data of each user, the click behavior feature extracting module 51 further includes: for each user, according to the user click data of the user, counting the maximum time length of the user for continuously accessing the first link in the first time period, if the maximum time length is less than the preset first time length, determining that the user is not a flow cheating group member, and deleting the user click data of the user from the user click data of each user accessing the first link in the first time period.
In an optional embodiment, after the identifying module 52 joins the user with the clustering behavior in the traffic cheating group, the method further includes: for each user with the clustering behavior, respectively judging whether the user meets the following conditions, and if so, deleting the user from the traffic cheating cluster set: the dwell time of the user on the webpage of the first link is larger than a preset time threshold, or/and the access depth of the user to the first link is larger than a preset depth threshold, or/and the number of types of items purchased by the user on the first link is larger than a preset number.
Embodiments of the present application also provide a computer-readable storage medium storing instructions which, when executed by a processor, perform the steps of the traffic cheating behavior identification method described above. In practical applications, the computer-readable medium may be included in the device/apparatus/system of the above embodiments, or may exist separately without being assembled into the device/apparatus/system.
According to embodiments disclosed herein, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example and without limitation: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing, without limiting the scope of the present disclosure. In the embodiments disclosed herein, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
An embodiment of the present invention further provides an electronic device. Fig. 6 shows a schematic structural diagram of the electronic device according to an embodiment of the present invention, specifically:
the electronic device may include a processor 61 of one or more processing cores, memory 62 of one or more computer-readable storage media, and a computer program stored on the memory and executable on the processor. The above-described traffic cheating act recognition method may be implemented when the program of the memory 62 is executed.
Specifically, in practical applications, the electronic device may further include a power supply 63, an input/output unit 64, and the like. Those skilled in the art will appreciate that the configuration of the electronic device shown in fig. 6 is not intended to be limiting of the electronic device and may include more or fewer components than shown, or some components in combination, or a different arrangement of components. Wherein:
the processor 61 is the control center of the electronic device. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory 62 and calling the data stored in the memory 62, thereby monitoring the electronic device as a whole.
The memory 62 may be used to store software programs and modules, i.e., it is the computer-readable storage medium described above. The processor 61 executes various functional applications and data processing by running the software programs and modules stored in the memory 62. The memory 62 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created through use of the electronic device, and the like. Further, the memory 62 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 62 may also include a memory controller to provide the processor 61 with access to the memory 62.
The electronic device further comprises a power supply 63 for supplying power to the various components, which can be logically connected to the processor 61 via a power management system, so as to implement functions of managing charging, discharging, and power consumption via the power management system. The power supply 63 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The electronic device may also include an input/output unit 64. The input/output unit 64 is operable to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. The input/output unit 64 may also be used to display information input by or provided to the user, as well as various graphical user interfaces, which may be made up of graphics, text, icons, video, and any combination thereof.
The flowchart and block diagrams in the figures of the present application illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments disclosed herein. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or coupled in various ways, even if such combinations or couplings are not explicitly recited in the present application. In particular, the features recited in the various embodiments and/or claims of the present application may be combined and/or coupled in various ways without departing from the spirit and teachings of the present application, and all such combinations and/or couplings fall within the scope of the present disclosure.
The principles and embodiments of the present invention are explained herein using specific examples, which are provided only to help in understanding the method and the core idea of the present invention and are not intended to limit the present application. It will be appreciated by those skilled in the art that changes may be made to these embodiments without departing from the principles, spirit and scope of the invention, and that all such modifications, equivalents and improvements are intended to fall within the scope of protection of the claims.
Claims (13)
1. A flow cheating behavior identification method is characterized by comprising the following steps:
acquiring user click data of a webpage accessed by a user;
extracting user click data for accessing a first link within a first time period from the user click data;
extracting user click data of each user from user click data accessing the first link within the first time period;
extracting the click behavior characteristics of each user from the user click data of each user;
and determining whether the users have clustering behavior according to the clicking behavior characteristics of each user, and if so, adding the users with the clustering behavior into the flow cheating group partner set.
2. The method of claim 1, wherein determining whether there is a clustering behavior among users comprises:
and calculating the click behavior similarity between every two users according to the click behavior characteristics of each user, and if the click behavior similarity is larger than a preset similarity threshold, determining that the two users corresponding to the click behavior similarity have a clustering behavior.
3. The method of claim 1, wherein the user click data comprises: user identification information, a webpage link identification clicked by a user and user click time;
the extracting the click behavior feature of each user from the user click data of each user comprises: dividing the first time period into at least one sub-time period;
for each sub-time period, acquiring user click data of each user accessing the first link in the sub-time period;
for each sub-time period, all users accessing the first link in the sub-time period are completely paired pairwise;
for each user pair accessing the first link in each sub-time period, according to the user click data of two users in the user pair in the current sub-time period, counting the number of days for the two users in the user pair to simultaneously access the first link in the current sub-time period, regarding the number of days as a first class of days, counting the number of days for the two users in the user pair to access the first link and only one user in the current sub-time period, and setting the number of days as a second class of days;
and taking the first type of days and the second type of days as the click behavior characteristics of two users in the user pair in the current sub-time period.
4. The method of claim 3, wherein determining whether there is a clustering behavior among users comprises:
for each user pair accessing the first link in each sub-time period, calculating the sum of the first type days and the second type days corresponding to the user pair, and dividing the first type days by the sum to obtain the initial value of the click behavior similarity between the two users in the user pair; dividing the first type of days by the total days of the current sub-time period to obtain a weight; multiplying the weight by the initial value of the click behavior similarity to obtain the click behavior similarity between two users in the user pair, and if the click behavior similarity is greater than a preset similarity threshold, determining that a clustering behavior exists between the two users in the user pair;
the joining of the users with the clustering behavior into the flow cheating group partner set comprises the following steps:
and adding the two users in the user pair into the traffic cheating group set of the current sub-time period.
5. The method of claim 4, wherein after joining two users in the pair of users into the traffic cheating group aggregation in the current sub-period of time, further comprising:
when the flow cheating group sets of all the sub-time periods in the first time period are obtained, selecting users who appear in each flow cheating group set;
deleting the users appearing in each flow cheating group set from each flow cheating group set respectively to obtain an updated flow cheating group set in each sub-time period;
and for each user appearing in each flow cheating group set, calculating the click behavior similarity of the user and each flow cheating group set, selecting the flow cheating group set with the maximum click behavior similarity, and adding the user into the selected flow cheating group set.
6. The method of claim 5, wherein calculating the click behavior similarity between the user and each traffic cheating group set comprises:
for each traffic cheating group set, calculating the click behavior similarity between the user and each user in the set, and taking the minimum of these values as the click behavior similarity between the user and the set.
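The reassignment in claims 5 and 6 can be sketched as follows. This is an illustrative reading, not the patented implementation: `pair_similarity` is a hypothetical callable supplying the pairwise click behavior similarity, "users who appear in each set" is read as the intersection of all sets, and scoring each user against the sets as they stand right after deletion (rather than after earlier users have been re-added) is an assumption.

```python
from typing import Callable, Hashable, List, Set


def group_similarity(user: Hashable, group: Set[Hashable],
                     pair_similarity: Callable[[Hashable, Hashable], float]) -> float:
    """Claim 6: the user-to-set similarity is the minimum pairwise similarity."""
    return min(pair_similarity(user, member) for member in group)


def reassign_shared_users(group_sets: List[Set[Hashable]],
                          pair_similarity: Callable[[Hashable, Hashable], float]
                          ) -> List[Set[Hashable]]:
    """Claim 5: move users found in every set into their single best-matching set."""
    if len(group_sets) < 2:
        return [set(g) for g in group_sets]  # nothing to disambiguate
    shared = set.intersection(*group_sets)
    updated = [set(g) - shared for g in group_sets]
    snapshot = [frozenset(g) for g in updated]  # score against post-deletion sets
    for user in shared:
        scored = [(group_similarity(user, g, pair_similarity), i)
                  for i, g in enumerate(snapshot) if g]  # skip emptied sets
        if not scored:
            continue  # assumption: no non-empty set left to compare against
        _, best_index = max(scored, key=lambda item: item[0])
        updated[best_index].add(user)
    return updated
```

Taking the minimum over members means a user joins a set only if the user resembles its least similar member, which is the same criterion as complete-linkage clustering.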
7. The method of claim 4, wherein dividing the first time period into at least one sub-time period comprises:
dividing the first time period into at least one sub-time period according to a preset initial sub-time-period length, or according to the current sub-time-period length and an adjustment step;
and wherein, after adding the two users in the user pair to the traffic cheating group set of the current sub-time period, the method further comprises:
after the traffic cheating group sets of all current sub-time periods have been obtained, determining whether the current sub-time-period length has reached a preset maximum sub-time-period length;
if not, returning to the action of dividing the first time period into at least one sub-time period according to the current sub-time-period length and the adjustment step;
and if so, selecting, from the at least one traffic cheating group set detected in each round, the best at least one traffic cheating group set as the final detection result, on the principle that the more members the detected traffic cheating group sets contain, the better the detection result.
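The iteration over sub-time-period lengths in claim 7 can be sketched as a simple search loop. The `detect` callable is hypothetical and stands for one full detection round at a given length. Measuring "more members" as the number of distinct users across all sets of a round is an assumption, since the claim does not fix the counting rule.

```python
from typing import Callable, Hashable, List, Optional, Set, Tuple


def search_sub_period_length(
    detect: Callable[[int], List[Set[Hashable]]],
    initial_length: int,
    step: int,
    max_length: int,
) -> Tuple[Optional[int], List[Set[Hashable]]]:
    """Run detection at growing sub-period lengths and keep the largest result."""
    if step <= 0:
        raise ValueError("step must be positive")
    best_length, best_sets, best_count = None, [], -1
    length = initial_length
    while True:
        group_sets = detect(length)
        member_count = len(set().union(*group_sets))  # distinct users found
        if member_count > best_count:
            best_length, best_sets, best_count = length, group_sets, member_count
        if length >= max_length:  # maximum length reached: stop iterating
            break
        length += step            # otherwise lengthen and divide again
    return best_length, best_sets
```

When two lengths find the same number of members, this sketch keeps the shorter one, which is an arbitrary tie-break.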
8. The method of claim 3, wherein dividing the first time period into at least one sub-time period comprises:
dividing the first time period into a plurality of sub-time periods such that every two adjacent sub-time periods overlap by a preset second duration.
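The overlapping division in claim 8 amounts to a sliding window whose stride is the window length minus the overlap. A minimal sketch, assuming time is expressed as integer day indices and windows are half-open intervals `[start, end)`; the function name and the truncation of the last window at the end of the first time period are illustrative choices.

```python
from typing import List, Tuple


def split_overlapping(start: int, end: int, length: int,
                      overlap: int) -> List[Tuple[int, int]]:
    """Split [start, end) into windows of `length` days overlapping by `overlap` days."""
    if length <= 0 or not 0 <= overlap < length:
        raise ValueError("need length > 0 and 0 <= overlap < length")
    stride = length - overlap
    windows = []
    window_start = start
    while window_start < end:
        window_end = min(window_start + length, end)  # clip the last window
        windows.append((window_start, window_end))
        if window_end == end:
            break
        window_start += stride
    return windows
```

For instance, a 10-day period split into 4-day windows with a 1-day overlap yields windows starting on days 0, 3, and 6, so activity that straddles a window boundary still falls entirely inside at least one window.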
9. The method of claim 1, further comprising, after extracting the user click data of each user from the user click data accessing the first link within the first time period and before extracting the click behavior features of each user from the user click data of each user:
for each user, determining from the user's click data the maximum duration for which the user continuously accessed the first link within the first time period; and, if the maximum duration is less than a preset first duration, determining that the user is not a member of a traffic cheating group and deleting that user's click data from the user click data of the users accessing the first link within the first time period.
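The pre-filter in claim 9 can be sketched as below. The claim does not fix the unit of "continuous access"; this sketch assumes it means the longest run of consecutive calendar days on which the user accessed the first link. The function names and the mapping from user to access dates are hypothetical.

```python
from datetime import date
from typing import Dict, Hashable, Iterable


def longest_streak(days: Iterable[date]) -> int:
    """Length of the longest run of consecutive calendar days."""
    ordinals = sorted({d.toordinal() for d in days})
    best = run = 0
    previous = None
    for ordinal in ordinals:
        run = run + 1 if previous is not None and ordinal == previous + 1 else 1
        best = max(best, run)
        previous = ordinal
    return best


def drop_short_visitors(access_days: Dict[Hashable, Iterable[date]],
                        min_streak_days: int) -> Dict[Hashable, Iterable[date]]:
    """Keep only users whose longest streak reaches the preset first duration."""
    return {user: days for user, days in access_days.items()
            if longest_streak(days) >= min_streak_days}
```

Filtering at this stage shrinks the number of user pairs that the later similarity step has to compare.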
10. The method of claim 1, further comprising, after adding the users with the clustering behavior to the traffic cheating group set:
for each user with the clustering behavior, determining whether the user meets the following condition and, if so, deleting the user from the traffic cheating group set:
the dwell time of the user on the web page of the first link is greater than a preset time threshold, and/or the access depth of the user on the first link is greater than a preset depth threshold, and/or the number of types of items purchased by the user on the first link is greater than a preset number.
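The post-filter in claim 10 can be sketched as a predicate over per-user activity figures. The record type, field names, and units are hypothetical. The claim's "and/or" permits several combinations; this sketch picks the variant in which exceeding any single threshold is enough to clear a user, and it leaves users without an activity record in the set, both of which are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Set


@dataclass
class UserActivity:
    dwell_seconds: float       # dwell time on the web page of the first link
    access_depth: int          # access depth on the first link
    purchased_item_types: int  # number of distinct item types purchased


def looks_genuine(activity: UserActivity, time_threshold: float,
                  depth_threshold: int, type_threshold: int) -> bool:
    """True when any one engagement figure exceeds its preset threshold."""
    return (activity.dwell_seconds > time_threshold
            or activity.access_depth > depth_threshold
            or activity.purchased_item_types > type_threshold)


def prune_group(group: Set[Hashable], activities: Dict[Hashable, UserActivity],
                time_threshold: float, depth_threshold: int,
                type_threshold: int) -> Set[Hashable]:
    """Remove users whose engagement suggests ordinary browsing or shopping."""
    return {user for user in group
            if user not in activities
            or not looks_genuine(activities[user], time_threshold,
                                 depth_threshold, type_threshold)}
```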
11. A traffic cheating behavior identification apparatus, comprising:
a click behavior feature extraction module, configured to: acquire user click data of web pages accessed by users; extract, from the user click data, the user click data accessing a first link within a first time period; extract the user click data of each user from the user click data accessing the first link within the first time period; and extract the click behavior features of each user from the user click data of each user;
and an identification module, configured to determine, according to the click behavior features of each user, whether a clustering behavior exists among the users and, if so, to add the users with the clustering behavior to a traffic cheating group set.
12. A non-transitory computer readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the steps of the traffic cheating behavior identification method recited in any of claims 1-10.
13. An electronic device, comprising the non-transitory computer readable storage medium of claim 12 and the processor, the processor having access to the non-transitory computer readable storage medium.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110981015.0A CN113592036A (en) | 2021-08-25 | 2021-08-25 | Flow cheating behavior identification method and device, storage medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113592036A (en) | 2021-11-02 |
Family
ID=78239703
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110981015.0A Pending CN113592036A (en) | 2021-08-25 | 2021-08-25 | Flow cheating behavior identification method and device, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113592036A (en) |
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8533825B1 (en) * | 2010-02-04 | 2013-09-10 | Adometry, Inc. | System, method and computer program product for collusion detection |
CN104765874A (en) * | 2015-04-24 | 2015-07-08 | 百度在线网络技术(北京)有限公司 | Method and device for detecting click-cheating |
WO2016169193A1 (en) * | 2015-04-24 | 2016-10-27 | 百度在线网络技术(北京)有限公司 | Method and apparatus for detecting cheated clicks |
JP2019003629A (en) * | 2017-06-16 | 2019-01-10 | Line株式会社 | Cheating application identification method and system |
CN110213209A (en) * | 2018-05-11 | 2019-09-06 | 腾讯科技(深圳)有限公司 | A kind of cheat detection method, device and storage medium that pushed information is clicked |
CN108898505A (en) * | 2018-05-28 | 2018-11-27 | 武汉斗鱼网络科技有限公司 | Recognition methods, corresponding medium and the electronic equipment of cheating clique |
CN110659954A (en) * | 2019-08-29 | 2020-01-07 | 北京三快在线科技有限公司 | Cheating identification method and device, electronic equipment and readable storage medium |
CN112766995A (en) * | 2019-10-21 | 2021-05-07 | 招商证券股份有限公司 | Article recommendation method and device, terminal device and storage medium |
CN112800419A (en) * | 2019-11-13 | 2021-05-14 | 北京数安鑫云信息技术有限公司 | Method, apparatus, medium and device for identifying IP group |
CN112989295A (en) * | 2019-12-16 | 2021-06-18 | 北京沃东天骏信息技术有限公司 | User identification method and device |
CN112163096A (en) * | 2020-09-18 | 2021-01-01 | 中国建设银行股份有限公司 | Malicious group determination method and device, electronic equipment and storage medium |
CN112488765A (en) * | 2020-12-08 | 2021-03-12 | 深圳市欢太科技有限公司 | Advertisement anti-cheating method, advertisement anti-cheating device, electronic equipment and storage medium |
Non-Patent Citations (4)
Title |
---|
HAYDEN CHEERS; YUQING LIN; SHAMUS P. SMITH: "Academic Source Code Plagiarism Detection by Measuring Program Behavioral Similarity", IEEE ACCESS, 29 March 2021 (2021-03-29) * |
孙勇;谭文安;金婷;周亮广;: "基于在线聚类的协同作弊团体识别方法", 计算机研究与发展, no. 06, 15 June 2018 (2018-06-15) * |
李彤岩;李兴明;: "基于双约束滑动时间窗口的告警预处理方法研究", 计算机应用研究, no. 02, 15 February 2013 (2013-02-15) * |
陈霞;闵华清;宋恒杰;: "众包平台作弊用户自动识别", 计算机工程, no. 08, 9 March 2016 (2016-03-09) * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114818933A (en) * | 2021-12-23 | 2022-07-29 | 金数信息科技(苏州)有限公司 | Method and device for monitoring artificial flow cheating based on Epsilon greedy algorithm |
CN114818933B (en) * | 2021-12-23 | 2024-05-28 | 金数信息科技(苏州)有限公司 | Method and device for monitoring artificial flow cheating based on Epsilon greedy algorithm |
CN114511134A (en) * | 2021-12-30 | 2022-05-17 | 北京字跳网络技术有限公司 | Wind control strategy generation method, device, storage medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||