CN112435068A - Malicious order identification method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112435068A
Authority
CN
China
Prior art keywords
order
data
characteristic
malicious
orders
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011383524.5A
Other languages
Chinese (zh)
Inventor
高强
崔波
王桥
张磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202011383524.5A priority Critical patent/CN112435068A/en
Publication of CN112435068A publication Critical patent/CN112435068A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207Discounts or incentives, e.g. coupons or rebates
    • G06Q30/0225Avoiding frauds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0609Buyer or seller confidence or verification

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the invention are applicable to the technical field of e-commerce and provide a malicious order identification method and device, an electronic device, and a storage medium. The malicious order identification method comprises: determining first order data, the first order data representing the order data of all orders in a first time period; determining a feature data set corresponding to the first order data, the feature data set representing each of at least one first set feature of the first order data together with its feature value; converting the data in the feature data set into tensor data and extracting a dense block from the tensor data, the dense block representing at least one sub-tensor data block in which the number of tensor data is greater than a first set value, and each sub-tensor data block representing tensor data whose feature values for at least one of the at least one first set feature are the same; and determining the malicious orders among all orders in the first time period based on the extracted dense block.

Description

Malicious order identification method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of e-commerce, in particular to a malicious order identification method and device, electronic equipment and a storage medium.
Background
With the rapid development of the internet, online shopping has become increasingly popular. Many malicious users place malicious orders on e-commerce platforms for purposes such as exploiting promotions ("wool pulling") and arbitrage, directly or indirectly causing heavy economic losses to merchants and to the platforms. In the related art, malicious orders are usually identified with a rule model or a machine learning model, but these models consume a large amount of manpower, process order data with high complexity, and identify malicious orders with low accuracy.
Disclosure of Invention
In order to solve the above problem, embodiments of the present invention provide a malicious order identification method, apparatus, electronic device and storage medium, so as to at least solve the problem in the related art that malicious order identification accuracy is low.
The technical scheme of the invention is realized as follows:
In a first aspect, an embodiment of the present invention provides a malicious order identification method, the method comprising:
determining first order data; the first order data represents the order data of all orders in a first time period;
determining a feature data set corresponding to the first order data; the feature data set represents each of at least one first set feature of the first order data together with its feature value;
converting the data in the feature data set into tensor data, and extracting a dense block from the tensor data; the dense block represents at least one sub-tensor data block in which the number of tensor data is greater than a first set value; each sub-tensor data block represents tensor data whose feature values for at least one of the at least one first set feature are the same;
determining the malicious orders among all orders in the first time period based on the extracted dense block.
In the above scheme, the method further comprises:
determining the feature value of each second set feature among all second set features corresponding to second order data; the second order data represents order data that has been determined to be malicious orders within a second time period;
when a feature value satisfies the corresponding set condition, determining the corresponding second set feature as one of the at least one first set feature.
In the above scheme, when determining the corresponding second set feature as one of the at least one first set feature in the case where the feature value satisfies the corresponding set condition, the method further comprises:
determining the set condition corresponding to the second set feature.
In the above scheme, when the second set feature is the order amount, the corresponding set condition is that order amounts in the second order data are greater than a second set value, and that the proportion of orders whose order amount is greater than the second set value to the total number of orders in the second order data is greater than a first set proportion;
when the second set feature is the store identifier, the corresponding set condition is that the number of orders with the same store identifier in the second order data is greater than a third set value;
when the second set feature is the user identifier, the corresponding set condition is that the number of orders with the same user identifier in the second order data is greater than a fourth set value;
when the second set feature is the order address, the corresponding set condition is that the number of orders with the same order address in the second order data is greater than a fifth set value;
when the second set feature is the commodity quantity, the corresponding set condition is that the number of orders in the second order data whose commodity quantity is greater than a sixth set value is greater than a seventh set value.
In the above scheme, determining the first order data comprises:
determining the first order data based on a set time period.
In the above scheme, determining the malicious orders among all orders in the first time period based on the extracted dense block comprises:
mapping the tensor data in all sub-tensor data blocks of the dense block to the feature data set to obtain the malicious orders in the first order data.
In the above aspect, the first setting feature includes at least any one of:
a user identification;
an order address;
a store identification;
the amount of the order;
the number of goods.
In a second aspect, an embodiment of the present invention provides a malicious order identification apparatus, the apparatus comprising:
a first determining module, configured to determine first order data; the first order data represents the order data of all orders in a first time period;
a second determining module, configured to determine a feature data set corresponding to the first order data; the feature data set represents each of at least one first set feature of the first order data together with its feature value;
an extraction module, configured to convert the data in the feature data set into tensor data and extract a dense block from the tensor data; the dense block represents at least one sub-tensor data block in which the number of tensor data is greater than a first set value; each sub-tensor data block represents tensor data whose feature values for at least one of the at least one first set feature are the same;
a third determining module, configured to determine the malicious orders among all orders in the first time period based on the extracted dense block.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a processor and a memory, where the processor and the memory are connected to each other, where the memory is used to store a computer program, and the computer program includes program instructions, and the processor is configured to call the program instructions to execute the steps of the malicious order identification method provided in the first aspect of the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program. The computer program, when executed by a processor, implements the steps of the malicious order identification method provided by the first aspect of the embodiments of the present invention.
In the embodiments of the present invention, a feature data set corresponding to first order data is determined, the data in the feature data set is converted into tensor data, a dense block is extracted from the tensor data, and the malicious orders among all orders in a first time period are determined based on the extracted dense block. The first order data represents the order data of all orders in the first time period; the feature data set represents each of at least one first set feature of the first order data together with its feature value; the dense block represents at least one sub-tensor data block in which the number of tensor data is greater than a first set value; and each sub-tensor data block represents tensor data whose feature values for at least one of the at least one first set feature are the same. Converting the data in the feature data set into tensor data reduces the complexity of data processing. Extracting dense blocks from the tensor data and determining malicious orders from them improves the accuracy of malicious order identification, and the identification result better matches the characteristics of malicious orders.
Drawings
FIG. 1 is a schematic flowchart of a malicious order identification method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a dense block output model according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of another malicious order identification method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the relationship between malicious orders and order amount;
FIG. 5 is a schematic diagram of malicious order quantity and the corresponding user identifier;
FIG. 6 is a schematic diagram of malicious order quantity and the corresponding order address;
FIG. 7 is a schematic diagram of a malicious order identification process according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a malicious order identification apparatus according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, the related art mainly uses three kinds of methods to identify malicious orders: rule-based methods, machine learning-based methods, and graph neural network-based methods. Rule-based methods require experts with professional domain knowledge to analyze each specific project and then to evaluate and verify the analysis results, which consumes a great deal of labor. Machine learning-based methods require manual labeling, and their results are hard to interpret in terms of the actual business: they cannot explain why an identified order is malicious. Graph neural network-based methods are mainly used to identify entities; constructing the graph is complex when identifying online ordering behavior, so these methods are difficult to apply to identifying malicious online ordering behavior.
In summary, the related art suffers from high data processing complexity, incomplete identification of malicious orders, and low identification accuracy. In view of these disadvantages, embodiments of the present invention provide a malicious order identification method that can at least improve the accuracy of identifying malicious orders. The technical means of the present invention are explained below by way of specific examples.
FIG. 1 is a schematic flowchart of a malicious order identification method according to an embodiment of the present invention. The method is executed by an electronic device, such as a desktop computer, a notebook computer, or a server. Referring to FIG. 1, the malicious order identification method includes:
S101, determining first order data; the first order data represents the order data of all orders in a first time period.
Here, the first order data is the order data of all orders within the first time period. For example, when users place orders on a shopping platform, the first order data may be the order data of all orders generated by the platform in the first time period, such as all orders generated in the two hours preceding the current time.
Determining the first order data comprises:
determining the first order data based on a set time period.
In practical applications, the first order data may be collected at a set time period. For example, when users place orders on a shopping platform, the order data of all orders generated by the platform in the first time period is collected at set intervals. For example, on the hour, the order data of all orders generated by the shopping platform within the 2 hours preceding the current time is collected: when the current time is 12 o'clock, the order data of all orders generated between 10 o'clock and 12 o'clock is collected.
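The periodic collection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the field names `order_id` and `created_at`, the function names, and the half-open window `[start, end)` are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def collection_window(now, hours=2):
    """Return the [start, end) window covering the `hours` before the top of the current hour."""
    end = now.replace(minute=0, second=0, microsecond=0)
    return end - timedelta(hours=hours), end

def select_first_order_data(orders, now, hours=2):
    """Keep the orders created inside the collection window (hypothetical `created_at` field)."""
    start, end = collection_window(now, hours)
    return [o for o in orders if start <= o["created_at"] < end]

# Illustrative orders only; identifiers and timestamps are made up.
orders = [
    {"order_id": 1, "created_at": datetime(2020, 12, 1, 9, 59)},
    {"order_id": 2, "created_at": datetime(2020, 12, 1, 10, 0)},
    {"order_id": 3, "created_at": datetime(2020, 12, 1, 11, 30)},
    {"order_id": 4, "created_at": datetime(2020, 12, 1, 12, 0)},
]
first_order_data = select_first_order_data(orders, datetime(2020, 12, 1, 12, 0))
print([o["order_id"] for o in first_order_data])  # [2, 3]
```

With the current time at 12 o'clock, only the orders created from 10 o'clock up to (but not including) 12 o'clock are kept.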
Here, the first order data includes various information such as an order identification, a user identification, an order amount, a store identification, an order status, and commodity information in the order.
S102, determining a characteristic data set corresponding to the first order data; the characteristic data set characterizes each first setting characteristic and a corresponding characteristic value in at least one first setting characteristic corresponding to the first order data.
In an embodiment of the present invention, the first setting feature includes at least any one of:
a user identification;
an order address;
a store identification;
the amount of the order;
the number of goods.
The user identifier may be the user's account, contact information, or the like. The order address is the delivery address of the order; if the order is a virtual account recharge order, the order address is the account information of the virtual account. The store identifier may be the name of the store. The order amount includes the amount corresponding to each commodity in the order. The commodity quantity is the number of commodities in the order, including the total quantity and the quantity of each commodity.
A feature data set corresponding to the first order data is then determined; it contains each first set feature and the corresponding feature value. For example, if the at least one first set feature corresponding to the first order data includes the order amount and the user identifier, the feature data set contains the order amount and the user identifier of each order.
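As a rough sketch of this step, the feature data set can be built by projecting each order onto the selected features. The field names (`order_id`, `user_id`, `order_amount`, `status`) and the function name are hypothetical and chosen only for illustration.

```python
# Hypothetical field names standing in for the selected first set features.
FIRST_SET_FEATURES = ("user_id", "order_amount")

def build_feature_data_set(first_order_data, features=FIRST_SET_FEATURES):
    """Map each order identifier to its {feature: feature value} pairs."""
    return {o["order_id"]: {f: o[f] for f in features} for o in first_order_data}

# Illustrative orders only.
first_order_data = [
    {"order_id": "o1", "user_id": "u1", "order_amount": 12000, "status": "paid"},
    {"order_id": "o2", "user_id": "u2", "order_amount": 80, "status": "cancelled"},
]
feature_data_set = build_feature_data_set(first_order_data)
print(feature_data_set["o1"])  # {'user_id': 'u1', 'order_amount': 12000}
```

Fields that are not selected as first set features, such as `status` here, are simply left out of the feature data set.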
S103, converting the data in the feature data set into tensor data, and extracting a dense block from the tensor data; the dense block represents at least one sub-tensor data block in which the number of tensor data is greater than a first set value; each sub-tensor data block represents tensor data whose feature values for at least one of the at least one first set feature are the same.
In the embodiment of the present invention, each tensor datum is composed of the feature values of one order. Referring to Table 1, which shows a feature data set provided in an application example of the present invention, the feature data set contains the features and feature values of order 1 to order 5, and the features are F1, F2, and F3. Converting the data in the feature data set of Table 1 into tensor data yields 5 tensor data in total: (A, I, O), (B, J, P), (B, J, O), (C, K, Q), and (B, J, O).
Of these, the tensor data (B, J, O) appears 2 times.
          F1   F2   F3
Order 1   A    I    O
Order 2   B    J    P
Order 3   B    J    O
Order 4   C    K    Q
Order 5   B    J    O
TABLE 1
In the embodiment of the present invention, tensor data whose feature values for at least one of the at least one first set feature are the same form a sub-tensor data block. For example, as shown in Table 1, if the features that must match are F1, F2, and F3, there are 4 sub-tensor data blocks in total. The sub-tensor data block containing the tensor data (B, J, O) holds 2 tensor data, and each of the remaining sub-tensor data blocks holds 1. The number of tensor data in a sub-tensor data block is defined as its density: if a sub-tensor data block contains N identical tensor data, its density is N.
The dense block represents at least one sub-tensor data block in which the number of tensor data is greater than the first set value. For example, as shown in Table 1, if the first set value is 1, the only sub-tensor data block with more than 1 tensor datum is the one containing (B, J, O), whose density is 2; that sub-tensor data block is therefore the dense block.
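The Table 1 example can be reproduced with a short sketch. It covers only the simplified case described above, where a sub-tensor data block groups tensor data that match on all of F1, F2, and F3; the function names are illustrative and not taken from the patent.

```python
from collections import Counter

FEATURES = ("F1", "F2", "F3")

# Feature data set from Table 1.
feature_data_set = {
    "Order 1": {"F1": "A", "F2": "I", "F3": "O"},
    "Order 2": {"F1": "B", "F2": "J", "F3": "P"},
    "Order 3": {"F1": "B", "F2": "J", "F3": "O"},
    "Order 4": {"F1": "C", "F2": "K", "F3": "Q"},
    "Order 5": {"F1": "B", "F2": "J", "F3": "O"},
}

def to_tensor_data(feature_data_set, features=FEATURES):
    """Turn each order's feature values into one tuple (one tensor datum per order)."""
    return {oid: tuple(row[f] for f in features) for oid, row in feature_data_set.items()}

def extract_dense_block(tensor_data, first_set_value):
    """Keep the sub-tensor data blocks whose density exceeds the first set value."""
    density = Counter(tensor_data.values())
    return {entry: n for entry, n in density.items() if n > first_set_value}

tensor_data = to_tensor_data(feature_data_set)
dense_block = extract_dense_block(tensor_data, first_set_value=1)
print(dense_block)  # {('B', 'J', 'O'): 2}
```

Counting identical tuples gives the density of each sub-tensor data block directly, so the threshold test reduces to a single filter over the counts.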
In practical applications, the dense block can be learned from the feature data set with the CatchCore algorithm. First, the data in the feature data set is converted into tensor data; the tensor data is then input into a CatchCore model, which outputs the dense block. The CatchCore model assumes that each piece of sample data has a certain density and obtains the sub-tensor data block with the maximum density through hierarchical iterative mining; that sub-tensor data block contains the features and feature values that give it the maximum density.
Referring to FIG. 2, which is a schematic diagram of a dense block output model according to an embodiment of the present invention, the dense block output model is the CatchCore model described above. In FIG. 2, the shade of gray indicates the density of a dense block: the darker the color, the greater the density, that is, the more tensor data the block contains. The feature data set is converted into tensor data, and the CatchCore model performs iterative computation on the tensor data. In each iteration, the CatchCore model selects the block with the highest density from the tensor data as the input of the next iteration, until the optimal dense block is found. The algorithm mines sub-tensor data blocks layer by layer; the sub-tensor data block output by each layer is smaller in area and greater in density than the one output by the previous layer, and the sub-tensor data block finally obtained is the dense block.
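The source does not give the internals of the CatchCore model, so the sketch below is explicitly not that algorithm. It swaps in a simple greedy peeling heuristic to illustrate the general idea of shrinking a block step by step while tracking its density. The density measure (number of entries divided by the average number of distinct values per feature), the tie-breaking rule, and all names are assumptions made for this example.

```python
from collections import Counter

def density(entries, n_modes):
    """Entries divided by the average number of distinct values per feature (assumed measure)."""
    if not entries:
        return 0.0
    sizes = [len({e[m] for e in entries}) for m in range(n_modes)]
    return len(entries) * n_modes / sum(sizes)

def greedy_dense_block(entries):
    """Repeatedly drop the least-supported feature value; return the densest block seen."""
    n_modes = len(entries[0])
    current = list(entries)
    best, best_density = list(current), density(current, n_modes)
    while current:
        # Count how many remaining entries each (feature index, value) pair covers.
        counts = Counter((m, e[m]) for e in current for m in range(n_modes))
        # Sorting the keys first makes tie-breaking deterministic.
        mode, value = min(sorted(counts), key=counts.get)
        current = [e for e in current if e[mode] != value]
        d = density(current, n_modes)
        if d > best_density:
            best, best_density = list(current), d
    return best, best_density

# Tensor data from Table 1.
entries = [("A", "I", "O"), ("B", "J", "P"), ("B", "J", "O"),
           ("C", "K", "Q"), ("B", "J", "O")]
block, block_density = greedy_dense_block(entries)
print(block, block_density)
```

On the Table 1 data this heuristic keeps the three entries that share B and J, with density 2.25. Under the assumed measure, that block scores higher than the exact-match block (B, J, O), whose density would be 2.0, so the result differs from the threshold-based example above; the choice of density measure determines which block is considered optimal.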
In practical applications, the CatchCore model needs to be trained on training data. The training data is order data from a historical time period that has already undergone malicious order detection and is labeled to indicate whether each order is malicious. The features used in training are then selected, including the user identifier, order address, store identifier, order amount, commodity quantity, and the like. Finally, the CatchCore model is trained on the training data, the selected features, and the corresponding feature values; after training, the model is evaluated and verified to check its identification performance.
During training of the CatchCore model, not all of the features used in training are necessarily retained in the trained model.
Referring to FIG. 3, in an embodiment, the method further includes:
S301, determining the feature value of each second set feature among all second set features corresponding to second order data; the second order data represents order data that has been determined to be malicious orders within a second time period.
Here, the second order data is order data that has been determined to be malicious orders within the second time period. The second set features are features of the second order data, including the user identifier, order address, store identifier, order amount, commodity quantity, and the like. It should be understood that the second time period and the first time period are not the same time period.
S302, when a feature value satisfies the corresponding set condition, determining the corresponding second set feature as one of the at least one first set feature.
That is, when the feature value corresponding to a second set feature satisfies its set condition, that second set feature is determined as one of the at least one first set feature.
Here, the set condition differs depending on the second set feature.
Further, when determining the corresponding second set feature as one of the at least one first set feature in the case where the feature value satisfies the corresponding set condition, the method further includes:
determining the set condition corresponding to the second set feature.
Specifically, when the second set feature is the order amount, the corresponding set condition is that order amounts in the second order data are greater than a second set value, and that the proportion of orders whose order amount is greater than the second set value to the total number of orders in the second order data is greater than a first set proportion.
Malicious orders tend to have high order amounts. Referring to FIG. 4, which is a schematic diagram of the relationship between malicious orders and order amount, the horizontal axis indicates the date and the vertical axis indicates the order quantity; the darker black line indicates the order data determined to be malicious orders, and the lighter gray line indicates those malicious orders whose amount is greater than 10,000 yuan. As can be seen from FIG. 4, on any given day, malicious orders with an order amount greater than 10,000 yuan account for more than 80% of all malicious orders. Assume that the second set value is 10,000 yuan and the first set proportion is 80%. When order amounts in the second order data are greater than 10,000 yuan and the proportion of orders with an order amount greater than 10,000 yuan to the total number of orders in the second order data is greater than 80%, the order amount is determined as one of the at least one first set feature.
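The order amount condition can be expressed as a small check over the known malicious orders. This is a sketch under stated assumptions: the field name `order_amount` and the function name are hypothetical, and the default thresholds mirror the example values in the text (10,000 yuan and 80%).

```python
def amount_feature_qualifies(second_order_data, second_set_value=10_000,
                             first_set_proportion=0.80):
    """True if the share of orders above `second_set_value` exceeds `first_set_proportion`.

    `second_order_data` is a list of known-malicious orders, each carrying a
    hypothetical 'order_amount' field in yuan.
    """
    if not second_order_data:
        return False
    high = sum(1 for o in second_order_data if o["order_amount"] > second_set_value)
    return high / len(second_order_data) > first_set_proportion

# Illustrative amounts only: 9 of the 10 orders exceed 10,000 yuan.
malicious = [{"order_amount": a} for a in
             (12_000, 15_000, 30_000, 11_000, 25_000,
              18_000, 40_000, 13_000, 22_000, 500)]
print(amount_feature_qualifies(malicious))  # True
```

Here 90% of the sample orders exceed the second set value, which is above the 80% proportion, so the order amount would be kept as a first set feature.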
When the second set feature is the store identifier, the corresponding set condition is that the number of orders with the same store identifier in the second order data is greater than a third set value.
A malicious user usually uses different user accounts to place a large number of malicious orders in the same store. When the second set feature is the store identifier, the corresponding set condition is that the number of orders with the same store identifier in the second order data is greater than the third set value; for example, if the second order data contains 100 malicious orders in total, the third set value may be set to 30. When the number of orders with the same store identifier in the second order data is greater than the third set value, the store identifier is determined as one of the at least one first set feature.
When the second set feature is the user identifier, the corresponding set condition is that the number of orders with the same user identifier in the second order data is greater than a fourth set value.
Referring to FIG. 5, which is a schematic diagram of malicious order quantity and the corresponding user identifier, the horizontal axis represents the user identifier and the vertical axis represents the order quantity. It can be seen that most malicious orders are placed under user identifier A; a malicious user usually submits a large number of malicious orders from one user account. When the number of orders with the same user identifier in the second order data is greater than the fourth set value, the user identifier is determined as one of the at least one first set feature.
When the second set feature is the order address, the corresponding set condition is that the number of orders with the same order address in the second order data is greater than a fifth set value.
Referring to FIG. 6, which is a schematic diagram of malicious order quantity and the corresponding order address, the horizontal axis represents the order address and the vertical axis represents the order quantity. It can be seen that most malicious orders are delivered to order address A; a malicious user usually uses the same order address to receive the goods. When the second set feature is the order address and the number of orders with the same order address in the second order data is greater than the fifth set value, the order address is determined as one of the at least one first set feature.
When the second set feature is the commodity quantity, the corresponding set condition is that the number of orders in the second order data whose commodity quantity is greater than a sixth set value is greater than a seventh set value.
A malicious user usually purchases a large quantity of goods in one order. When the second set feature is the commodity quantity and the number of orders in the second order data whose commodity quantity is greater than the sixth set value is greater than the seventh set value, the commodity quantity is determined as one of the at least one first set feature.
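The count-based conditions for the store identifier, user identifier, order address, and commodity quantity can be sketched as two small checks. The field names (`store_id`, `user_id`, `item_count`), the function names, and every threshold other than the third set value of 30 mentioned in the text are assumptions made for illustration.

```python
from collections import Counter

def shared_value_qualifies(second_order_data, feature, set_value):
    """True if a single value of `feature` is shared by more than `set_value` malicious orders."""
    counts = Counter(o[feature] for o in second_order_data)
    return bool(counts) and max(counts.values()) > set_value

def quantity_qualifies(second_order_data, sixth_set_value, seventh_set_value):
    """True if more than `seventh_set_value` orders hold more than `sixth_set_value` commodities."""
    large = sum(1 for o in second_order_data if o["item_count"] > sixth_set_value)
    return large > seventh_set_value

# 100 illustrative malicious orders: 35 share store "s1", every user is distinct,
# and 40 orders contain 50 commodities each.
second_order_data = [
    {"store_id": "s1" if i < 35 else f"s{i}",
     "user_id": f"u{i}",
     "item_count": 50 if i < 40 else 1}
    for i in range(100)
]

print(shared_value_qualifies(second_order_data, "store_id", 30))  # True
print(shared_value_qualifies(second_order_data, "user_id", 30))   # False
print(quantity_qualifies(second_order_data, 20, 30))              # True
```

In this made-up sample the store identifier and the commodity quantity would be kept as first set features, while the user identifier would not, because no single account placed more than 30 of the orders.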
For the second setting feature satisfying the setting condition, it may be directly determined as one of the at least one first setting feature. In some cases, a plurality of first setting features are needed to extract the dense block, if the number of the first setting features determined based on the condition is small, the second setting features which do not meet the setting condition and the second setting features which meet the setting condition are subjected to iterative training together when the model is trained, the training result is judged according to the evaluation index, and whether the second setting features which do not meet the setting condition are selected to be added into at least one first setting feature is judged. Here, the training result is the dense block output by the model, and the evaluation index includes an invalid order rate, an order cancellation rate, an unpaid order rate, an unfinished order rate, a payment method, and the like, where the invalid order rate is an invalid order proportion in the order corresponding to the dense block obtained by learning, the unfinished order rate is an unpaid order proportion in the order corresponding to the dense block obtained by learning, and the unfinished order rate is an unfinished order proportion in the order corresponding to the dense block obtained by learning. Through relevant data statistics, the non-enterprise orders have the condition that common users occupy the inventory maliciously in an enterprise transfer mode, the enterprise transfer payment mode has a long period, and malicious transactions are often found in the scene. Because the payment mode cannot directly evaluate the learning effect, the order state needs to be tracked subsequently, and whether the order is malicious or not is judged together through the index and other indexes. 
If the orders corresponding to the dense block meet the evaluation index, the second setting feature that does not satisfy the setting condition is added to the at least one first setting feature.
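The rate-based evaluation indexes described above can be sketched as follows. This is only an illustration under stated assumptions: the `BlockOrder` record, the function names, and the pass rule in `meets_evaluation_index` (any rate reaching a hypothetical `min_rate`) are not taken from the embodiment, which does not specify how the indexes are combined. The payment method index is left out because, as noted above, it cannot be evaluated directly.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BlockOrder:
    """One order mapped back from a learned dense block (hypothetical record)."""
    order_id: int
    invalid: bool
    cancelled: bool
    unpaid: bool
    unfinished: bool


def block_rates(orders: List[BlockOrder]) -> Dict[str, float]:
    """Return each rate as a proportion of the orders in the dense block."""
    n = len(orders)
    if n == 0:
        return {"invalid": 0.0, "cancelled": 0.0, "unpaid": 0.0, "unfinished": 0.0}
    return {
        "invalid": sum(o.invalid for o in orders) / n,
        "cancelled": sum(o.cancelled for o in orders) / n,
        "unpaid": sum(o.unpaid for o in orders) / n,
        "unfinished": sum(o.unfinished for o in orders) / n,
    }


def meets_evaluation_index(rates: Dict[str, float], min_rate: float = 0.9) -> bool:
    """Assumed pass rule: at least one rate reaches min_rate (not from the source)."""
    return max(rates.values()) >= min_rate


# Made-up sample data for illustration only.
sample = [
    BlockOrder(1, invalid=True, cancelled=True, unpaid=False, unfinished=True),
    BlockOrder(2, invalid=True, cancelled=True, unpaid=False, unfinished=True),
    BlockOrder(3, invalid=True, cancelled=False, unpaid=True, unfinished=True),
    BlockOrder(4, invalid=False, cancelled=False, unpaid=False, unfinished=False),
]
rates = block_rates(sample)
print(rates)
print(meets_evaluation_index(rates, min_rate=0.7))
```

In practice the threshold and the way the rates are combined would be tuned against orders whose final state has been tracked.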
Referring to Table 2, Table 2 shows the malicious order recognition result of the CatchCore model provided in the application embodiment of the present invention. As can be seen from Table 2, the CatchCore model has 3 layers, each layer has the same number of cancelled orders, unfinished orders and unpaid orders, and the proportions of marked malicious orders are 95.08%, 94.48% and 96.54%, respectively. Professionals in the field analyzed the marked malicious orders; the malicious orders not identified by the CatchCore model are order-brushing behavior. Each layer is analyzed and calculated as follows:
Figure BDA0002809091810000111
It can be seen that the proportion p in each layer is close to 100%, so the CatchCore model provided by the application embodiment of the invention identifies malicious orders with high accuracy and more comprehensively, and the identification result is highly interpretable and conforms to the characteristics of malicious orders.
Figure BDA0002809091810000112
TABLE 2
S104, determining the malicious orders among all orders in the first time period based on the extracted dense blocks.
In an embodiment, the determining of the malicious orders among all orders in the first time period based on the extracted dense blocks comprises:
mapping the tensor data in all sub-tensor data blocks in the dense block to the characteristic data set to obtain the malicious orders in the first order data.
For example, as shown in Table 1, if the obtained dense block is the sub-tensor data block in which the tensor data (B, J, O) is located, the tensor data (B, J, O) is mapped into the feature data set. In the feature data set, the orders whose feature values are B, J and O respectively are order 3 and order 5; that is, order 3 and order 5 are the malicious orders in the first order data.
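The mapping step in this example can be sketched as a simple lookup. The contents of `feature_data_set` below are hypothetical apart from orders 3 and 5 carrying the feature values (B, J, O), and the function name `map_block_to_orders` is likewise an assumption made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

FeatureValues = Tuple[str, str, str]

# Hypothetical feature data set: order id -> feature values.
# Only orders 3 and 5 follow the example in the text; the rest are made up.
feature_data_set: Dict[int, FeatureValues] = {
    1: ("A", "J", "P"),
    2: ("B", "K", "O"),
    3: ("B", "J", "O"),
    4: ("C", "J", "O"),
    5: ("B", "J", "O"),
}


def map_block_to_orders(
    dense_block: Iterable[FeatureValues],
    data_set: Dict[int, FeatureValues],
) -> List[int]:
    """Return the ids of orders whose feature values appear in the dense block."""
    block = set(dense_block)
    return sorted(oid for oid, values in data_set.items() if values in block)


malicious_orders = map_block_to_orders([("B", "J", "O")], feature_data_set)
print(malicious_orders)  # [3, 5]
```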
In the embodiment of the present invention, the characteristic data set corresponding to the first order data is determined, the data in the characteristic data set is converted into tensor data, dense blocks are extracted from the tensor data, and the malicious orders among all orders in the first time period are determined based on the extracted dense blocks. The first order data represents the order data corresponding to all orders in the first time period; the feature data set represents each first set feature, and its corresponding feature value, of the at least one first set feature corresponding to the first order data; the dense block represents at least one sub-tensor data block in which the number of tensor data is greater than a first set value; and each sub-tensor data block represents tensor data in which the feature values corresponding to at least one set feature of the at least one set feature are the same. Converting the data in the feature data set into tensor data can reduce the complexity of data processing. Extracting dense blocks from the tensor data and determining malicious orders based on the extracted dense blocks can improve the accuracy of malicious order identification, and the identification result is highly interpretable and better matches the characteristics of malicious orders.
Referring to fig. 7, fig. 7 is a schematic diagram of a malicious order identification process provided by an application embodiment of the present invention, where the malicious order identification process includes:
S701, acquiring training data.
Here, the training data is order data within a certain time period for which it has already been determined whether each order is a malicious order.
S702, selecting characteristics and marking training data.
The training data is marked, where the mark indicates whether the corresponding order is a malicious order. The features used in training are then selected, including user identification, order address, store identification, order amount, quantity of goods, and the like.
S703, determining a characteristic data set corresponding to the training data.
The feature data set characterizes each feature in the features corresponding to the training data and the corresponding feature value.
S704, setting an evaluation index of the first model.
Here, the first model is a CatchCore model.
S705, a first model is trained.
S706, verifying and evaluating the first model based on the evaluation index.
Finally, the CatchCore model is trained based on the training data, the selected features and the corresponding feature values. After training, the CatchCore model is evaluated and verified to check its effect.
After the training of the first model is completed, the dense block can be extracted based on the first model.
S707, the first order data is determined.
Here, the first order data is order data corresponding to all orders within the first time period. For example, where the user places an order on a shopping platform, the first order data may be all orders generated by the shopping platform during the first time period.
S708, determining a characteristic data set corresponding to the first order data.
The characteristic data set characterizes each first setting characteristic and a corresponding characteristic value in at least one first setting characteristic corresponding to the first order data.
S709, convert the data in the feature data set into tensor data.
S710, inputting the tensor data into the first model to obtain the dense block output by the first model.
The dense block represents at least one sub-tensor data block of which the number of tensor data is greater than a first set value; each sub-tensor data block characterizes tensor data in which eigenvalues corresponding to at least one set feature of the at least one set feature are the same.
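To make the definition of a dense block concrete, the sketch below converts feature rows into a sparse count tensor and keeps the coordinates whose count exceeds the first set value. It is a deliberately naive stand-in based on a literal reading of the definition above, not the CatchCore algorithm; the function names and sample rows are assumptions for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Tuple

Coordinate = Tuple[Hashable, ...]


def to_count_tensor(feature_rows: Iterable[Coordinate]) -> Counter:
    """Build a sparse count tensor: feature-value coordinate -> number of orders."""
    return Counter(feature_rows)


def extract_dense_block(tensor: Counter, first_set_value: int) -> List[Coordinate]:
    """Keep coordinates shared by more than first_set_value orders (naive rule)."""
    return [coord for coord, count in tensor.items() if count > first_set_value]


# Made-up feature rows, one per order: (user id, order address, store id).
rows = [
    ("A", "J", "P"),
    ("B", "K", "O"),
    ("B", "J", "O"),
    ("C", "J", "O"),
    ("B", "J", "O"),
]
tensor = to_count_tensor(rows)
dense_block = extract_dense_block(tensor, first_set_value=1)
print(dense_block)  # [('B', 'J', 'O')]
```

A trained model would search for dense sub-blocks across subsets of features rather than counting exact matches on every feature, so this sketch only illustrates the input and output shapes of the step.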
S711, determining a malicious order based on the dense blocks.
The tensor data in all sub-tensor data blocks in the dense block is mapped to the characteristic data set to obtain the malicious orders in the first order data.
The application embodiment of the invention can reduce the complexity of data processing, improve the accuracy of malicious order identification, ensure that the identification result better conforms to the characteristics of the malicious order, has strong interpretability and can be verified.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The technical means described in the embodiments of the present invention may be arbitrarily combined without conflict.
In addition, in the embodiments of the present invention, "first", "second", and the like are used for distinguishing similar objects, and are not necessarily used for describing a specific order or a sequential order.
Referring to fig. 8, fig. 8 is a schematic diagram of a malicious order identification apparatus according to an embodiment of the present invention. As shown in fig. 8, the apparatus includes a first determining module, a second determining module, an extracting module and a third determining module.
The first determining module is used for determining first order data; the first order data represent order data corresponding to all orders in a first time period;
the second determining module is used for determining a characteristic data set corresponding to the first order data; the characteristic data set characterizes each first set characteristic and a corresponding characteristic value in at least one first set characteristic corresponding to the first order data;
the extraction module is used for converting the data in the characteristic data set into tensor data and extracting a dense block from the tensor data; the dense block represents at least one sub-tensor data block of which the number of tensor data is greater than a first set value; each sub-tensor data block represents tensor data of which the eigenvalues corresponding to at least one set feature in the at least one set feature are the same;
and the third determination module is used for determining a malicious order in all orders in the first time interval based on the extracted dense blocks.
The device further comprises:
the fourth determining module is used for determining a characteristic value corresponding to each second setting characteristic in all second setting characteristics corresponding to the second order data; the second order data characterizes order data that have been determined to be a malicious order within a second time period;
and the fifth determining module is used for determining the corresponding second setting characteristic as one of the at least one first setting characteristic under the condition that the characteristic value meets the corresponding setting condition.
The device further comprises:
and the sixth determining module is used for determining the setting condition corresponding to the second setting characteristic.
When the second set characteristic represents the order amount, the corresponding set condition represents that the proportion of orders in the second order data whose order amount is greater than the second set value, relative to the total number of orders in the second order data, is greater than the first set proportion;
when the second set characteristic represents the store identification, the corresponding set condition represents that the number of orders with the same store identification in the second order data is greater than a third set value;
when the second set characteristic represents the user identification, the corresponding set condition represents that the number of orders with the same user identification in the second order data is greater than a fourth set value;
when the second set characteristic represents the order address, the corresponding set condition represents that the number of orders with the same order address in the second order data is greater than a fifth set value;
when the second set characteristic represents the quantity of goods, the corresponding set condition represents that the number of orders in the second order data whose quantity of goods is greater than the sixth set value is greater than the seventh set value.
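The five setting conditions listed above can be sketched as one selection function over known malicious orders. The dictionary keys (`amount`, `store_id`, `user_id`, `address`, `quantity`), the threshold names and all sample values are hypothetical. The order amount condition is read here as a share of the total number of orders, which is an interpretation rather than something the text states unambiguously.

```python
from collections import Counter
from typing import Dict, List


def select_first_features(orders: List[dict], t: Dict[str, float]) -> List[str]:
    """Return the candidate features whose setting condition is satisfied."""
    selected = []
    n = len(orders)

    # Order amount: share of orders above the second set value.
    high = sum(1 for o in orders if o["amount"] > t["second_set_value"])
    if n and high / n > t["first_set_ratio"]:
        selected.append("amount")

    # Store, user and address: number of orders sharing the same value.
    for feature, key in (
        ("store_id", "third_set_value"),
        ("user_id", "fourth_set_value"),
        ("address", "fifth_set_value"),
    ):
        counts = Counter(o[feature] for o in orders)
        if counts and max(counts.values()) > t[key]:
            selected.append(feature)

    # Quantity of goods: number of orders above the sixth set value.
    bulk = sum(1 for o in orders if o["quantity"] > t["sixth_set_value"])
    if bulk > t["seventh_set_value"]:
        selected.append("quantity")
    return selected


# Made-up second order data (orders already known to be malicious).
second_order_data = [
    {"amount": 5000, "store_id": "s1", "user_id": "u1", "address": "a1", "quantity": 50},
    {"amount": 6000, "store_id": "s1", "user_id": "u1", "address": "a2", "quantity": 60},
    {"amount": 100, "store_id": "s1", "user_id": "u2", "address": "a3", "quantity": 1},
    {"amount": 7000, "store_id": "s2", "user_id": "u1", "address": "a4", "quantity": 70},
]
thresholds = {
    "second_set_value": 1000, "first_set_ratio": 0.5,
    "third_set_value": 2, "fourth_set_value": 2, "fifth_set_value": 2,
    "sixth_set_value": 10, "seventh_set_value": 2,
}
first_features = select_first_features(second_order_data, thresholds)
print(first_features)  # ['amount', 'store_id', 'user_id', 'quantity']
```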
The first determining module is specifically configured to:
the first order data is determined based on a set time period.
The third determining module is specifically configured to:
and mapping tensor data in all sub-tensor data blocks in the dense block to the data set to obtain a malicious order in the first order data.
The first setting feature includes at least any one of:
a user identification;
an order address;
a store identification;
the amount of the order;
the number of goods.
In practical applications, the first determining module, the second determining module, the extracting module and the third determining module may be implemented by a processor in an electronic device, such as a Central Processing Unit (CPU), a Digital Signal Processor (DSP), a Micro Control Unit (MCU), or a Field-Programmable Gate Array (FPGA).
It should be noted that: in the malicious order identification apparatus provided in the above embodiment, when performing malicious order identification, only the division of the above modules is used for illustration, and in practical applications, the processing distribution may be completed by different modules according to needs, that is, the internal structure of the apparatus is divided into different modules, so as to complete all or part of the above-described processing. In addition, the malicious order identification device and the malicious order identification method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments and are not described herein again.
Based on the hardware implementation of the program module, in order to implement the method of the embodiment of the present application, an embodiment of the present application further provides an electronic device. Fig. 9 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application, and as shown in fig. 9, the electronic device includes:
a communication interface, capable of information interaction with other devices such as network devices;
and a processor, connected with the communication interface to realize information interaction with other devices, and used for executing, when running a computer program, the method provided by one or more of the technical schemes on the electronic device side. The computer program is stored in the memory.
Of course, in practice, the various components in an electronic device are coupled together by a bus system. It will be appreciated that a bus system is used to enable communications among the components. The bus system includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as a bus system in fig. 9.
The memory in the embodiments of the present application is used to store various types of data to support the operation of the electronic device. Examples of such data include: any computer program for operating on an electronic device.
It will be appreciated that the memory can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferroelectric Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be disk storage or tape storage. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 130 described in the embodiments herein is intended to include, without being limited to, these and any other suitable types of memory.
The method disclosed in the embodiments of the present application may be applied to a processor, or may be implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software. The processor described above may be a general purpose processor, a DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in a memory where a processor reads the programs in the memory and in combination with its hardware performs the steps of the method as previously described.
Optionally, when the processor executes the program, the corresponding process implemented by the electronic device in each method of the embodiment of the present application is implemented, and for brevity, no further description is given here.
In an exemplary embodiment, the present application further provides a storage medium, specifically a computer storage medium, for example, a first memory storing a computer program, where the computer program is executable by a processor of an electronic device to perform the steps of the foregoing method. The computer readable storage medium may be Memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface Memory, optical disk, or CD-ROM.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus, electronic device and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or portions thereof contributing to the prior art may be embodied in the form of a software product stored in a storage medium, and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
The technical means described in the embodiments of the present application may be arbitrarily combined without conflict.
In addition, in the examples of the present application, "first", "second", and the like are used for distinguishing similar objects, and are not necessarily used for describing a specific order or a sequential order.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A malicious order identification method, the method comprising:
determining first order data; the first order data represent order data corresponding to all orders in a first time period;
determining a characteristic data set corresponding to the first order data; the characteristic data set characterizes each first set characteristic and a corresponding characteristic value in at least one first set characteristic corresponding to the first order data;
converting the data in the feature data set into tensor data, and extracting a dense block from the tensor data; the dense block represents at least one sub-tensor data block of which the number of tensor data is greater than a first set value; each sub-tensor data block represents tensor data of which the eigenvalues corresponding to at least one set feature in the at least one set feature are the same;
determining a malicious order of all orders within the first time period based on the extracted dense blocks.
2. The method of claim 1, further comprising:
determining a characteristic value corresponding to each second set characteristic in all second set characteristics corresponding to the second order data; the second order data characterizes order data that have been determined to be a malicious order within a second time period;
in the case where the feature value satisfies the corresponding setting condition, the corresponding second setting feature is determined as one of the at least one first setting feature.
3. The method according to claim 2, wherein when the corresponding second setting characteristic is determined as one of the at least one first setting characteristic in a case where the characteristic value satisfies the corresponding setting condition, the method further comprises:
and determining the setting condition corresponding to the second setting characteristic.
4. The method of claim 3,
when the second set characteristic represents the order amount, the corresponding set condition represents that the proportion of orders in the second order data whose order amount is greater than the second set value, relative to the total number of orders in the second order data, is greater than the first set proportion;
when the second set characteristic represents the store identification, the corresponding set condition represents that the number of orders with the same store identification in the second order data is greater than a third set value;
when the second set characteristic represents the user identifier, the corresponding set condition represents that the number of orders with the same user identifier in the second order data is larger than a fourth set value;
when the second set characteristic represents an order address, the corresponding set condition represents that the number of orders with the same order address in the second order data is larger than a fifth set value;
when the second setting characteristic is the number of the commodities, the corresponding setting condition represents that the number of the orders of which the number of the commodities is larger than the sixth setting value in the second order data is larger than the seventh setting value.
5. The method of claim 1, wherein determining first order data comprises:
the first order data is determined based on a set time period.
6. The method of claim 1, wherein the determining a malicious order of all orders within the first time period based on the extracted dense blocks comprises:
and mapping tensor data in all sub-tensor data blocks in the dense block to the data set to obtain a malicious order in the first order data.
7. The method according to any of claims 1-6, characterized in that the first setting feature comprises at least any of the following:
a user identification;
an order address;
a store identification;
the amount of the order;
the number of goods.
8. A malicious order identification apparatus, comprising:
the first determining module is used for determining first order data; the first order data represent order data corresponding to all orders in a first time period;
the second determining module is used for determining a characteristic data set corresponding to the first order data; the characteristic data set characterizes each first set characteristic and a corresponding characteristic value in at least one first set characteristic corresponding to the first order data;
the extraction module is used for converting the data in the characteristic data set into tensor data and extracting a dense block from the tensor data; the dense block represents at least one sub-tensor data block of which the number of tensor data is greater than a first set value; each sub-tensor data block represents tensor data of which the eigenvalues corresponding to at least one set feature in the at least one set feature are the same;
and the third determination module is used for determining a malicious order in all orders in the first time interval based on the extracted dense blocks.
9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the malicious order identification method of any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the malicious order identification method according to any of claims 1 to 7.
CN202011383524.5A 2020-11-30 2020-11-30 Malicious order identification method and device, electronic equipment and storage medium Pending CN112435068A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011383524.5A CN112435068A (en) 2020-11-30 2020-11-30 Malicious order identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011383524.5A CN112435068A (en) 2020-11-30 2020-11-30 Malicious order identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112435068A true CN112435068A (en) 2021-03-02

Family

ID=74698096

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011383524.5A Pending CN112435068A (en) 2020-11-30 2020-11-30 Malicious order identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112435068A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113435974A (en) * 2021-06-25 2021-09-24 杭州推啊网络科技有限公司 Order filtering method and system
CN115641177A (en) * 2022-10-20 2023-01-24 北京力尊信通科技股份有限公司 Prevent second and kill prejudgement system based on machine learning

Similar Documents

Publication Publication Date Title
CN107330445B (en) User attribute prediction method and device
CN110263821B (en) Training of transaction feature generation model, and method and device for generating transaction features
CN107590688A (en) The recognition methods of target customer and terminal device
CN107886241B (en) Resource analysis method, device, medium, and electronic apparatus
US20190080352A1 (en) Segment Extension Based on Lookalike Selection
CN112989059A (en) Method and device for identifying potential customer, equipment and readable computer storage medium
CN111242318B (en) Service model training method and device based on heterogeneous feature library
CN112435068A (en) Malicious order identification method and device, electronic equipment and storage medium
CN110458644A (en) A kind of information processing method and relevant device
CN112988840A (en) Time series prediction method, device, equipment and storage medium
CN109582906A (en) Determination method, apparatus, equipment and the storage medium of data reliability
CN113592605A (en) Product recommendation method, device, equipment and storage medium based on similar products
CN111291936B (en) Product life cycle prediction model generation method and device and electronic equipment
CN108647714A (en) Acquisition methods, terminal device and the medium of negative label weight
CN115018588A (en) Product recommendation method and device, electronic equipment and readable storage medium
CN109558619B (en) Data processing method, terminal and readable storage medium based on building information model
CN112950347B (en) Resource data processing optimization method and device, storage medium and terminal
CN113591900A (en) Identification method and device for high-demand response potential user and terminal equipment
CN111325372A (en) Method for establishing prediction model, prediction method, device, medium and equipment
CN114723554A (en) Abnormal account identification method and device
CN112307334B (en) Information recommendation method, information recommendation device, storage medium and electronic equipment
CN110570301B (en) Risk identification method, device, equipment and medium
CN110837596B (en) Intelligent recommendation method and device, computer equipment and storage medium
CN112541357A (en) Entity identification method and device and intelligent equipment
CN111523011A (en) Cold and hot wallet intelligent label system based on block chain technology distributed graph calculation engine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination