Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provides a data-quality-aware differentiated incentive method for edge computing.
To achieve this purpose, the invention adopts the following technical solution:
A data-quality-aware differentiated incentive method for edge computing comprises the following steps:
S1, extracting the crowd sensing data uploaded by users, establishing two-dimensional data coordinates from the crowd sensing data, and using the LOF algorithm to calculate the outlier factor of each point relative to the other points in the established two-dimensional data coordinates;
S2, comparing the difference between the outlier factor of each data item and the standard value 1 with a preset range to obtain the quality grade of each data item;
S3, obtaining the user grade corresponding to the quality grade of each data item, and providing the reward information corresponding to the user grade.
Further, in step S1, the LOF algorithm is used to calculate the outlier factor of each point relative to the other points in the established two-dimensional data coordinates, which is expressed as:
$$LOF_k(A) = \frac{\sum_{O \in N_k(A)} \frac{lrd_k(O)}{lrd_k(A)}}{|N_k(A)|}$$
where LOF_k(A) denotes the outlier factor of data A; A denotes one data item uploaded by a user; lrd_k(A) denotes the inverse of the average reachable distance of the points in the k-th neighborhood of point A; O ∈ N_k(A) ranges over all points in the k-th neighborhood of point A; lrd_k(O) denotes the inverse of the average reachable distance for point O; and N_k(A) denotes the k-th neighborhood of point A.
Further, after the difference between the outlier factor of each data item and the standard value 1 is compared with the preset range in step S2, the method further includes:
S21, if the difference between the outlier factor and the standard value 1 is smaller than the preset range, determining that the data corresponding to the outlier factor is high-quality data, and storing the user corresponding to the high-quality data in a set U_g;
S22, if the difference between the outlier factor and the standard value 1 is greater than or equal to the preset range, determining that the data corresponding to the outlier factor is low-quality data, and storing the user corresponding to the low-quality data in a set U_m.
Further, after the data corresponding to the outlier factor is determined to be low-quality data in step S22, the method further includes:
judging whether the accuracy of the low-quality data is higher than a preset threshold; if so, saving the users corresponding to the low-quality data whose accuracy is higher than the preset threshold in the intermediate set between U_g and U_m; if not, saving the users corresponding to the low-quality data whose accuracy is lower than the preset threshold in the set U_m.
Further, step S3 specifically includes:
if the user is in the set U_m, no reward is provided;
if the user is in the intermediate set between U_g and U_m, the user obtains a basic reward δ;
if the user is in the set U_g, the user obtains a reward r_i.
Further, the basic reward δ obtained by the user is represented as:
where R denotes the total reward given by the server; the cardinality of the intermediate set between U_g and U_m denotes the number of users in that set; and r_i denotes the reward obtained by the user.
Further, the reward r_i obtained by the user is specifically determined as follows:
calculating the credibility of the user, where the credibility is expressed as:
ρ_i = k / LOF(i)
where ρ_i denotes the credibility of user i, and k denotes a constant defined by the particular application;
the reward r_i obtained by the user is expressed as:
where the formula uses an estimate of the reward obtained by the user, and W denotes an estimate of the total reward obtained by the users in the set U_g.
Further, in step S3, the reward information corresponding to the user grade is provided as:
where r_i denotes the reward obtained by the user.
Compared with the prior art, the invention has the following beneficial effects:
(1) the invention proposes a differentiated incentive mechanism based on data quality perception, which provides an optimal incentive mechanism that meets task requirements on the basis of data quality and is used to select crowd sensing users in edge computing;
(2) crowd sensing data sources are selected on the basis of data quality, which improves the accuracy of the user-uploaded data collected by the edge server;
(3) when the system distributes rewards, the invention takes the inertia and selfishness of users into account, adopts a multi-class, multi-level (by credibility) distribution scheme, and applies a two-stage Stackelberg game to optimize the distribution, which increases users' enthusiasm to participate.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
Embodiment 1
This embodiment provides a data-quality-aware differentiated incentive method for edge computing, as shown in Fig. 1, including the steps of:
S11, extracting the crowd sensing data uploaded by users, establishing two-dimensional data coordinates from the crowd sensing data, and using the LOF algorithm to calculate the outlier factor of each point relative to the other points in the established two-dimensional data coordinates;
S12, comparing the difference between the outlier factor of each data item and the standard value 1 with a preset range to obtain the quality grade of each data item;
S13, obtaining the user grade corresponding to the quality grade of each data item, and providing the reward information corresponding to the user grade.
In step S11, the crowd sensing data uploaded by users is extracted, two-dimensional data coordinates are established from the crowd sensing data, and the LOF algorithm is used to calculate the outlier factor of each point relative to the other points in the established two-dimensional data coordinates.
Two features (selected by the server) are extracted from the data provided by the users to build a two-dimensional data set, which is plotted as points on two-dimensional coordinate axes. The LOF (Local Outlier Factor) algorithm is then used to calculate the outlier factor of each point relative to the other points. The outlier factor of data A is calculated according to the following formula; the closer LOF_k(A) is to 0, the more likely it is that data A is abnormal data.
The formula is expressed as:
$$LOF_k(A) = \frac{\sum_{O \in N_k(A)} \frac{lrd_k(O)}{lrd_k(A)}}{|N_k(A)|}$$
where LOF_k(A) denotes the outlier factor of data A; A denotes one data item uploaded by a user; lrd_k(A) denotes the inverse of the average reachable distance of the points in the k-th neighborhood of point A; O ∈ N_k(A) ranges over all points in the k-th neighborhood of point A; lrd_k(O) denotes the inverse of the average reachable distance for point O; and N_k(A) denotes the k-th neighborhood of point A.
The quantities involved are as follows:
dis(A, O): the Euclidean distance between point A and point O;
k-th distance (k-distance): the k-th distance of point A is the distance from A to the point that is k-th nearest to A (excluding A itself), denoted d_k(A);
k-th distance neighborhood (k-distance neighborhood): the k-th distance neighborhood of point A is the set of all points within the circle centered at A with the k-th distance as its radius (including points on the circle), denoted N_k(A). The number of points in the k-th neighborhood of A is therefore at least k;
reachable distance (reach-distance): the k-th reachable distance from point O to point A is the larger of dis(A, O) and d_k(A), denoted reach_dis_k(A, O) = max(dis(A, O), d_k(A));
The local reachable density lrd_k(A) of point A is the inverse of the average reachable distance of the points in the k-th neighborhood of point A, and is calculated as follows:
$$lrd_k(A) = 1 \Big/ \frac{\sum_{O \in N_k(A)} reach\_dis_k(O, A)}{|N_k(A)|}$$
Here reach_dis_k(O, A) = max(dis(O, A), d_k(O)), which is the reachable-distance definition above with the roles of the two points exchanged.
lrd_k(A) represents a density: the higher the density, the more points lie around A. A point with a higher density is therefore more likely to belong to the same cluster as its surrounding points, whereas a point with a lower density is more likely to be an outlier.
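The definitions above (k-th distance, k-th neighborhood, reachable distance, local reachable density, and outlier factor) can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the points are made-up two-dimensional features, and the reachable distance follows the usual LOF convention of using the k-th distance of the neighbor. It assumes no two points coincide, since duplicate points would make a reachable distance zero.

```python
import math


def k_distance(points, i, k):
    """d_k(A): distance from points[i] to its k-th nearest other point."""
    dists = sorted(
        math.dist(points[i], points[j]) for j in range(len(points)) if j != i
    )
    return dists[k - 1]


def k_neighborhood(points, i, k):
    """N_k(A): indices of all other points within d_k(A), boundary included."""
    d_k = k_distance(points, i, k)
    return [
        j
        for j in range(len(points))
        if j != i and math.dist(points[i], points[j]) <= d_k
    ]


def lrd(points, i, k):
    """Local reachable density: inverse of the mean reachable distance."""
    neigh = k_neighborhood(points, i, k)
    total = sum(
        # reachable distance between A and neighbor O: max(dis(A, O), d_k(O))
        max(math.dist(points[i], points[o]), k_distance(points, o, k))
        for o in neigh
    )
    return len(neigh) / total


def lof(points, i, k):
    """LOF_k(A): mean ratio of the neighbors' density to A's own density."""
    neigh = k_neighborhood(points, i, k)
    return sum(lrd(points, o, k) for o in neigh) / (len(neigh) * lrd(points, i, k))


# Hypothetical data: four clustered points and one isolated point.
sample = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
factors = [lof(sample, i, 2) for i in range(len(sample))]
```

With this sample, the four clustered points obtain a factor close to the standard value 1, while the isolated point deviates strongly from 1; that deviation is the quantity step S12 compares with the preset range.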
In step S12, the difference between the outlier factor of each data item and the standard value 1 is compared with a preset range to obtain the quality grade of each data item.
The edge server sets a threshold range according to its own data requirements, judges the quality of each data item from its outlier factor, and divides the users into three levels; from high to low data quality these are U_g, an intermediate set, and U_m.
In this embodiment, after the outlier factor of each data item is compared with the preset threshold, the method further includes:
S121, if the difference between the outlier factor and the standard value 1 is smaller than the preset range, determining that the data corresponding to the outlier factor is high-quality data, and storing the user corresponding to the high-quality data in the set U_g;
S122, if the difference between the outlier factor and the standard value 1 is greater than or equal to the preset range, determining that the data corresponding to the outlier factor is low-quality data.
For low-quality data, it is necessary to check whether the low quality was caused by factors such as the acquisition time, the time delay, or the packet loss rate of the data, and to judge the accuracy of the low-quality data, specifically:
judging whether the accuracy of the low-quality data is higher than a preset threshold; if so, the low quality is considered to be caused by factors such as unstable network conditions, and the users corresponding to the low-quality data whose accuracy is higher than the preset threshold are saved in the intermediate set between U_g and U_m; if not, the user is determined to be a malicious user, and the users corresponding to the low-quality data whose accuracy is lower than the preset threshold are saved in the set U_m.
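The three-way decision of steps S121 and S122 plus the accuracy check can be sketched as a single function. This is an illustrative sketch only: the function name and the label "intermediate" for the middle set are hypothetical, the accuracy score is taken as an already-computed input because the text does not say how it is measured, and treating an accuracy exactly equal to the threshold as malicious is an assumption.

```python
def classify_user(lof_value, accuracy, preset_range, accuracy_threshold):
    """Return the user level for one uploaded data item.

    "U_g": high-quality data, |LOF - 1| is smaller than the preset range.
    "intermediate": low-quality data whose accuracy is above the threshold.
    "U_m": low-quality data whose accuracy is not above the threshold.
    """
    if abs(lof_value - 1.0) < preset_range:
        return "U_g"
    if accuracy > accuracy_threshold:
        # Low quality attributed to causes such as unstable network conditions.
        return "intermediate"
    # Assumption: accuracy equal to the threshold is treated like lower accuracy.
    return "U_m"


# Hypothetical inputs: (LOF value, accuracy score) per user.
uploads = {"a": (0.9, 0.0), "b": (0.2, 0.9), "c": (0.2, 0.5)}
levels = {
    user: classify_user(value, acc, preset_range=0.5, accuracy_threshold=0.8)
    for user, (value, acc) in uploads.items()
}
```

Keeping the outlier test and the accuracy test in one function makes the order of the checks explicit: accuracy is only consulted for data that has already failed the outlier test.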
In step S13, the user grade corresponding to the quality grade of each data item is obtained, and the reward information corresponding to the user grade is provided.
The edge server announces the total reward it is willing to provide and distributes the reward according to the users' grades, specifically:
if the user is in the set U_m, no reward is provided;
if the user is in the intermediate set between U_g and U_m, the user obtains a basic reward δ;
if the user is in the set U_g, the user obtains a reward r_i.
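The three distribution rules above amount to a lookup by user level. The sketch below assumes that δ and r_i have already been computed elsewhere; the function name and the level labels are hypothetical.

```python
def reward_for(level, delta, r_i):
    """Reward for one user given the user's level.

    delta is the basic reward for the intermediate set; r_i is the
    reward computed for a user in U_g. Both are supplied by the caller.
    """
    if level == "U_m":
        return 0.0
    if level == "intermediate":
        return delta
    if level == "U_g":
        return r_i
    raise ValueError(f"unknown user level: {level!r}")


# Hypothetical values for illustration.
payout = reward_for("intermediate", delta=0.5, r_i=3.0)
```

Raising an error for an unknown level, rather than silently paying nothing, keeps a misclassified user from being treated as malicious by accident.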
A user in the set U_m is deemed to be a malicious user and is given no reward, and the data uploaded by that user is no longer accepted.
A user in the intermediate set between U_g and U_m uploaded low-quality data for network reasons, so the uploaded data does not help the edge server. However, considering that such a user may still provide high-quality data later, the user is given a basic reward δ to encourage continued uploading. δ is determined by the upper-layer application and is expressed as:
where R denotes the total reward given by the server; the cardinality of the intermediate set between U_g and U_m denotes the number of users in that set; and r_i denotes the reward obtained by the user.
Users in the set U_g contribute the most to edge computing crowd sensing, so they reasonably obtain the largest reward. However, each user is selfish and wants to obtain a larger reward. To address this, the present embodiment adopts a method based on a two-stage Stackelberg game.
First, the credibility of the user is considered, expressed as:
ρ_i = k / LOF(i)
where ρ_i denotes the credibility of user i, and k denotes a constant defined by the specific application.
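As a small illustration of the credibility formula, the following sketch evaluates ρ_i = k / LOF(i). The function name, the default k = 1, and the sample LOF values are assumptions made for the example; the text leaves k to the application.

```python
def credibility(lof_value, k=1.0):
    """Credibility rho_i = k / LOF(i); k is an application-defined constant."""
    if lof_value <= 0:
        raise ValueError("LOF(i) must be positive")
    return k / lof_value


# Hypothetical LOF values for two users.
rho = {user: credibility(value) for user, value in {"a": 0.5, "b": 0.25}.items()}
```

The guard against non-positive values is not part of the text; it only prevents a division by zero in this sketch.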
The utility function of the user is defined as u_i = r_i - c_i, where r_i is the reward obtained by the user and c_i is the cost the user spends uploading data;
the utility function of the MEC (Mobile Edge Computing) server is defined as follows:
where v_j denotes the value that the server obtains once task j is completed.
The two parties to the game are the MEC server and the multiple users collecting data (from U_g). The game is a game of incomplete information, because the server does not know the specific cost each user incurs to participate in a task, only the probability distribution of that cost. Since both parties want to maximize their own utility, the two-stage Stackelberg game proceeds as follows (as shown in Fig. 2):
(1) Game between the MEC server and the users. Game content: according to the cost probability distribution of the users, the MEC server determines the value of the total reward R so that u_s is maximized.
(2) Given R, the users play a game among themselves. Game content: according to the total reward R given by the server, the task participation strategy set of each user is determined so that u_i is maximized.
The solution method is as follows:
Backward induction is used: the equilibrium of the second-stage game is solved first, and then the equilibrium of the first-stage game.
Second-stage game: given R, determine the task participation strategy set of each user so that u_i is maximized.
According to the individual rationality property, u_i = r_i - c_i ≥ 0. Suppose that
Then
Then all users with r_i ≥ r* are willing to participate in the task.
From this, the number of tasks that can be completed is obtained, the completion condition for task j being |U_j| ≥ m_j, so that u_s can be calculated from the definition of the server utility function.
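The second-stage reasoning can be illustrated with a short sketch that applies the individual rationality condition u_i = r_i - c_i ≥ 0 and then the completion condition |U_j| ≥ m_j. The function names, the candidate assignment of users to tasks, and all numbers are hypothetical; in particular, the costs are shown as known values purely for illustration.

```python
def willing_users(rewards, costs):
    """Users whose utility u_i = r_i - c_i is non-negative."""
    return {i for i in rewards if rewards[i] - costs[i] >= 0}


def completed_tasks(assignment, thresholds, willing):
    """Tasks j with |U_j| >= m_j, where U_j keeps only the willing users."""
    done = {}
    for task, users in assignment.items():
        u_j = set(users) & willing
        if len(u_j) >= thresholds[task]:
            done[task] = u_j
    return done


# Hypothetical rewards and costs for five users, and two tasks with m_j = 2.
rewards = {1: 3.0, 2: 3.0, 3: 3.0, 4: 3.0, 5: 3.0}
costs = {1: 1.0, 2: 2.0, 3: 2.5, 4: 4.0, 5: 3.0}
willing = willing_users(rewards, costs)
done = completed_tasks({1: [1, 2], 2: [3, 4, 5]}, {1: 2, 2: 2}, willing)
```

In this example the user whose cost exceeds the offered reward drops out, and both tasks still reach their thresholds with the remaining users.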
First-stage game: each R corresponds to one value of u_s, so the R that maximizes u_s can be derived; whether such an R exists and whether it is unique can be determined geometrically.
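The backward-induction idea can be sketched as a grid search: for every candidate R, the second-stage outcome is evaluated first, and the R with the largest server utility is kept. Everything specific here is an assumption made for illustration and is not taken from the text: a single task, an equal split of R among m participants, a server utility equal to the task value minus R when the task completes and zero otherwise, and user costs given as known numbers rather than as a probability distribution.

```python
def server_utility(total_reward, task_value, costs, m):
    """Hypothetical u_s for one task with threshold m."""
    share = total_reward / m  # assumed equal split among m participants
    # Second stage: a user is willing when u_i = share - c_i >= 0.
    willing = sum(1 for c in costs if share - c >= 0)
    if willing >= m:
        return task_value - total_reward
    return 0.0


def best_total_reward(candidates, task_value, costs, m):
    """First stage: pick the candidate R with the largest server utility."""
    return max(candidates, key=lambda r: server_utility(r, task_value, costs, m))


# Hypothetical numbers: candidate rewards 0.0, 0.5, ..., 10.0.
candidates = [0.5 * n for n in range(21)]
best = best_total_reward(candidates, task_value=10.0, costs=[1.0, 2.0, 3.0, 4.0], m=2)
```

With these numbers the search settles on R = 4.0, the smallest reward at which two users are willing to participate, which matches the intuition that the server wants the task completed at the lowest payment.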
In summary, for a user in the set U_g, the reward r_i obtained by the user is expressed as:
where the formula uses an estimate of the reward obtained by the user, and W denotes an estimate of the total reward obtained by the users in the set U_g.
To summarize, the reward allocated to all users is expressed as:
where r_i denotes the reward obtained by the user;
where R_j denotes the total reward assigned to the users participating in task j; m_j denotes the minimum number of participating users needed for task j to be completed, also called the task threshold;
and
U_j denotes the set of users participating in task j; T_c denotes the set of tasks to be completed; and T is the set of all tasks.
The scenario shown in Fig. 3 is a single-edge-server, multi-task, multi-user scenario. Seven crowd sensing users (users 1-7) submit crowd sensing data to the MEC server, and LOF(i) has already been calculated for the data submitted by every user (the LOF plotting and calculation process is omitted here): LOF(1) = 0.7; LOF(2) = 0.8; LOF(3) = 0.9; LOF(4) = 0.6; LOF(5) = 0.6; LOF(6) = 0.3; LOF(7) = 0.4.
It should be noted that at this point the edge server does not know the specific cost incurred by each user, only the probability distribution of that cost, which is a normal distribution with expectation μ and variance σ².
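Because the server only knows that a user's cost follows a normal distribution, it can reason about participation in terms of probability: a user offered a reward r is willing to participate when c_i ≤ r, which happens with probability Φ((r - μ) / σ). The sketch below computes this with the standard library; the function names and the numeric values are hypothetical.

```python
from statistics import NormalDist


def participation_probability(reward, mu, sigma):
    """P(c_i <= reward) for a cost c_i ~ N(mu, sigma^2), i.e. P(u_i >= 0)."""
    return NormalDist(mu, sigma).cdf(reward)


def expected_participants(n_users, reward, mu, sigma):
    """Expected number of willing users among n_users independent users."""
    return n_users * participation_probability(reward, mu, sigma)


# Hypothetical distribution: mean cost 2.0, standard deviation 1.0.
p_low = participation_probability(1.0, mu=2.0, sigma=1.0)
p_mean = participation_probability(2.0, mu=2.0, sigma=1.0)
```

A reward equal to the mean cost attracts each user with probability one half, and the probability rises with the reward; this is the relationship the server can exploit when choosing R in the first stage.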
The threshold set by the system is 0.6. Assuming that subsequent judgment shows that user 6 uploaded low-quality data because of network packet loss, the users are classified as follows:
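The comparison behind this classification can be worked through by hand. The arithmetic below assumes that the threshold 0.6 plays the role of the preset range in step S12, so that a data item is high-quality when |LOF(i) - 1| < 0.6; this reading is an interpretation and is not stated explicitly in the text.

```latex
\begin{aligned}
|LOF(1)-1| &= |0.7-1| = 0.3 < 0.6 \\
|LOF(2)-1| &= |0.8-1| = 0.2 < 0.6 \\
|LOF(3)-1| &= |0.9-1| = 0.1 < 0.6 \\
|LOF(4)-1| &= |0.6-1| = 0.4 < 0.6 \\
|LOF(5)-1| &= |0.6-1| = 0.4 < 0.6 \\
|LOF(6)-1| &= |0.3-1| = 0.7 \ge 0.6 \\
|LOF(7)-1| &= |0.4-1| = 0.6 \ge 0.6
\end{aligned}
```

Under this reading, users 1 to 5 satisfy the high-quality condition and belong to U_g, while the data of users 6 and 7 is low-quality. User 6 is placed in the intermediate set by the packet-loss assumption stated above, and user 7 would fall into U_m if the accuracy check fails.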
The MEC server has given a set T_c of tasks to be completed, consisting of two tasks, task 1 and task 2. U_1 and U_2 denote the sets of users participating in task 1 and task 2, respectively. Task 1 requires m_1 users to participate for the task to be completed, and task 2 requires m_2 users; here m_1 and m_2 are both assumed to be 2.
The two-stage Stackelberg game then begins. Assume that the total reward the MEC server can give for each task is R units. According to the formula:
according to the individual rationality property, u_i = r_i - c_i ≥ 0. Suppose that
Then
Then all users with c_i ≤ r* are willing to participate in the task. Suppose the cost c_4 of user 4 is greater than r*; this user will then not choose to participate in the task, and among the remaining users those with a smaller c_i are used preferentially. Finally, the data provided by user 1, user 2, user 3, and user 5 is selected, with U_1 = {1, 2}; the users in U_1 and U_2 each receive a reward r_i, so that they continue to provide good-quality data later. Users 4 and 6 only obtain the basic reward δ. User 7 receives no reward, and the MEC server will subsequently refuse to accept the data uploaded by that user.
In summary, this embodiment provides a data-quality-aware differentiated incentive method for edge computing. Based on the quality of the data uploaded by users, it provides an optimal incentive method that satisfies the requirements of both the users and the MEC server in multi-task cooperative applications. By selecting high-quality data sources for the edge server, the method improves the accuracy of crowd sensing data collection and benefits subsequent applications. In addition, rewards are distributed through a two-stage Stackelberg game. For the server, the final payment is minimized, i.e. the server's utility is maximized. For the users, the reward obtained is never less than the cost incurred, which maintains their enthusiasm for participating in crowd sensing data collection; and the higher a user's credibility, the higher the reward obtained, which gives users a strong incentive to keep providing high-quality data to the edge server.
In addition, the incentive method takes into account the influence of sporadic problems on the data uploaded by users. For users who upload low-quality data, a decision process is established to determine the reason for the low quality: truly malicious users are distinguished from non-malicious users on the basis of data collection time, network packet loss rate, upload delay, and similar factors, and non-malicious users are given the basic reward. This encourages such users to keep participating, sends a positive signal to potential crowd sensing users, and helps increase the number of crowd sensing users.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: ROM, RAM, magnetic or optical disks, and the like.
It is to be noted that the foregoing describes only preferred embodiments of the invention and the technical principles employed. Those skilled in the art will understand that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements, and substitutions can be made without departing from the scope of the invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them and may include other equivalent embodiments without departing from its spirit; the scope of the present invention is determined by the appended claims.