CN111028080A - Multi-arm slot machine and Shapley value-based crowd sensing data dynamic transaction method - Google Patents

Multi-arm slot machine and Shapley value-based crowd sensing data dynamic transaction method

Info

Publication number
CN111028080A
Authority
CN
China
Prior art keywords
data
price
worker
buyer
contribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911250169.1A
Other languages
Chinese (zh)
Inventor
徐畅
司雅蕴
祝烈煌
张川
张璨
饶鸿洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN201911250169.1A
Publication of CN111028080A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0206 Price or cost determination based on market factors

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Physics (AREA)
  • Game Theory and Decision Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Algebra (AREA)
  • Technology Law (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a dynamic transaction method for crowd sensing data based on a multi-arm slot machine and the Shapley value, and belongs to the technical field of big data and crowd sensing. The method first uses the Shapley value to determine the marginal contribution of each "worker"'s data to a "buyer", which comprises the direct contribution of new data and the indirect contribution of redundant data. The "buyer" then selects the "worker" with the higher marginal contribution and offers an intended transaction price. To improve the transaction success rate and obtain the maximum return, the "buyer" applies a learning strategy. To resolve the dilemma between offering a high price to guarantee a successful transaction and probing the worker's bottom line to obtain a greater return, a contextual multi-arm slot machine model is used for learning: in each round the strategy selects the best price given the observations so far, and it is gradually adjusted to fit the psychological bottom line of the "worker". The "worker"'s price inferred by the method is expected to be closer to its actual value, so the "buyer" obtains a greater return.

Description

Multi-arm slot machine and Shapley value-based crowd sensing data dynamic transaction method
Technical Field
The invention relates to a dynamic transaction method for data under crowd sensing, in particular to a dynamic data transaction method based on a multi-arm slot machine and the Shapley value, and belongs to the technical field of big data and crowd sensing.
Background
In recent years, with the rapid development of wireless communication and sensor technologies and the rapid spread of wireless mobile intelligent terminals, most smart phones and tablet computers integrate sensing modules with powerful computing and sensing functions, such as a Global Positioning System (GPS) receiver, an accelerometer, a gyroscope, a microphone and a camera, so that people can sense their surroundings and acquire related data anytime and anywhere. A large number of applications based on sensed information continue to emerge, such as environmental monitoring, traffic monitoring and social networking applications. These growing applications have prompted the birth and development of crowd sensing.
In the context of crowd sensing, some organizations (such as meteorological centers and traffic management departments) urgently need real-time distributed data and become the party that purchases data, called the "buyer"; the users who upload perception data through intelligent terminals are the party that sells data, called the "worker". As long as the supply-demand relationship persists, there is always a "buyer" paying for valuable data, which naturally forms a data market. This kind of monetary data transaction mechanism can be regarded as a dynamic zero-sum game: each party seeks to maximize its own interest, which at the same time means a loss to the other. From the "buyer"'s perspective, the aim is to obtain more valuable data at the lowest price.
At present, the data market formed under crowd sensing still has some limitations, so the scenario cannot be fully marketized and the data market cannot be extended to wider applications. It is assumed that "workers" are unable to communicate in the trading market: "workers" cannot see each other's bids or agree on bids with each other, and thus cannot form a sellers' alliance to control prices. That is, the "buyer" can learn all the bids of the "workers" from the market, while each "worker" knows only its own bid and does not know the quotations of the market as a whole.
To maximize the benefit of the "buyer", the performance of different "workers" needs to be measured, and the value of the perception data affects the final decision of the "buyer". Some conventional solutions hold that the performance of a "worker" or the objective quality of the data completely determines the value of the data to a particular "buyer". Others hold that environmental factors, such as the time and location at which the data is collected, have a significant impact on the value of the data. However, these views are all one-sided. The process of data transaction may be divided into multiple time rounds, and as the rounds progress the "buyer" gradually obtains more data. That is, most of the time the "buyer" itself can be considered to hold a stored data set. Under this premise, even the same perception data is likely to be of different value to different "buyers". For example, suppose there are "buyers" A and B, where A already has data 1 and B does not; then data 1 is likely to be worth more to B. Thus, "buyers" tend to choose the data that is more valuable to them.
In addition to the value of data, another important factor in the data market is the final transaction price between buyer and seller. How to determine the transaction price of valuable data is a major challenge in this scenario. One conceivable approach is for buyer and seller to negotiate and, through multiple rounds of discussion, gradually reach a price acceptable to both parties and sign a contract. However, because the communication cost is high, this approach suits only scenarios with few bidding rounds. In crowd sensing the number of participating entities is very large, and in particular the number of "workers" may be much larger than the number of "buyers", so having each "buyer" contract with a large number of "workers" one by one is feasible only in theory.
"Buyers" are more inclined to make decisions based on observed environmental information. The environmental information here refers to the "worker"'s expectation of the transaction price. This value may fluctuate within a certain range and is private to the "worker". Some incentive mechanisms have been designed in the past to encourage "workers" to reveal their desired price on the market. Other schemes assume by default that the expectations are public; in other words, that the "buyer" already fully knows the probability distribution of the "worker"'s transaction price in advance, which is somewhat impractical. Specifically, the "worker" may state a preliminary psychological price when offering the data, but this is not equal to the final transaction price, and information asymmetry between the two parties remains. The "buyer" wishes to directly predict the transaction price closest to the "worker"'s psychological bottom line, without multiple rounds of negotiation, so as to trade successfully and obtain the data at minimal expense.
Disclosure of Invention
The invention aims to overcome the defects of the prior art. To solve the technical problem of how, in a big-data crowd sensing scenario, a data demander ("buyer") can find the optimal data seller ("worker") in the data transaction market over multiple rounds of data transactions and purchase data at a relatively optimal price, the invention provides a dynamic transaction method for crowd sensing data based on a multi-arm slot machine and the Shapley value.
The core of the method is as follows: in a crowd sensing data market where data is traded over multiple rounds, the problem of maximizing the return of the data collector is solved. The Shapley value is first used to determine the marginal contribution of each worker's data to the data collector. This contribution is split into two parts: a direct contribution from new data and an indirect contribution from redundant data. The data collector then selects the worker with the higher marginal contribution and offers an intended transaction price. To improve the transaction success rate and obtain the maximum return, the data collector applies a learning strategy. To resolve the dilemma between offering a high price to guarantee a successful transaction and probing the worker's bottom line to obtain a greater return, a contextual multi-arm slot machine model is used for learning: in each round the strategy selects the best price given the observations so far, and it is gradually adjusted to fit the worker's psychological bottom line.
Advantageous effects
Compared with the prior art, the method of the invention has the following advantages:
1. In data transactions under crowd sensing, the problem of maximizing the return of the data collector is considered, namely how to obtain the maximum return when purchasing data.
2. The value of the perception data is evaluated dynamically. The value of the perception data collected by "workers" in different time rounds is modeled as the Shapley value, i.e. the marginal contribution of a "worker"'s new data set to the "buyer"'s original data set. The marginal contribution includes the direct contribution of new data to the original data set and the indirect contribution of redundant data.
3. A contextual multi-arm slot machine model is used as the pricing model between "buyers" and "workers". Because supply-demand relationships in the market vary over time, the value of data changes with the time round, which is the contextual attribute. The "worker"'s price inferred in this way is expected to be closer to its actual value, so the "buyer" obtains a greater return.
Drawings
FIG. 1 is a diagram of the system model of the method of the present invention;
FIG. 2 is a schematic illustration of the collected and uncollected data of a "buyer" in the method of the present invention;
FIG. 3 shows the direct contribution made by the data of "workers" in the method of the present invention;
FIG. 4 is a graph of the cumulative average revenue over time rounds in the method of the present invention;
FIG. 5 shows the direct contribution of data to a "buyer" in different time rounds in the method of the present invention;
FIG. 6 shows the indirect contribution of data to a "buyer" in the method of the present invention;
FIG. 7 shows the regret of the LinUCB method for different price selection intervals of "workers" in the method of the present invention;
FIG. 8 is a graph of the performance of the LinUCB method for different values of α in the method of the present invention.
Detailed Description
The following describes in further detail embodiments of the method of the present invention with reference to the accompanying drawings and examples.
As shown in FIG. 1, the technical scheme of the dynamic transaction method for crowd sensing data based on a multi-arm slot machine and the Shapley value is as follows:
In the crowd sensing scenario there are two main kinds of entities: "buyers", who purchase perception data, and "workers", who collect and sell perception data. There are many "buyers" and "workers", but the number of "workers" is much greater than the number of "buyers".
Because perception data is in constant demand, there is always a trade relationship between "buyers" and "workers". This scenario can therefore be viewed as a dynamically changing data market. For convenience of description, the transaction process is divided into a plurality of time rounds, and the time taken by one transaction is regarded as one time round.
Over a number of time rounds, the "buyer" gradually accumulates the perception data it requires, i.e. the "buyer" can be regarded as holding a perception data set. Specifically, in one round the "buyer" assesses the data of different "workers", calculates the marginal value of each new perception data set, selects the data set with the highest marginal value, and enters a pre-transaction stage.
The "buyer" spends money to purchase the "worker"'s perception data. To ensure that the transaction succeeds, the "buyer" predicts the "worker"'s psychological price bottom line from the relationship between historical data values and transaction prices, aiming for the highest return in the transaction, where the return is defined as the difference between the value of the data and the transaction price.
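The return defined above can be sketched as a small function. This is only an illustration under stated assumptions: the name `buyer_return` is hypothetical, a transaction is assumed to succeed when the offered price reaches the worker's psychological bottom line, and a failed transaction is assumed to yield a return of zero (the text does not state what a failed round pays).

```python
def buyer_return(value: float, price: float, bottom_line: float) -> float:
    """Return of the buyer for one round (hypothetical helper).

    value:       value of the data to the buyer
    price:       price offered by the buyer
    bottom_line: worker's psychological bottom line (unknown to the buyer)
    """
    if price >= bottom_line:
        # Transaction succeeds: return is value minus transaction price.
        return value - price
    # Assumption: a failed transaction yields no return.
    return 0.0


print(buyer_return(250.0, 130.0, 125.0))  # 120.0
print(buyer_return(250.0, 120.0, 125.0))  # 0.0
```

The closer the offered price is to the bottom line without falling below it, the larger the return, which is the quantity the learning strategy in Step 2 tries to maximize.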
Step 1: Evaluate the value of the perception data.
The specific evaluation method is as follows:
At time round t, all data on the market is defined as
Figure BDA0002308801610000051
The market here does not necessarily refer to the entire market: because of the distance between the two parties and the number of "workers", communicating with "workers" that are too far away or too numerous is clearly impractical. The market therefore refers to a non-empty subset of the original market obtained by segmentation, within which entities can communicate without obstacle.
Let the data set held by "worker" u_i be
Figure BDA0002308801610000052
and the data set held by "buyer" c_j be
Figure BDA0002308801610000053
where 0 < Ω_i << Ω_j < N, and N represents the total amount of data on the market.
The data demand of "buyer" c_j at time round t is defined as
Figure BDA0002308801610000054
Step 1.1: Obtain the direct contribution and the indirect contribution that make up the marginal value.
The Shapley value is used to measure how much benefit the perception data provided by a "worker" can bring to a "buyer".
Define the function
Figure BDA0002308801610000055
where v(N) represents the value of the finite data set N and
Figure BDA0002308801610000056
is the real number field. The marginal contribution of data d_i to a data set
Figure BDA0002308801610000057
is defined as:
Δ_{d_i}(v, S) = v(S ∪ {d_i}) − v(S)    (1)
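Formula (1) can be illustrated with a short sketch. The value function `coverage_value` below is a hypothetical stand-in (it simply counts distinct data items); the method itself does not fix a particular v.

```python
def marginal_contribution(v, S, d):
    """Formula (1): v(S ∪ {d}) - v(S) for a value function v on sets."""
    S = frozenset(S)
    return v(S | {d}) - v(S)


def coverage_value(S):
    # Hypothetical value function: number of distinct data items held.
    return float(len(S))


held = {"d1", "d2"}
print(marginal_contribution(coverage_value, held, "d3"))  # 1.0
print(marginal_contribution(coverage_value, held, "d1"))  # 0.0
```

Under this toy value function a datum the set already contains has zero marginal contribution, which matches the intuition that only new data contributes directly.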
The Shapley value is defined as follows:
Figure BDA0002308801610000058
ψ_i(v, N) is the average of all marginal contributions, i.e. the contribution of new data to the original data set; new data is data that the "worker" holds and the "buyer" does not. For the data held by one "worker", the new data set is denoted
Figure BDA0002308801610000059
For a single datum
Figure BDA00023088016100000510
its direct contribution is:
Figure BDA00023088016100000511
For a "worker", the direct contribution is equal to the sum of the contributions of all its new data, i.e.:
Figure BDA00023088016100000512
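Since the equation images are not reproduced here, the following sketch uses the textbook Shapley value (the weighted average of marginal contributions over all coalitions) rather than the exact notation of the formulas above. The reading of "direct contribution" as the sum of the Shapley values of the worker's new data, with the buyer's data and the new data as players, is an assumption, and `COVERAGE` and `coverage_value` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_value(v, players, i):
    """Exact textbook Shapley value of player i (cost grows exponentially)."""
    others = [p for p in players if p != i]
    n = len(players)
    total = 0.0
    for k in range(len(others) + 1):
        # Weight of each coalition of size k that excludes i: k!(n-k-1)!/n!
        weight = factorial(k) * factorial(n - k - 1) / factorial(n)
        for subset in combinations(others, k):
            S = frozenset(subset)
            total += weight * (v(S | {i}) - v(S))
    return total


def direct_contribution(v, buyer_data, worker_data):
    """Sum of the Shapley values of the worker's new data (assumed reading)."""
    new_data = set(worker_data) - set(buyer_data)
    players = sorted(set(buyer_data) | new_data)
    return sum(shapley_value(v, players, d) for d in new_data)


# Hypothetical value function: number of distinct locations covered.
COVERAGE = {"a": {1, 2}, "b": {2, 3}, "c": {3}}


def coverage_value(S):
    return float(len(set().union(*(COVERAGE[d] for d in S))))


# Buyer holds a and b; the worker offers b and c, so only c is new.
print(direct_contribution(coverage_value, {"a", "b"}, {"b", "c"}))
```

Exact enumeration is only practical for small data sets; for larger ones a sampling approximation would be needed, which is outside what the text describes.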
The indirect contribution is the contribution that redundant data makes indirectly to the "buyer" in the transaction by lowering the price of the same type of data in the market. Redundant data is the portion of the data held by the new "worker" that is also owned by earlier "workers". The indirect value is defined as follows:
Figure BDA0002308801610000061
where
Figure BDA0002308801610000062
denotes the redundant data of "worker" u_i with respect to data collector c_j, Φ_j denotes the set of "workers" whose data the data collector c_j has accessed or purchased, and
Figure BDA0002308801610000063
denotes the data set owned by an earlier "worker" u_l.
Step 1.2: Evaluate the value of the data based on the direct contribution and the indirect contribution.
The total contribution of the data set of a new "worker" u_i to "buyer" c_j is equal to the sum of the direct contribution of the new data in the data set and the indirect contribution of the redundant data:
Figure BDA0002308801610000064
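The bookkeeping in Step 1 can be sketched as follows. The function names are hypothetical, the direct and indirect contribution values are taken as already computed inputs (the indirect-value formula is not reproduced in the text), and redundant data is read as the overlap between the new worker's data and the data of workers the buyer has already dealt with.

```python
def split_worker_data(worker_data, buyer_data, old_workers_data):
    """Split a new worker's data into new data and redundant data."""
    seen_by_old_workers = set().union(*old_workers_data)
    new_data = set(worker_data) - set(buyer_data)
    redundant = set(worker_data) & seen_by_old_workers
    return new_data, redundant


def select_worker(contributions):
    """Pick the worker with the highest total (direct + indirect) contribution.

    contributions maps a worker id to a (direct, indirect) pair.
    """
    totals = {w: direct + indirect for w, (direct, indirect) in contributions.items()}
    best = max(totals, key=totals.get)
    return best, totals[best]


new, redundant = split_worker_data({"d1", "d2", "d3"}, {"d1"}, [{"d1", "d2"}])
print(sorted(new), sorted(redundant))  # ['d2', 'd3'] ['d1', 'd2']
print(select_worker({"u1": (3.0, 0.5), "u2": (2.0, 2.0)}))  # ('u2', 4.0)
```

In the second call worker u2 wins despite a lower direct contribution, which shows why the indirect part cannot simply be dropped when ranking workers.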
Step 2: Evaluate the transaction price of the data.
The specific evaluation method is as follows:
After the data value evaluation, the "buyer" determines the counterparty for the current round, then uses the upper confidence bound model of the multi-arm slot machine algorithm to predict and approach the "worker"'s psychological price bottom line so as to obtain the maximum return. Because the estimate may miss the bottom line, a transaction can either succeed or fail.
In the multi-arm slot machine algorithm, each historical transaction price is defined as an "arm" of the slot machine. For one arm, X_t denotes the sequence of returns obtained by selecting it in the first t rounds, which has the true mean r and the sample mean
Figure BDA0002308801610000065
Figure BDA0002308801610000066
where n represents the number of times the arm has been selected. X_i − r is a σ-sub-Gaussian random variable, and by the Chebyshev inequality:
Figure BDA0002308801610000067
where
Figure BDA0002308801610000068
is the variance of all samples X,
Figure BDA0002308801610000069
is the mathematical expectation of all samples X, and ε is any value greater than 0. Under the Gaussian assumption, the above formula is equivalent to:
Figure BDA0002308801610000071
Rearranging formula (9) gives:
Figure BDA0002308801610000072
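For orientation, the textbook form of this kind of bound is sketched below. It assumes independent samples whose deviations X_i − r are σ-sub-Gaussian, and it is the standard result rather than a transcription of formulas (9) and (10), whose constants and notation may differ.

```latex
% Textbook sub-Gaussian tail bound (assumed form; constants may differ
% from formulas (9) and (10) of the source).
\Pr\left(\hat{\mu} \le r - \varepsilon\right)
  \le \exp\left(-\frac{n\varepsilon^{2}}{2\sigma^{2}}\right),
\qquad \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i .
% Setting the right-hand side equal to \delta and solving for \varepsilon:
\varepsilon = \sqrt{\frac{2\sigma^{2}\ln(1/\delta)}{n}}
\quad\Longrightarrow\quad
r \le \hat{\mu} + \sqrt{\frac{2\sigma^{2}\ln(1/\delta)}{n}}
\ \text{ with probability at least } 1-\delta .
```

The right-hand side of the last inequality is the kind of optimistic estimate that an upper confidence bound uses: the sample mean plus a width that shrinks as the arm is selected more often.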
Meanwhile, at round t the "buyer" has only collected the samples X_1, ..., X_{t−1} of the first t−1 rounds. For each "arm", the largest plausible value of its unknown mean, i.e. the upper confidence bound (UCB), is obtained:
UCB_i(t−1, δ) = ∞, if X_{t−1} = 0    (11)
Figure BDA0002308801610000073
where
Figure BDA0002308801610000074
represents the difference between the predicted upper bound of the revenue and the mean revenue for the current arm. As the number of rounds t increases, ln t increases, which means that the uncertainty of the estimate is larger. Selecting the arm with the highest confidence bound therefore makes the policy exploratory. At the same time, each time an "arm" is selected its selection count increases, which decreases the value of this term and the uncertainty of that arm. As the number of rounds increases, the overall uncertainty is kept within a limited range. The reward of the selected arm gradually approaches the actual expected reward, which means that the selected arm is the best choice given the environmental data collected up to each round.
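The exploration behaviour described above can be sketched with the classic UCB1 index, mean + sqrt(2 ln t / n). This is a standard textbook variant used as an assumption here; the exact width term of formula (12) is not reproduced in the text. The arm means, the unit-variance Gaussian rewards and the function names are all hypothetical.

```python
import math
import random


def ucb_index(mean, n_pulls, t):
    """UCB1 index; an arm that was never selected gets an infinite index."""
    if n_pulls == 0:
        return math.inf
    return mean + math.sqrt(2.0 * math.log(t) / n_pulls)


def run_ucb(arm_means, rounds, seed=0):
    """Simulate UCB1 on Gaussian arms and return how often each arm was chosen."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, rounds + 1):
        indices = [
            ucb_index(sums[a] / counts[a] if counts[a] else 0.0, counts[a], t)
            for a in range(k)
        ]
        arm = max(range(k), key=indices.__getitem__)
        reward = rng.gauss(arm_means[arm], 1.0)  # hypothetical reward model
        counts[arm] += 1
        sums[arm] += reward
    return counts


counts = run_ucb([0.0, 1.0, 3.0], rounds=2000)
print(counts)  # the arm with the highest mean should dominate
```

Every arm is tried at least once because of the infinite index, after which the shrinking width term shifts most selections to the best arm.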
Moreover, since this is a contextual problem, the value of the data may change from round to round. The problem is therefore defined as a contextual multi-arm slot machine model, whose core idea is the multi-arm slot machine algorithm described above. The model has three variables in total. The first is a two-dimensional feature vector X_{t,i} = (v_{t−1}, 1)^T determined by the observed environmental factors, where v_{t−1} denotes the value of a particular datum in round t−1. In addition, I_p denotes the arm with price p,
Figure BDA0002308801610000075
indicates arm I_p; the number of times it is selected in t−1 rounds is
Figure BDA0002308801610000076
F_θ(p) represents the probability that the "worker" accepts price p, and
Figure BDA0002308801610000081
represents an unknown parameter vector.
In the model, the feature vector is the independent variable and the expected reward is the dependent variable. The problem is thus modeled as a linear regression problem, with the mapping between historical feature vectors and rewards serving as training samples. Specifically, when price p_i is selected, let
Figure BDA0002308801610000082
be the price selected in this round. Let D_i ∈ R^{l×2} be the l contexts observed under arm p_i, with:
Figure BDA0002308801610000083
c_i ∈ R^l is the corresponding vector of rewards observed for that price over its n_i rounds. Using the training data (D_i, c_i), the optimal coefficient vector is estimated by least squares:
Figure BDA0002308801610000084
Using ridge regression, this gives:
Figure BDA0002308801610000085
where I_2 is the two-dimensional identity matrix.
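A minimal sketch of the ridge estimate for two-dimensional features follows, using the standard form θ = (DᵀD + I_2)⁻¹ Dᵀc, which is assumed to be what the formula above expresses. The function name and the sample data are hypothetical, and the 2×2 system is solved in closed form to keep the example dependency-free.

```python
def ridge_estimate_2d(contexts, rewards):
    """Solve (D^T D + I_2) theta = D^T c for two-dimensional contexts."""
    a11, a12, a22 = 1.0, 0.0, 1.0  # A starts as the identity I_2
    b1, b2 = 0.0, 0.0
    for (x1, x2), c in zip(contexts, rewards):
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b1 += c * x1
        b2 += c * x2
    det = a11 * a22 - a12 * a12
    # Closed-form inverse of the symmetric 2x2 matrix A applied to b.
    theta1 = (a22 * b1 - a12 * b2) / det
    theta2 = (a11 * b2 - a12 * b1) / det
    return theta1, theta2


# Contexts have the form (v, 1); rewards are hypothetical observations.
contexts = [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]
rewards = [2.0, 4.0, 6.0]
print(ridge_estimate_2d(contexts, rewards))
```

The identity term keeps the system solvable even when only one context has been observed, at the cost of shrinking the estimate slightly toward zero.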
In this model, the expected reward
Figure BDA0002308801610000086
is estimated as
Figure BDA0002308801610000087
and its standard deviation is expressed as
Figure BDA0002308801610000088
where A_{i,t} is a parameter matrix initialized to I_2 and iterated in each round by A_{i,t} ← X_{t,i} X_{t,i}^T until it converges. Therefore, the optimal arm at round t is:
Figure BDA0002308801610000089
for the constant
Figure BDA00023088016100000810
where δ is any value greater than zero.
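The arm selection can be sketched with the standard disjoint LinUCB rule, score = θᵀx + α·sqrt(xᵀA⁻¹x). This sketch rests on assumptions: it uses the usual accumulating update A ← A + x xᵀ and b ← b + reward·x from the published LinUCB algorithm, it treats α as a given constant, and the class and function names are hypothetical.

```python
import math


class LinUCBArm:
    """State of one price arm for two-dimensional contexts (sketch)."""

    def __init__(self):
        self.A = [[1.0, 0.0], [0.0, 1.0]]  # initialized to I_2
        self.b = [0.0, 0.0]

    def _inverse(self):
        (a, b), (c, d) = self.A
        det = a * d - b * c
        return [[d / det, -b / det], [-c / det, a / det]]

    def score(self, x, alpha):
        inv = self._inverse()
        # theta = A^{-1} b, the ridge estimate for this arm
        theta = [inv[r][0] * self.b[0] + inv[r][1] * self.b[1] for r in range(2)]
        mean = theta[0] * x[0] + theta[1] * x[1]
        ax = [inv[r][0] * x[0] + inv[r][1] * x[1] for r in range(2)]
        width = math.sqrt(x[0] * ax[0] + x[1] * ax[1])
        return mean + alpha * width

    def update(self, x, reward):
        # Assumed standard LinUCB update: accumulate the outer product.
        for r in range(2):
            for c in range(2):
                self.A[r][c] += x[r] * x[c]
            self.b[r] += reward * x[r]


def choose_price(arms, x, alpha):
    """Return the price whose arm has the highest upper confidence bound."""
    return max(arms, key=lambda price: arms[price].score(x, alpha))


arms = {100.0: LinUCBArm(), 150.0: LinUCBArm()}
arms[100.0].update((1.0, 0.0), 2.0)
print(choose_price(arms, (1.0, 0.0), alpha=0.0))  # 100.0
```

With α = 0 the rule is purely greedy; larger α gives more weight to arms whose estimates are still uncertain, which is the trade-off FIG. 8 is described as exploring.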
Step 3: Determine the optimal "worker" from whom to purchase data according to the data value evaluation result obtained in Step 1. Then, according to Step 2, obtain the transaction price evaluation result for the selected "worker", determine the optimal price, and purchase the data.
Examples
The embodiment has two parts. The first part covers the many-to-one relationship formed when the data collector determines the seller with the highest profit. The second part covers the one-to-one transaction relationship and illustrates how the data collector uses the LinUCB learning strategy to complete the transaction at a near-ideal price.
In the example there are 10 data items to be collected, with a total of 10 data collectors and 50 "workers"; each round of transaction lasts 10 units of time, for a total of 100 rounds.
FIG. 3 shows the direct contribution of the data held by "workers", with the z-axis representing the amount of data held by each "worker". Clear stratification appears in the graph: different data has different value for different collectors, and because a collector's data set is not empty at the start, the data collector has preferences when selecting "workers". This causes the data held by different "workers" to differ in value, which forms the layers.
FIG. 5 and FIG. 6 show the direct and indirect contributions, respectively, of data to the same collector in different transaction rounds. The value of the data generally does not fluctuate much across transaction rounds, and because the data collector already holds some data, the direct contribution of part of the data is always 0; the indirect contribution is the average of the historical direct contributions of R_{i,j} and D_j. The indirect contribution is small compared with the direct contribution, because the probability that the redundant data of "workers" in a given round coincides with data the collector already holds is small, but the indirect contribution is still not negligible.
In the decision part of the data collector, the value v of the data obtained above is uniformly distributed over the interval [200, 300]; the price θ that the "worker" expects follows the normal distribution N(μ_θ, 1) with μ_θ = v/2; the price offered by the data collector follows the uniform distribution on [0, 400]; a total of 1000 transactions are conducted.
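The sampling setup above can be reproduced with a short simulation. This sketch only draws prices at random from the stated distributions; it is a baseline for comparison, not the LinUCB strategy, and the function name, the seed and the zero return for failed rounds are assumptions.

```python
import random


def simulate_random_pricing(rounds=1000, seed=0):
    """Average return and failure count when prices are drawn at random."""
    rng = random.Random(seed)
    total_return = 0.0
    failures = 0
    for _ in range(rounds):
        v = rng.uniform(200.0, 300.0)    # value of the data
        theta = rng.gauss(v / 2.0, 1.0)  # worker's expected price, N(v/2, 1)
        p = rng.uniform(0.0, 400.0)      # price offered by the collector
        if p >= theta:
            total_return += v - p        # successful transaction
        else:
            failures += 1                # offer below the worker's expectation
    return total_return / rounds, failures


average_return, failures = simulate_random_pricing()
print(average_return, failures)
```

Because random prices often exceed the value of the data or fall below the worker's expectation, this baseline gives a reference point against which a learned pricing strategy can be compared.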
The results of the LinUCB strategy are shown in FIG. 4, with the vertical axis representing the average cumulative revenue of the data collector. In the first 100 rounds the curve oscillates noticeably, and a minimum of −10 occurs near round 100. The revenue curve then starts to increase steadily, and its relationship with the number of rounds T is
Figure BDA0002308801610000091
The revenue is approximately 83.5 at 1000 rounds, but by 3000 rounds it has only increased to 91.1. The figure also shows that every failed round fails because the price p offered by the collector is lower than the price θ expected by the "worker"; failed rounds account for approximately 5% of all rounds.

Claims (3)

1. A dynamic transaction method for crowd sensing data based on a multi-arm slot machine and the Shapley value, characterized in that:
the crowd sensing scenario includes two kinds of entities: "buyers", who purchase perception data, and "workers", who collect and sell perception data; there is always a transaction relationship between "buyer" and "worker"; the transaction process is divided into a plurality of time rounds, and the time of one transaction is regarded as one time round;
step 1: evaluating the value of the perception data;
using the Shapley value, determining the benefit that the perception data provided by each "worker" can bring to the "buyer", i.e. the marginal contribution, which includes two parts, the direct contribution of new data and the indirect contribution of redundant data; evaluating the value of the data according to the marginal contribution, wherein the total contribution of the data set of a new "worker" to the "buyer" is equal to the sum of the direct contribution of the new data in the data set and the indirect contribution of the redundant data;
step 2: evaluating the transaction price of the data by using a multi-arm slot machine algorithm as the pricing model between "buyers" and "workers";
step 3: determining the optimal "worker" from whom to purchase data according to the data value evaluation result obtained in step 1; then, according to step 2, obtaining the transaction price evaluation result for the selected "worker", determining the optimal price, and purchasing the data.
2. The dynamic transaction method for crowd sensing data based on a multi-arm slot machine and the Shapley value according to claim 1, wherein the marginal contribution in step 1 is obtained as follows:
at time round t, all data on the market is defined as
Figure FDA0002308801600000011
the market is a non-empty subset of the original market obtained by segmentation, within which entities can communicate without obstacle;
let the data set held by "worker" u_i be
Figure FDA0002308801600000012
and the data set held by "buyer" c_j be
Figure FDA0002308801600000013
where 0 < Ω_i << Ω_j < N, and N represents the total amount of data on the market;
the data demand of "buyer" c_j at time round t is defined as
Figure FDA0002308801600000014
define the function
Figure FDA0002308801600000015
where v(N) represents the value of the finite data set N and
Figure FDA0002308801600000016
is the real number field; the marginal contribution of data d_i to a data set
Figure FDA0002308801600000017
is defined as:
Δ_{d_i}(v, S) = v(S ∪ {d_i}) − v(S)    (1)
the Shapley value is defined as follows:
Figure FDA0002308801600000021
ψ_i(v, N) is the average of all marginal contributions, i.e. the contribution of new data to the original data set; new data is data that the "worker" holds and the "buyer" does not; for the data held by one "worker", the new data set is denoted
Figure FDA0002308801600000022
for a single datum
Figure FDA0002308801600000023
its direct contribution is:
Figure FDA0002308801600000024
for a "worker", the direct contribution is equal to the sum of the contributions of all its new data, i.e.:
Figure FDA0002308801600000025
indirect contribution, which is the contribution of redundant data to the "buyer" indirectly in the transaction due to the reduced price of the same type of data in the market; redundant data refers to data owned by an old "worker" held in the hand of the new "worker"; the indirect value is defined as follows:
Figure FDA0002308801600000026
where
Figure FDA0002308801600000027
denotes the redundant data of "worker" u_i with respect to "buyer" c_j, phi_j denotes the set of "workers" whose data "buyer" c_j has accessed or purchased, and
Figure FDA0002308801600000028
denotes the data set owned by the old "worker" u_l.
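The marginal contribution of equation (1) and the Shapley value built from it can be illustrated with a minimal Python sketch. It uses the standard textbook Shapley weighting, which may differ in form from the formula shown in the patent figure. The value function `coverage_value`, the item names, and the helper `direct_contribution` are hypothetical and chosen only for illustration; in particular, summing the Shapley values of the new data over the union of the buyer's data and the new data is an assumption about how the worker's direct contribution is formed.

```python
from itertools import combinations
from math import factorial


def marginal_contribution(v, coalition, item):
    """Equation (1): v(S ∪ {d_i}) - v(S)."""
    s = frozenset(coalition)
    return v(s | {item}) - v(s)


def shapley_value(v, items, item):
    """Exact Shapley value of `item`: weighted average of its marginal contributions."""
    others = [d for d in items if d != item]
    n = len(items)
    total = 0.0
    for k in range(len(others) + 1):
        # Textbook weight |S|! (n - |S| - 1)! / n! for coalitions of size k.
        weight = factorial(k) * factorial(n - k - 1) / factorial(n)
        for subset in combinations(others, k):
            total += weight * marginal_contribution(v, subset, item)
    return total


def coverage_value(s):
    """Hypothetical value function: number of distinct regions the data covers."""
    regions = {"a": {1, 2}, "b": {2, 3}, "c": {4}}
    covered = set()
    for d in s:
        covered |= regions[d]
    return float(len(covered))


def direct_contribution(v, buyer_data, worker_data):
    """Assumed form: sum of Shapley values of the worker's data the buyer lacks."""
    new_data = [d for d in worker_data if d not in buyer_data]
    market = list(buyer_data) + new_data
    return sum(shapley_value(v, market, d) for d in new_data)
```

With this toy value function the Shapley values of `a`, `b` and `c` are 1.5, 1.5 and 1.0, which sum to the value of the full set. Exact enumeration grows exponentially with the number of data items, so a practical system would likely need sampling or another approximation.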
3. The method for dynamically trading crowd sensing data based on a multi-arm slot machine and Shapley values according to claim 1, wherein evaluating the bargaining price of the data in step 2 comprises the following steps:
using the upper confidence bound (UCB) model of the multi-arm slot machine algorithm to estimate and approach the "worker's" psychological bottom-line price so as to obtain the maximum return; defining each historically transacted price as an "arm" of the slot machine; for one arm, X_t denotes the sequence of rewards obtained from selecting it in the previous t rounds, which has a true mean r and a sample mean
Figure FDA0002308801600000029
Figure FDA00023088016000000210
where n represents the number of times the arm has been selected; X_i - r is a random variable following a Gaussian distribution with parameter σ; by the Chebyshev inequality:
Figure FDA0002308801600000031
where
Figure FDA0002308801600000032
is the variance of all samples X,
Figure FDA0002308801600000033
is the mathematical expectation of all samples X, and ε is any value greater than 0; under a Gaussian distribution, the above formula is equivalent to:
Figure FDA0002308801600000034
rearranging formula (9) gives:
Figure FDA0002308801600000035
meanwhile, in round t the buyer has only collected the samples X_1 to X_{t-1} of the first t-1 rounds; for each "arm", the largest plausible candidate for the unknown mean of that "arm" is obtained, i.e. the upper confidence bound (UCB):
UCB_i(t-1, δ) = ∞, X_{t-1} = 0    (11)
Figure FDA0002308801600000036
where
Figure FDA0002308801600000037
represents, for the current arm, the difference between the predicted upper limit of the reward and the mean reward;
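As a rough illustration of the upper confidence bound described above, the following Python sketch computes a UCB index per price "arm" and picks the arm with the highest index. The bonus term uses the common textbook form sigma * sqrt(2 ln(1/δ) / n), which is an assumption here since the patent's own expression is only available as a figure. The function names, the `history` dictionary, and the restriction 0 < δ < 1 are likewise illustrative assumptions.

```python
import math


def ucb_index(rewards, delta, sigma=1.0):
    """Upper confidence bound for one arm, given the rewards observed so far."""
    n = len(rewards)
    if n == 0:
        # An arm with no samples gets an infinite index, so it is tried first.
        return math.inf
    mean = sum(rewards) / n
    # Assumed textbook exploration bonus; requires 0 < delta < 1.
    bonus = sigma * math.sqrt(2.0 * math.log(1.0 / delta) / n)
    return mean + bonus


def select_arm(history, delta, sigma=1.0):
    """history maps each candidate price to its list of observed rewards."""
    return max(history, key=lambda price: ucb_index(history[price], delta, sigma))
```

The bonus shrinks as an arm is selected more often, so rarely tried prices keep a chance of being explored while well-sampled prices are judged mostly by their mean reward.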
the model has three variables in total: a two-dimensional feature vector X_{t,i} = (v_{t-1}, 1)^T determined by the observed environmental factors, where v_{t-1} represents the value of a specific datum in round t-1; in addition, I_p denotes the arm with price p,
Figure FDA0002308801600000038
denotes the number of times arm I_p has been selected in the first t-1 rounds, given by
Figure FDA0002308801600000039
F_θ(p) represents the probability that the "worker" accepts price p;
Figure FDA00023088016000000310
represents an unknown parameter vector;
when price p_i is selected, let
Figure FDA00023088016000000311
be the rounds in which this price was selected; let D_i ∈ R^{l×2} be the matrix of the l contexts observed under arm p_i, with:
Figure FDA00023088016000000312
c_i ∈ R^l is the corresponding reward vector observed for the price over the n_i rounds; using the training data (D_i, c_i), the optimal solution of the coefficient vector is estimated by least squares
Figure FDA0002308801600000041
using ridge regression, we have:
Figure FDA0002308801600000042
where I_2 is the two-dimensional identity matrix;
in this model, the expected reward
Figure FDA0002308801600000043
is estimated as
Figure FDA0002308801600000044
and its standard deviation is expressed as
Figure FDA0002308801600000045
where A_{i,t} is a parameter matrix initialized to I_2 and iterated in each round by A_{i,t} ← X_{t,i} X_{t,i}^T until it finally converges;
the optimal arm in round t is:
Figure FDA0002308801600000046
for the constant
Figure FDA0002308801600000047
where δ is any value greater than zero.
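The contextual part of this claim resembles the LinUCB family of algorithms, and a minimal pure-Python sketch of that pattern is given here for orientation. It rests on several assumptions: the ridge estimate is taken as (D^T D + I_2)^{-1} D^T c, the matrix is updated additively (A ← A + X X^T, the usual LinUCB rule, whereas the extracted claim text shows only A ← X X^T), and `alpha` is a hypothetical stand-in for the δ-dependent constant that the claim gives as a figure. The class and function names are illustrative only.

```python
import math


class PriceArm:
    """One candidate price with a ridge-regression reward model on 2-D contexts."""

    def __init__(self, price):
        self.price = price
        self.A = [[1.0, 0.0], [0.0, 1.0]]  # D^T D + I_2, initialized to I_2
        self.b = [0.0, 0.0]                # D^T c

    def _solve(self, y):
        """Return A^{-1} y using the closed-form inverse of a 2x2 matrix."""
        (a, b), (c, d) = self.A
        det = a * d - b * c
        return [(d * y[0] - b * y[1]) / det, (a * y[1] - c * y[0]) / det]

    def theta(self):
        """Ridge estimate of the coefficient vector."""
        return self._solve(self.b)

    def index(self, x, alpha):
        """Estimated reward plus alpha times the standard-deviation term."""
        th = self.theta()
        ainv_x = self._solve(x)
        mean = th[0] * x[0] + th[1] * x[1]
        width = math.sqrt(x[0] * ainv_x[0] + x[1] * ainv_x[1])
        return mean + alpha * width

    def update(self, x, reward):
        """Assumed additive update: A <- A + x x^T, b <- b + reward * x."""
        for r in range(2):
            for c in range(2):
                self.A[r][c] += x[r] * x[c]
            self.b[r] += reward * x[r]


def choose_price(arms, value_prev, alpha):
    """Pick the arm with the highest index for the context (v_{t-1}, 1)^T."""
    x = [value_prev, 1.0]
    return max(arms, key=lambda arm: arm.index(x, alpha))
```

A caller would invoke `choose_price` each round, offer the selected price, and then call `update` on that arm with the observed reward. A larger `alpha` favors exploring prices with few observations.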
CN201911250169.1A 2019-12-09 2019-12-09 Multi-arm slot machine and Shapley value-based crowd sensing data dynamic transaction method Pending CN111028080A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911250169.1A CN111028080A (en) 2019-12-09 2019-12-09 Multi-arm slot machine and Shapley value-based crowd sensing data dynamic transaction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911250169.1A CN111028080A (en) 2019-12-09 2019-12-09 Multi-arm slot machine and Shapley value-based crowd sensing data dynamic transaction method

Publications (1)

Publication Number Publication Date
CN111028080A (en) 2020-04-17

Family

ID=70208164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911250169.1A Pending CN111028080A (en) 2019-12-09 2019-12-09 Multi-arm slot machine and Shapley value-based crowd sensing data dynamic transaction method

Country Status (1)

Country Link
CN (1) CN111028080A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112256739A (en) * 2020-11-12 2021-01-22 同济大学 Method for screening data items in dynamic flow big data based on multi-arm gambling machine
CN112668721A (en) * 2021-03-17 2021-04-16 中国科学院自动化研究所 Decision-making method for decentralized multi-intelligent system in general non-stationary environment
WO2023082969A1 (en) * 2021-11-11 2023-05-19 重庆邮电大学 Data feature combination pricing method and system based on shapley value and electronic device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112256739A (en) * 2020-11-12 2021-01-22 同济大学 Method for screening data items in dynamic flow big data based on multi-arm gambling machine
CN112256739B (en) * 2020-11-12 2022-11-18 同济大学 Method for screening data items in dynamic flow big data based on multi-arm gambling machine
CN112668721A (en) * 2021-03-17 2021-04-16 中国科学院自动化研究所 Decision-making method for decentralized multi-intelligent system in general non-stationary environment
CN112668721B (en) * 2021-03-17 2021-07-02 中国科学院自动化研究所 Decision-making method for decentralized multi-intelligent system in non-stationary environment
WO2023082969A1 (en) * 2021-11-11 2023-05-19 重庆邮电大学 Data feature combination pricing method and system based on shapley value and electronic device

Similar Documents

Publication Publication Date Title
Jiao et al. Toward an automated auction framework for wireless federated learning services market
US7536338B2 (en) Method and system for automated bid advice for auctions
Boyacı et al. Pricing when customers have limited attention
CN111028080A (en) Multi-arm slot machine and Shapley value-based crowd sensing data dynamic transaction method
US7627514B2 (en) Method and system for selecting an optimal auction format
US20070043770A1 (en) Discovery method for buyers, sellers of real estate
JPH11504455A (en) Negotiating Network Using Satisfaction Density Profile
JP2003529139A (en) Efficient portfolio sampling method and system for optimal underwriting
Lim et al. Incentive mechanism design for resource sharing in collaborative edge learning
Gupta et al. A hybrid approach for constructing suitable and optimal portfolios
JP2003526147A (en) Cross-correlation tool to automatically calculate portfolio description statistics
CN112966189A (en) Fund product recommendation system
CN110634043A (en) Supply and demand matching model obtaining method, supply and demand matching method, platform and storage medium
CN109242533A (en) The online motivational techniques of car networking intelligent perception user based on Game Theory
An et al. Crowdsensing data trading based on combinatorial multi-armed bandit and stackelberg game
Ward et al. Developing competitive bids: a framework for information processing
WO2001033464A1 (en) Customer demand-initiated system and method for on-line information retrieval, interactive negotiation, procurement, and exchange
CN110930259A (en) Creditor right recommendation method and system based on mixed strategy
CN110533528A (en) Assess the method and apparatus of business standing
CN115271092A (en) Crowd funding incentive method for indoor positioning federal learning
Carmona et al. High frequency market making
CN110298684A (en) Vehicle matching process, device, computer equipment and storage medium
US10402921B2 (en) Network computer system for quantifying conditions of a transaction
KR20050087766A (en) System and method for evaluating brand value based on the internet
Tang et al. Competitive-Cooperative Multi-Agent Reinforcement Learning for Auction-based Federated Learning.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200417)