Differential Privacy-Based Double Auction for Data Market in Blockchain-Enhanced Internet of Things
With the rapid development of the Internet of Things (IoT), large amounts of data are being collected, and these data constitute a valuable business resource. A suitable IoT data market is therefore needed to provide safe and effective trading services for multiple buyers and sellers. This paper introduces a blockchain-supported IoT data market framework and focuses on a transaction scheme for multiple buyers and sellers. The mechanisms in the scheme determine the data providers for each buyer, the data recipients for each seller, and the transaction prices of both parties. While the data market runs, an inference attack can leak bid information, so we also study a transaction scheme that protects bids with differential privacy based on an exponential mechanism. We theoretically prove the individual rationality, weak budget balance, and truthfulness of both the normal transaction scheme and the differential privacy-based transaction scheme, as well as the effectiveness of the differential privacy protection for the bids of transaction participants. Furthermore, we evaluate the performance of the two schemes through numerical simulations, which show that both schemes achieve reasonable social welfare at an acceptable computational overhead.
With the widespread use of the Internet of Things (IoT), large amounts of data are collected and stored. These data are shared with parties such as data analysts, IoT service providers, and artificial intelligence developers, who wish to maximize their benefits, which has given rise to the IoT data market [3, 4]. Such a market generally operates online, allowing buyers and sellers to enter at any time, and it satisfies properties such as individual rationality and budget balance.
Buyers benefit from acquired data only after further analysis and decision-making, which requires the data to have integrity, authenticity, and security. In addition, IoT data are often collected by sensors at different locations and stored nearby on a local edge server or base station. Data transactions must therefore be distributed to suit these characteristics of IoT data, which matches the characteristics of blockchain technology and has spurred research interest in blockchain-based IoT data markets [6–9].
The key problem in the IoT data market is how to determine transaction prices efficiently and reasonably [10–12]. Liu et al. studied a price optimization mechanism for the data market in the context of blockchain-enhanced IoT, where the abstract objects are multiple sellers and one buyer, and a two-stage Stackelberg game is used to solve the pricing and purchasing problem of the data consumer and market agency. However, a data market generally has multiple sellers and multiple buyers, whose efficient pricing and purchasing is a more general problem, which this paper approaches with a double auction.
Current blockchain-based data markets do not consider privacy leakage in transactions, so price information can easily leak under an inference attack. Differential privacy-based methods show promise for preventing such leakage while keeping computation and communication overhead low and auction performance good. We adopt differential privacy in auctions to protect the commercial interests of both parties in a transaction.
This paper addresses the above problems from three aspects:
(1) We describe the framework of a blockchain-enhanced IoT data market and propose a double-auction normal transaction method (DANTM) for multiple sellers and buyers.
(2) To protect the price privacy of all parties, we upgrade DANTM to a double-auction transaction method based on differential privacy (DADPM). We demonstrate that our algorithms are individually rational, weakly budget-balanced, and truthful and prove that DADPM preserves the price privacy of buyers and sellers.
(3) Simulations show that the proposed pricing schemes have the desired properties at a low time cost.
The remainder of this paper is structured as follows. Section 2 reviews related research. Section 3 introduces a blockchain-enhanced IoT data market framework and a related data transaction model. Section 4 describes a double-auction scheme for data transactions in the IoT data market. Section 5 presents a differential privacy-based double-auction scheme. In Section 6, we evaluate our proposed schemes. We conclude our work in Section 7.
2. Related Work
With the development of the IoT, more and more data are collected from various IoT node devices, and these data are important assets. IoT data have real-time and privacy requirements, and research on IoT data markets is ongoing [14, 15]. Blockchain, as an information infrastructure that supports credible and distributed transactions, has been considered for this purpose [16–19]. However, because IoT resources are limited, data markets that apply blockchain technology are constructed by coordinating cloud and edge servers. We present a data market framework and trading process in this context.
Much effort has been made to develop secure, efficient, low-complexity pricing models for the IoT data market [13, 20–22]. Wang et al. presented two pricing models for data transactions in device-to-device communication networks: a Stackelberg game based on one buyer and multiple sellers, and an alternative ascending clock auction based on one seller and multiple buyers. Liu et al. formulated a two-stage Stackelberg game to solve the pricing and purchasing problem of one buyer and multiple sellers, considered competition between sellers, and proposed a competition-enhanced pricing scheme. However, these works provide no pricing model for multiple sellers and buyers in the IoT data market, which is the more general case.
An auction market can be run fairly and efficiently via a trading process. A simple auction has one seller and multiple buyers, and a double auction has multiple sellers and buyers. In resource trading, there is research on power and spectrum that adopts auction-based pricing and sometimes considers the privacy protection of transaction information [5, 23]. Zhu and Shin presented a differentially private and strategy-proof spectrum auction mechanism with approximate revenue maximization. Li et al. proposed a differential privacy-based online double-auction scheme for energy trading in the smart grid, consisting of a Laplace-based winner determination rule and an exponential-based allocation rule. However, data differ from energy: energy can be consumed only once, whereas data can be consumed multiple times, and energy does not vary in quality, whereas data do. Our differential privacy-based double auction therefore differs from previous schemes.
3. Framework, Model, and Desired Properties
3.1. Blockchain-Enhanced Data Market Framework for IoT
Figure 1 displays the system architecture of a blockchain-enhanced IoT data market, targeting the challenges of security and efficiency. The components are IoT sensors, an edge server, a base station, a cloud server, and a data user. IoT sensors collect original data and upload them to the edge server, which refines them and uploads them to a nearby base station, where they are stored for selling. Base stations are always connected to a cloud server through wired networks. The formal public data trading platform is set up on the cloud server. Data users can buy IoT data using a web browser. We set up the blockchain on the base station and cloud server, using a consortium blockchain for efficiency and data protection.
In this system, base stations and cloud servers are the core part of the blockchain, in charge of commercial data storage and transactions. Base stations act as sellers, and users as buyers. Data trading rules are predefined in the blockchain as smart contracts. We design algorithms as data trading rules and protect the privacy of every participant. The IoT data market is a distributed system. Data can be sent from a base station directly to a buyer, which assures efficient trading. Blockchain records and stores transaction data and maintains the authenticity of transactions through its consensus mechanism. We adopt the proof-of-work (PoW) consensus protocol. Data transmission from seller to buyer can be assured by blockchain’s key mechanisms. The process of blockchain-enhanced IoT data trading is presented in Figure 2.
We assume the following actions in Figure 2 occur in a time slot:
(1) Buyers (data users) and sellers (base stations) submit buying and selling requirements to the data trading platform (cloud server).
(2) The data trading platform runs data trading algorithms and decides on a trading scheme.
(3) The first winning buyer (data user 1) and its data providers (base stations 1 and 2) are notified of the trading result.
(4) The second winning buyer (data user 2) and its data providers (base stations 2 and 3) are notified of the trading result.
(5) The first winning buyer (data user 1) and its data providers (base stations 1 and 2) directly implement data trading.
(6) The second winning buyer (data user 2) and its data providers (base stations 2 and 3) directly implement data trading.
(7) The blockchain network audits the transaction data.
3.2. Designed Auction Model
In our data market, multiple buyers and sellers are involved in data transactions. A double-auction scheme is adopted to efficiently match the requirements of buyers and sellers. One trading process is finished in a time slot.
Buyers and sellers that need data transactions in a time slot are referred to as active buyers and sellers. A data buyer’s bid information includes data kind, bid, data quality, and data quantity. A data seller’s asking information includes data kind, asking price, data quality, and data quantity.
We assume below that the data market will match the data kind through a smart contract in blockchain, allow buyers and sellers of the same data kind to meet, and start the process of a double auction. Table 1 lists the key notation used in our paper.
3.3. Desired Properties of Model
Definition 1. (Seller payoff). The payoff to the winning seller for time slot is
Definition 2. (Buyer payoff). The payoff to the buyer in time slot is
Definition 3. (Social welfare). Social welfare in time slot is
Social Welfare Maximization: in double-auction markets, the goal is to maximize the total social welfare, i.e.,
Individual Rationality: each seller and buyer must receive a nonnegative payoff, i.e.,
In other words, a winning seller is not rewarded with less than its asking price, and a winning buyer is not charged more than its bid.
Weak Budget Balance: the auction process for the data market in a time slot is weakly budget-balanced if the total amount paid by the winning buyers is at least the total amount paid to the winning sellers, which ensures that the auctioneer does not run a deficit.
Truthfulness: an auction is truthful if and only if no bidder can obtain a higher payoff by submitting a false bid instead of its true bid. This property ensures that bidders obtain their maximum payoff when they report their bids truthfully.
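The properties above can be illustrated with a minimal Python sketch. The payoff formulas used here are assumptions, since the original expressions are not reproduced in this text: the buyer payoff is taken as (bid minus price charged) times quantity, the seller payoff as (price received minus asking price) times quantity, and social welfare as the sum of all payoffs. The `Trade` record and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    """One settled trade between a buyer and a seller (illustrative record)."""
    bid: float           # buyer's bid per data unit
    ask: float           # seller's asking price per data unit
    buyer_price: float   # price per unit charged to the buyer
    seller_price: float  # price per unit paid to the seller
    quantity: float      # number of data units traded


def buyer_payoff(t: Trade) -> float:
    # Assumed form: value of the data to the buyer minus what it pays.
    return (t.bid - t.buyer_price) * t.quantity


def seller_payoff(t: Trade) -> float:
    # Assumed form: what the seller receives minus its asking price.
    return (t.seller_price - t.ask) * t.quantity


def social_welfare(trades) -> float:
    # Assumed form: sum of buyer and seller payoffs over all trades.
    return sum(buyer_payoff(t) + seller_payoff(t) for t in trades)


def individually_rational(trades) -> bool:
    # No seller is paid less than its ask; no buyer is charged more than its bid.
    return all(t.seller_price >= t.ask and t.buyer_price <= t.bid for t in trades)


def weakly_budget_balanced(trades) -> bool:
    # The auctioneer's surplus (collected minus paid out) must be nonnegative.
    surplus = sum((t.buyer_price - t.seller_price) * t.quantity for t in trades)
    return surplus >= 0


example = [Trade(bid=10, ask=4, buyer_price=8, seller_price=6, quantity=3)]
print(social_welfare(example), individually_rational(example),
      weakly_budget_balanced(example))
```

Under these assumed definitions, checking a scheme's output against `individually_rational` and `weakly_budget_balanced` is a quick sanity test, not a substitute for the proofs in Sections 4.4 and 5.2.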
4.1. Double Auction with Valid Price Mechanism
We present the concept of a valid price.
Definition 4. (Valid Price). A valid price is a threshold value of a buyer’s bid and a seller’s asking price. It can determine whether a buyer or seller wins in an auction. A valid price is determined by bids, asking prices, and data quality of buyers and sellers.
In a double-auction data market, the active buyers and sellers in a time slot submit their bidding information to the data market administrator, which can be realized as a smart contract in the blockchain. The data market administrator assigns seller-provided data to each buyer through Algorithms 1–3. For buyers, the following information is obtained: the winning buyer and its trading price, its data providers, and whether its requirement is satisfied. For sellers, the following information is obtained: the winning sellers, their trading price, the corresponding buyer, and the amount of data provided. The results are described in a uniform format. Algorithm 1 (DANTM) describes the auction scheme. We use Algorithm 2 to calculate the valid price for each active buyer and seller and use it to filter the active buyers and sellers into candidate participants. We sort the candidate buyers in descending order of bid and the candidate sellers in ascending order of asking price. We then call Algorithm 3 to obtain a winning buyer and the related trading results, delete that buyer from the active buyers, and serve the next active buyer.
4.2. Valid Price Algorithm
Algorithm 2 is used to calculate a valid price. We match the prices of active buyers and sellers in to obtain the refined active buyers and sellers , and call (see below) to match the data quality of among , so as to select the proper prices of buyers and sellers and use them as valid price for buyers (when ) and sellers (when ) at slot .
We provide an intuitive explanation of Algorithm 2 in Figure 3. To calculate the valid price, Algorithm 2 first ranks active buyers in descending order of price and active sellers in ascending order of price. On this basis, the prices of active buyers and active sellers are matched. As shown on the left of Figure 3, the buyer price (marked red) and the seller price (marked green) are the critical points, and the buyer and seller starting from the critical points to the left become refined active buyers and sellers. Next, refined active sellers are sorted in descending order of data quality and refined active buyers in ascending order of data quality. Then, the data quality of refined active buyers and refined active sellers are matched. As shown on the right of Figure 3, the buyer data quality (marked red) and the seller data quality (marked green) are the critical points, and we take the prices of the buyer and the seller located at the critical points as the corresponding valid price.
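The first (price-matching) stage described above can be sketched in Python as follows. This is only an illustration under an assumption: the critical points are taken to be the last rank at which the sorted bid is still at least the sorted asking price, which is a common break-even rule in double auctions and may differ from the exact rule in Algorithm 2. The quality-matching stage is not covered, and the function name `match_prices` is hypothetical.

```python
def match_prices(bids, asks):
    """Return the refined bids and asks after rank-wise price matching.

    Assumption: buyers are ranked by descending bid, sellers by ascending
    ask, and a pair at the same rank is kept while bid >= ask.
    """
    sorted_bids = sorted(bids, reverse=True)  # highest bid first
    sorted_asks = sorted(asks)                # lowest ask first

    k = 0  # number of ranks at which the bid still covers the ask
    for bid, ask in zip(sorted_bids, sorted_asks):
        if bid < ask:
            break
        k += 1

    # Everything to the left of the critical rank is kept.
    return sorted_bids[:k], sorted_asks[:k]


refined_bids, refined_asks = match_prices([9, 5, 7, 3], [4, 8, 6, 2])
print(refined_bids, refined_asks)
```

In this example the sorted bids are 9, 7, 5, 3 and the sorted asks are 2, 4, 6, 8, so the first two ranks are kept and the third (5 against 6) is where matching stops.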
4.3. Normal Data Trading Method
We provide the normal data trading method as Algorithm 3. In the data trading process, we select the buyer with the highest price in as the winning buyer and assign the trading price for the sellers providing data. We record the data trading information for the winning buyer and for the sellers providing data (i.e., the winning sellers). Each winning seller provides all the data it can (), except the last winning seller, which provides only the amount that the winning buyer still needs (); a nonnegative value of this quantity indicates that the winning buyer has obtained all the data it demanded.
As shown in Algorithm 3, when the number of candidate buyers exceeds 1, the trading price of the winning buyer is equal to the second highest price among the bids of these candidate buyers. When only one candidate buyer exists, the trading price of the winning buyer is equal to its bid. The former case follows the Vickrey-Clarke-Groves auction model  to ensure the truthfulness of our algorithm.
The elements in and are sorted in ascending order of asking price, where represents the winning data sellers and represents the remaining sellers (nonwinning data providers). When there exist elements in , we use the asking price of the first element in it as the trading price of all the sellers in . When there are no elements in , we use the asking price of the last element in as the trading price of all the sellers in . In this pricing scheme, data provided to a buyer by different sellers are given the same price, reflecting the principle of fairness.
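The pricing and allocation rules described in this subsection can be sketched as follows. This is a simplified illustration, not Algorithm 3 itself: it ignores data kind and quality, assumes the inputs are already the candidate buyers and sellers that passed the valid-price filter, and uses the hypothetical name `normal_trade` with sellers given as `(asking_price, quantity)` pairs.

```python
def normal_trade(candidate_bids, demand, candidate_sellers):
    """Serve the highest bidder once and return
    (buyer_price, seller_price, allocations, unmet_demand)."""
    if not candidate_bids or not candidate_sellers:
        raise ValueError("need at least one candidate buyer and one candidate seller")

    # Buyer side: the highest bidder wins and pays the second-highest bid,
    # or its own bid when it is the only candidate buyer.
    bids = sorted(candidate_bids, reverse=True)
    buyer_price = bids[1] if len(bids) > 1 else bids[0]

    # Seller side: cheapest sellers serve first; each gives all it can,
    # and the last one gives only what the buyer still needs.
    sellers = sorted(candidate_sellers, key=lambda s: s[0])
    remaining = demand
    allocations = []
    for ask, quantity in sellers:
        if remaining <= 0:
            break
        provided = min(quantity, remaining)
        allocations.append((ask, provided))
        remaining -= provided

    # Uniform seller price: ask of the first nonwinning seller if one exists,
    # otherwise ask of the last winning seller.
    left = sellers[len(allocations):]
    seller_price = left[0][0] if left else allocations[-1][0]
    return buyer_price, seller_price, allocations, remaining


print(normal_trade([10.0, 8.0, 6.0], 5,
                   [(3.0, 2), (4.0, 2), (5.0, 4), (7.0, 1)]))
```

The sketch does not itself enforce that the seller price stays below the buyer price; in the scheme described here that guarantee comes from the valid-price filtering performed before this step.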
As shown in Figure 4, a data trading start pointer is adopted to indicate where a provider starts providing data. By default, the provider provides data starting from the location indicated by the pointer. Among the providers serving a buyer, the last one may need to supply only part of the data it can provide, so the cutoff point is recorded and fed back to that provider.
4.4. DANTM Scheme Analysis
We theoretically analyze the DANTM scheme properties.
Lemma 5. The DANTM scheme is individually rational.
Proof. For the buyer, the DANTM scheme considers two cases. First is , where winning buyer is assigned trading price as the second-highest bid from . Since , there exists . Second is , where winning buyer is assigned trading price as bid of itself, so . Hence, the scheme is individually rational for buyers.
For the seller, trading price of the winning sellers in is assigned as asking price of the first seller in . Since the asking prices of sellers in are sorted in ascending order, . Therefore, the scheme is individually rational for sellers.
Lemma 6. The DANTM scheme is weakly budget-balanced.
Proof. A winning buyer is provided data by a group of winning sellers with a single trading price. The data quantity is the same for both the buyer and sellers. To prove the lemma, we only need to show that . From the DANTM scheme, winning buyers and sellers come from candidate buyers/sellers, and .
Lemma 7. The DANTM scheme is truthful.
Proof. We consider two cases for a buyer. First is . If the initial bid of the winning buyer is lower than its real value , the buyer loses the opportunity to first become a candidate buyer and receives utility . If the initial bid is greater than , a successful transaction may bring about negative utility because the transaction price is greater than . In the second case, , the secondary bid is selected as , in accordance with the Vickrey second price auction rule, which is known to be truthful .
The DANTM scheme also considers two cases for a seller. First, only one qualified seller provides data to the buyer. A seller with an asking price greater than its real value loses the opportunity to become a winning seller and receives utility . If the asking price is less than , a successful transaction may have utility because is less than . Second, multiple qualified sellers provide data to the same buyer. Among them, the offer with the highest asking price, following the Vickrey second price auction rule, guarantees a truthful ask , and other winning sellers will not get more utility. A seller whose asking price exceeds loses the opportunity to become a winning seller, and , or becomes the last winning one, which may provide only limited data when , and this affects its utility . If a seller gives an asking price below its real value , its utility is not affected.
Theorem 8. The DANTM scheme is individually rational, truthful, and weakly budget-balanced.
5.1. Differential Privacy Data Trading Method
An inference attack can occur in a double auction; when it succeeds, data buyers or sellers can infer the bidding information of other buyers or sellers, which compromises their privacy. Protecting privacy requires an obstacle that prevents original bids from being guessed from data trading results, and differential privacy provides one. Under differential privacy, if two nearly identical inputs are given to a function, the probability distributions of the corresponding outputs differ by only a bounded amount. We define it as follows.
Definition 9 (Differential Privacy). A function has -differential privacy (-differential privacy) if, for any two input sets and with a single input difference, the outputs are within a fixed range , where and are small positive values.
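For reference, the standard condition for approximate differential privacy from the literature can be written as below. The symbols are generic and assumed here, since the notation of Definition 9 is not reproduced in this text: $\mathcal{M}$ is the randomized function, $D$ and $D'$ are two inputs that differ in a single entry, $S$ is any set of outputs, and $\varepsilon$ and $\delta$ are the two privacy parameters.

```latex
\Pr\left[\mathcal{M}(D) \in S\right] \;\le\; e^{\varepsilon}\,\Pr\left[\mathcal{M}(D') \in S\right] + \delta
```

Smaller values of $\varepsilon$ and $\delta$ force the two output distributions closer together, which makes it harder to infer any single bid from the auction result.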
To maintain the privacy of bids for both winning buyers and sellers, we randomly select them from candidate buyers and sellers, while preserving some valuable properties. We present an exponential-based privacy preserving mechanism to choose winning buyers and sellers.
We calculate the quality value for candidate buyers and determine the probability distribution of winning buyers. We want the buyer with a higher bid to be the winning buyer with priority. As the bids of candidate buyers are arranged in descending order, is the highest bid. We set the quality value of buyers as determine the probability distribution of winning buyers, where is the set of candidate buyers, and choose a winning buyer.
We choose a method to calculate the quality value for sellers. We want a seller with a lower asking price to be the winning seller with priority. The asking prices of candidate sellers are arranged in ascending order, so is the lowest asking price. The quality value of a seller is
We calculate the probability distribution of winning sellers among candidate sellers and choose a winning seller.
We assume that the quality function of buyer (seller) is bounded by , and the difference between maximum and minimum value of the quality function of buyers (sellers) is ().
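Winner selection with an exponential mechanism can be sketched in Python as follows. The sketch uses the textbook form, in which a candidate is chosen with probability proportional to exp(epsilon × quality / (2 × sensitivity)). The quality functions are assumptions because the original expressions are not reproduced in this text: the bid itself is used for buyers and the negated asking price for sellers, so that higher bids and lower asks are favored. All names are hypothetical.

```python
import math
import random


def exponential_mechanism_probs(quality, epsilon, sensitivity):
    """Selection probabilities proportional to exp(eps * q / (2 * sensitivity))."""
    scale = epsilon / (2.0 * sensitivity)
    q_max = max(quality)
    # Subtracting q_max leaves the probabilities unchanged and avoids overflow.
    weights = [math.exp(scale * (q - q_max)) for q in quality]
    total = sum(weights)
    return [w / total for w in weights]


def sample_winner(quality, epsilon, sensitivity, rng=random):
    """Draw the index of one winner according to the exponential mechanism."""
    probs = exponential_mechanism_probs(quality, epsilon, sensitivity)
    return rng.choices(range(len(quality)), weights=probs, k=1)[0]


# Buyers: assumed quality is the bid (higher bid, higher probability).
bids = [9.0, 7.0, 5.0]
buyer_probs = exponential_mechanism_probs(bids, epsilon=1.0, sensitivity=4.0)

# Sellers: assumed quality is the negated ask (lower ask, higher probability).
asks = [2.0, 4.0, 6.0]
seller_probs = exponential_mechanism_probs([-a for a in asks],
                                           epsilon=1.0, sensitivity=4.0)
print(buyer_probs, seller_probs)
```

A larger `epsilon` concentrates probability on the best candidate and so behaves more like the deterministic scheme, while a smaller `epsilon` spreads probability more evenly and gives stronger privacy at some cost in social welfare.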
According to Theorem 9.36 in , when a mechanism is truthful, there is a critical value such that if a buyer’s bid is higher than the critical value, the buyer wins and its trading price is equal to the critical value; if the bid is less than the critical value, the buyer loses in the transaction. In our setting, the bid threshold is exactly our valid price, and this is the trading price of the winning buyer. We similarly use the valid price as the trading price of the winning seller.
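The critical-value argument can be illustrated with a short Python sketch for the buyer side. It assumes a single fixed critical value that does not depend on the buyer's own bid, treats a bid equal to the critical value as winning, and defines utility as the true value minus the price when the buyer wins and zero otherwise. The function names are hypothetical.

```python
def outcome(bid, critical_value):
    """Return (wins, price) under a critical-value rule."""
    wins = bid >= critical_value
    return wins, (critical_value if wins else 0.0)


def utility(true_value, bid, critical_value):
    """Buyer utility: true value minus price if it wins, zero otherwise."""
    wins, price = outcome(bid, critical_value)
    return (true_value - price) if wins else 0.0


def truthful_is_best(true_value, critical_value, deviations):
    """Check that bidding the true value is at least as good as each deviation."""
    honest = utility(true_value, true_value, critical_value)
    return all(honest >= utility(true_value, b, critical_value)
               for b in deviations)


grid = [x / 2 for x in range(0, 21)]  # candidate bids 0.0, 0.5, ..., 10.0
print(truthful_is_best(5.0, 4.0, grid), truthful_is_best(3.0, 4.0, grid))
```

Because the price paid never depends on the winner's own bid, overbidding can only add trades at a loss and underbidding can only forfeit profitable trades, which is the intuition behind the truthfulness claims for DADPM.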
We define Algorithm 4 using differential privacy method as DADPM. To call , we add an input parameter, valid price , and differential privacy-related parameters and .
To achieve differential privacy, we calculate the quality value and probability distribution for each candidate buyer and select a data buyer according to this distribution, with trading price as the trading price of .
We calculate the quality value for each candidate seller and obtain probability distribution , selecting sellers according to this distribution to provide data for . We use as the trading price of sellers. As in the normal trading method, we record and return , and Algorithm 4 adopts a data trading start pointer in the same way as Algorithm 3.
5.2. DADPM Analysis
We theoretically analyze DADPM.
Lemma 10. The DADPM scheme is individually rational.
Proof. The winning buyer is assigned trading price as valid price . From the calculation of the active buyers (Algorithm 1), So, . Therefore, DADPM is individually rational for buyers.
The trading price of the winning sellers in is assigned as valid price . From the calculation of the active sellers (Algorithm 1), . So, . Therefore, DADPM is individually rational for sellers.
With a proof similar to that of the DANTM scheme, we can conclude Lemma 11.
Lemma 11. The DADPM scheme is weakly budget-balanced.
Lemma 12. The DADPM scheme is truthful.
Proof. For buyers, we can prove the conclusion directly using Theorem 9.36 in , according to which, if there is a critical trading price for buyers, the scheme is truthful. From Algorithm 1, is obtained by sorting in descending order, so the elements in are monotone in . Also from Algorithm 1, is obtained by filtering buyers with . So, there exists a critical value .
Recall that Vazirani et al.  discussed the situation of a simple auction, with one seller and multiple buyers. We discuss a double auction, with multiple sellers and buyers. A buyer’s bid is a preference to choose a buyer, and an asking price is a preference to choose a seller. Then, if there is a critical value of a trading price for sellers, the scheme is truthful.
Based on the above discussion, we can prove that the DADPM scheme is truthful for sellers. From Algorithm 1, is obtained by sorting in ascending order, so the elements in are monotone in , and is obtained by filtering sellers with . So, there exists a critical value .
Theorem 13. The DADPM scheme is individually rational, truthful, and weakly budget-balanced.
Now, we prove that DADPM preserves the data buyer’s valuation privacy.
Theorem 14. For data buyers, DADPM preserves (, ) differential privacy for bidders’ quality values when .
Proof. We can prove our conclusions using a proof method similar to Theorem 8 in . Let and be vectors of a quality function that differ for a single bidder, and let and be corresponding bid vectors. We show that DADPM can preserve bid privacy even if the order of winning bidders is revealed. Assume we get an arbitrary sequence of winning buyers with length . The relative probability of obtaining the sequences for given vectors of quality function and is