Research Article  Open Access
Lijun Chen, Jiakui Zhao, Qun Huang, Liang Huai Yang, "Effective Space Usage Estimation for Sliding-Window Skybands", Mathematical Problems in Engineering, vol. 2010, Article ID 828035, 15 pages, 2010. https://doi.org/10.1155/2010/828035
Effective Space Usage Estimation for Sliding-Window Skybands
Abstract
A skyline query computes all the “best” elements, that is, those not dominated by any other element, and is thus very important for decision-making applications. Recently, it has been generalized to the skyband query: a k-skyband query returns the elements dominated by no more than k other elements. To incorporate the skyband operator into a stream engine for monitoring skybands over sliding windows, space usage estimation for the skyband operator becomes a critical issue in the query optimizer. In this paper, we first introduce the skyband sketch as the cost model. Based on the cost model, we propose an approach for estimating the space usage of the skyband operator over sliding windows of data streams under the assumptions of statistical independence across dimensions, no duplicate values over each dimension, and totally ordered dimension domains. Experiments verify that our approaches can estimate the space usage effectively over arbitrarily distributed data. To the best of our knowledge, this is the first work that attempts to address the issue and proposes effective approaches to solve it.
1. Introduction
Skyline queries [1] are very important for multicriteria decision-making applications, as they return all the “best” elements, which are not dominated by any other element. However, skyline queries may eliminate elements that are valuable but dominated by a few other elements, because the dimensions commonly cannot cover all of a user's considerations. Therefore, Papadias et al. [2] generalized the skyline to the skyband: a k-skyband query returns all the elements that are dominated by no more than k other elements.
Consider the common hotel example in the literature: each hotel has its distance from the beach and its price, and one prefers hotels that are cheap and close to the beach. Figure 1 demonstrates the difference between the skyline (the 0-skyband) and the 1-skyband. Three hotels are returned by the skyline query, and four additional hotels are returned by the 1-skyband query because each of them is dominated by only one other element. Buchta [3] showed that the expected number of the skyline elements in a d-dimensional space which contains n elements is $O((\ln n)^{d-1}/(d-1)!)$. Therefore, low-dimensional skyline queries commonly return a small number of skyline elements to the user, and some valuable elements may be eliminated; the reason is that each element has a high probability of being dominated by other elements in a low-dimensional space. Skyband queries can return the elements which are valuable but dominated by a few other elements and hence are widely used by decision-making applications in low-dimensional spaces.
Recently, the database research community witnessed a paradigm shift to continuous queries, and much attention has been paid to sliding-window skyline queries [4, 5] in the stream environment. However, the issue of space usage estimation, which is very important for extending the query optimizer's cost model to accommodate skyline queries in the stream engine, is still left untouched. In this paper, we propose some effective approaches to estimate the space usage of sliding-window skyband queries. Since the skyline query is a special case of skyband queries, our proposed approaches can be naturally applied to sliding-window skyline queries as well.
Monitoring sliding-window skybands requires extracting all skyband elements from the live elements in the window and continuously reporting skyband changes as the window slides. In this paper, we first introduce the skyband sketch as the cost model and present effective policies for the sketch maintenance. The skyband sketch is space efficient because it stores only the skyband elements along with the potential-skyband elements, which do not currently belong to the skyband but are not guaranteed to be excluded from the skyband in their remaining lifespan. Next, under the assumptions of statistical independence across dimensions, which is commonly used by query optimizers, no duplicate values over each dimension, and totally ordered domains, we propose an approach for estimating the space usage of monitoring skybands over sliding windows. An experimental study verifies that our approaches can estimate the space usage effectively over arbitrarily distributed data. To the best of our knowledge, this is the first work that attempts to address the issue of space estimation and proposes effective approaches to solve it.
The rest of this paper is organized as follows. Section 2 summarizes the related work; Section 3 introduces some preliminary knowledge; Section 4 details our approaches for estimating the space usage; experimental results are given in Section 5 and followed by our conclusions in Section 6.
2. Related Work
Many algorithms have been proposed for computing static skylines, including the non-index-based algorithms [1, 6, 7] and the index-based algorithms [8–10], where the index-based algorithms uniformly outperform the non-index-based algorithms. Skyline computation under certain conditions has also received much attention, including skyline computation with partially ordered domains [11] and low-cardinality domains [12], subspace skyline computation [13, 14], skyline cube maintenance [15–19], and skyline computation in the distributed environment [14, 20–23]. Some skyline variations have also been proposed, including the k-dominant skyline [24], the top subspace skyline [25], the reverse skyline [26], the k most representative skyline [27], the probabilistic skyline [28], and the skyband [2].
Under the assumptions of statistical independence across dimensions, no duplicate values over each dimension, and dimension domains being all totally ordered, the problem of estimating the number of the skyline elements, that is, the skyline cardinality, has been addressed in the works [3, 29, 30]. Chaudhuri et al. [31] relaxed the assumption of no duplicate values over each dimension by allowing two possible values (e.g., 0 and 1).
As stated before, continuous skyline queries over sliding windows in data streams [4, 5] have important applications such as environment monitoring and trend sensing. To accommodate the skyline operator in the stream processing engine, the issue of space usage estimation needs to be solved. Motivated by this need, and under similar assumptions, we propose robust approaches to estimate the number of the skyband and potential-skyband elements over continuously distributed data.
3. Preliminaries
In this section, we present some preliminary results that will be used in the next section. In addition, we describe a data structure called the skyband sketch. Theorem 3.1 characterizes the number of the elements in a finite set which satisfy exactly m of the properties. It is based on the generalized form of the Inclusion-Exclusion Principle [32]. Similarly, Theorem 3.2 characterizes the number of the elements in a finite set which satisfy no more than k of the properties; the theorem will be used for our theoretical analysis of the space usage in the next section.
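Theorem 3.1. Suppose that $S$ is a finite set, $P_1, P_2, \ldots, P_n$ are properties, and $A_1, A_2, \ldots, A_n$ are subsets of $S$, where $A_i$ consists of all those elements in $S$ with property $P_i$. Let $N(m)$ be the number of the elements in $S$ which satisfy exactly $m$ of the properties; it can be characterized as
$$N(m) = \sum_{j=m}^{n} (-1)^{j-m} \binom{j}{m} W(j),$$
where $W(j)$ is characterized as follows:
$$W(0) = |S|, \qquad W(j) = \sum_{1 \le i_1 < i_2 < \cdots < i_j \le n} \left| A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_j} \right| \quad (1 \le j \le n).$$
Theorem 3.2. Suppose that $S$ is a finite set, $P_1, P_2, \ldots, P_n$ are properties, and $A_1, A_2, \ldots, A_n$ are subsets of $S$, where $A_i$ consists of all the elements in $S$ which satisfy $P_i$; the number of the elements in $S$ which satisfy no more than $k$ of the properties, that is, $\sum_{m=0}^{k} N(m)$, can be characterized as
$$\sum_{m=0}^{k} N(m) = \sum_{m=0}^{k} \sum_{j=m}^{n} (-1)^{j-m} \binom{j}{m} W(j),$$
where $W(j)$ is the same as that in Theorem 3.1.
Proof. By Theorem 3.1, each $N(m)$ can be characterized as
$$N(m) = \sum_{j=m}^{n} (-1)^{j-m} \binom{j}{m} W(j),$$
and summing over $m = 0, 1, \ldots, k$ gives the stated expression. We have thus proved the theorem.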
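As a sanity check of the two counting theorems, the following is a minimal sketch that evaluates the generalized inclusion-exclusion sums on a small set and compares them with a direct count. The function names (`w`, `exactly`, `at_most`) and the divisibility properties are illustrative assumptions, not part of the paper.

```python
from itertools import combinations
from math import comb


def w(subsets, universe, j):
    """Sum of the sizes of all j-fold intersections; the universe size for j = 0."""
    if j == 0:
        return len(universe)
    return sum(len(set.intersection(*combo)) for combo in combinations(subsets, j))


def exactly(subsets, universe, m):
    """Number of elements with exactly m properties (generalized inclusion-exclusion)."""
    n = len(subsets)
    return sum((-1) ** (j - m) * comb(j, m) * w(subsets, universe, j)
               for j in range(m, n + 1))


def at_most(subsets, universe, k):
    """Number of elements with no more than k properties."""
    return sum(exactly(subsets, universe, m)
               for m in range(0, min(k, len(subsets)) + 1))


# Hypothetical example: numbers 0..11 with the properties
# "divisible by 2", "divisible by 3", and "divisible by 4".
universe = set(range(12))
subsets = [{x for x in universe if x % p == 0} for p in (2, 3, 4)]

# Direct count of how many properties each element satisfies, for comparison.
direct = [sum(x in a for a in subsets) for x in universe]
for m in range(4):
    assert exactly(subsets, universe, m) == direct.count(m)
```

The direct count is only feasible for tiny sets; the point of the sketch is that the alternating sums agree with it term by term.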
In a d-dimensional space, for simplicity and without loss of generality, an element $e_1$ is said to dominate another element $e_2$ if $e_1$ is smaller than or equal to $e_2$ over each dimension and strictly smaller than $e_2$ over at least one dimension; this is denoted as $e_1 \prec e_2$. In a sliding window, if no more than k other live elements dominate an element, the element is a k-skyband element; if an element is not a skyband element and no more than k of the succeeding elements dominate it, the element is a potential-skyband element.
Now we are able to describe a data structure called the skyband sketch, which keeps the skyband elements and the potential-skyband elements. The skyband sketch is a memory-resident synopsis. The potential-skyband elements are the elements which do not belong to the skyband currently but are not guaranteed to be excluded from the skyband in their remaining lifespan. Hence the skyband sketch is space efficient for monitoring skybands over sliding windows. The space usage in this paper is measured by the numbers of the skyband and the potential-skyband elements stored by the sketch.
Figure 2 shows the architecture of the skyband sketch; the sketch changes only when a new element arrives or a current skyband element expires. When a new element arrives, if no more than k skyband elements dominate it, it is treated as a skyband element; otherwise, it is a potential-skyband element. If the new element is a skyband element, all the skyband elements which are dominated by more than k succeeding skyband elements and all the potential-skyband elements which are dominated by more than k succeeding skyband and potential-skyband elements should be deleted, because they will remain dominated by more than k succeeding elements during their remaining lifespan; in addition, the skyband elements which are dominated by no more than k succeeding skyband elements but by more than k live skyband elements become potential-skyband elements. If the new element is a potential-skyband element, all potential-skyband elements which are dominated by more than k succeeding skyband and potential-skyband elements should be deleted. When a skyband element expires, all the potential-skyband elements which are dominated by no more than k skyband and potential-skyband elements become skyband elements. In this paper, since we focus on the problem of space usage estimation, we leave out the detailed implementation issues of the skyband algorithm.
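The dominance test and the k-skyband definition can be illustrated with a brute-force sketch. The hotel coordinates below are hypothetical (distance, price) pairs chosen for illustration; they are not the hotels of Figure 1.

```python
def dominates(a, b):
    """True if a is <= b in every dimension and strictly smaller in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def k_skyband(points, k):
    """Elements dominated by no more than k other elements (quadratic brute force)."""
    result = []
    for i, p in enumerate(points):
        dominators = sum(1 for j, q in enumerate(points) if j != i and dominates(q, p))
        if dominators <= k:
            result.append(p)
    return result


# Hypothetical hotels as (distance to beach, price); smaller is better in both.
hotels = [(1, 9), (2, 5), (4, 2), (3, 6), (5, 3), (6, 8)]

skyline = k_skyband(hotels, 0)        # the 0-skyband
one_skyband = k_skyband(hotels, 1)    # also admits elements with one dominator
```

Here the skyline contains the first three hotels, and the 1-skyband additionally admits (3, 6) and (5, 3), each of which has exactly one dominator; (6, 8) has four dominators and is excluded from both.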
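The content of the sketch at a given instant can be illustrated by recomputing it from scratch. This is only a sketch of the definitions, assuming the window is a list in arrival order (oldest first); it is not the incremental maintenance policy described above, and `classify_window` is a hypothetical helper name.

```python
def dominates(a, b):
    """True if a is <= b in every dimension and strictly smaller in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def classify_window(window, k):
    """Split the live elements into skyband and potential-skyband elements.

    window is in arrival order (oldest first). Elements dominated by more than
    k succeeding elements are in neither list, so the sketch need not store them.
    """
    skyband, potential = [], []
    for i, p in enumerate(window):
        live = sum(1 for j, q in enumerate(window) if j != i and dominates(q, p))
        later = sum(1 for q in window[i + 1:] if dominates(q, p))
        if live <= k:
            skyband.append(p)
        elif later <= k:
            potential.append(p)
    return skyband, potential


window = [(1, 1), (3, 3), (2, 2), (5, 5), (4, 4)]
before = classify_window(window, 0)
# When the oldest element (1, 1) expires, (2, 2) is promoted to the skyline.
after = classify_window(window[1:], 0)
```

The element (3, 3) is never stored for k = 0 because the later arrival (2, 2) dominates it and will outlive it.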
4. Space Usage Estimation
In this section, we present our robust approaches for estimating the space usage of sliding-window skybands under the assumption of statistical independence across dimensions, based on the preliminary results in the previous section.
4.1. Distribution-Constrained Data
Here, we give our theoretical analysis for the space usage of sliding-window skybands over data which is distribution constrained, that is, there are no duplicate values over each dimension. By mapping the problem of evaluating the number of the elements in a finite set which satisfy no more than k of the properties to the problem of evaluating the probability that no more than k other elements dominate an element, Lemma 4.1 gives the probability that at most k other elements in a d-dimensional space dominate an element. Based on Lemma 4.1, Theorem 4.2 gives the expected number of the k-skyband elements in a sliding window which contains n d-dimensional live elements.
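Lemma 4.1. Suppose that $e_1, e_2, \ldots, e_n$ are elements in a d-dimensional space, under the assumptions of statistical independence across dimensions, no duplicate values over each dimension, and totally ordered data domains; let $B_k$ be the event that no more than $k$ of the other elements dominate a given element $e_i$; then the probability of $B_k$, that is, $\Pr(B_k)$, can be characterized as
$$\Pr(B_k) = \sum_{m=0}^{k} \sum_{j=m}^{n-1} (-1)^{j-m} \binom{j}{m} \binom{n-1}{j} \frac{1}{(j+1)^{d}}.$$
Proof. We map the finite set, the properties, and the subsets in Theorem 3.2 to the full probability space, the statements that each of the other $n-1$ elements dominates $e_i$, and the corresponding events, respectively; the sum over all $j$-fold intersections is mapped to the quantity $W(j)$, which can be characterized as
$$W(j) = \sum_{\{i_1, \ldots, i_j\} \subseteq \{1, \ldots, n\} \setminus \{i\}} \Pr\left(e_{i_1} \prec e_i, \ldots, e_{i_j} \prec e_i\right).$$
Under the assumptions of statistical independence across dimensions, no duplicate values over each dimension, and totally ordered domains, an element has a probability of $1/(j+1)^{d}$ of being dominated by all of $j$ given other elements, since it must be the largest of the $j+1$ elements over each of the $d$ dimensions; therefore, $W(j)$ can be further characterized as
$$W(j) = \binom{n-1}{j} \frac{1}{(j+1)^{d}}.$$
By Theorem 3.2, $\Pr(B_k)$ can be characterized as
$$\Pr(B_k) = \sum_{m=0}^{k} \sum_{j=m}^{n-1} (-1)^{j-m} \binom{j}{m} \binom{n-1}{j} \frac{1}{(j+1)^{d}}.$$
We have thus proved the lemma.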
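The probability in Lemma 4.1 can be evaluated exactly for small inputs. The following is a minimal sketch, assuming the inclusion-exclusion form in which a fixed set of j other elements all dominate a given element with probability 1/(j+1)^d; the function name is hypothetical, and exact rational arithmetic is used only to avoid rounding in the alternating sum.

```python
from fractions import Fraction
from math import comb


def prob_at_most_k_dominators(n, d, k):
    """Probability that at most k of the other n - 1 elements dominate a given element.

    Assumes independent dimensions, no duplicate values, and totally ordered domains.
    """
    total = Fraction(0)
    for m in range(0, min(k, n - 1) + 1):
        for j in range(m, n):
            sign = (-1) ** (j - m)
            total += sign * comb(j, m) * comb(n - 1, j) * Fraction(1, (j + 1) ** d)
    return total


# In one dimension the number of dominators is uniform on 0..n-1,
# so the probability of at most k dominators is (k + 1) / n.
p_1d = prob_at_most_k_dominators(5, 1, 2)

# Expected 0-skyband (skyline) size for two elements in two dimensions.
expected_skyline_2 = 2 * prob_at_most_k_dominators(2, 2, 0)
```

Multiplying the probability by the number of live elements gives the expected skyband size; for two elements in two dimensions this yields 3/2, which agrees with the fact that the second element is dominated by the first with probability 1/4 and vice versa.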
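Theorem 4.2. Suppose that there are $n$ d-dimensional live elements in a sliding window; under the assumptions of statistical independence across dimensions, no duplicate values over each dimension, and totally ordered dimension domains, the expected number of the k-skyband elements, that is, $E_{d,k}(n)$, can be directly characterized as
$$E_{d,k}(n) = n \sum_{m=0}^{k} \sum_{j=m}^{n-1} (-1)^{j-m} \binom{j}{m} \binom{n-1}{j} \frac{1}{(j+1)^{d}} \tag{4.5}$$
and can also be characterized recursively by the recurrence (4.6) together with its initial conditions.
Proof. By Lemma 4.1, each of the $n$ live elements is a k-skyband element with the probability given by the lemma, so $E_{d,k}(n)$ equals $n$ times that probability, which is (4.5). $E_{d,k}(n)$ can further be characterized recursively, with the corresponding initial conditions. We have thus proved the theorem.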
Theorem 4.3 shows that there is an inherent correlation between the expected number of the skyband elements when monitoring a (d+1)-dimensional k-skyband over a sliding window which contains n elements and the expected number of the elements stored by the skyband sketch when monitoring a d-dimensional k-skyband over a sliding window which contains n elements: the two are equal. In addition, the expected number of the potential-skyband elements when monitoring a d-dimensional k-skyband over a sliding window which contains n elements equals the expected number of the elements stored by the sketch minus the expected number of the skyband elements. Therefore, by a minor revision, Theorem 4.2 can also be used to characterize the expected number of the potential-skyband elements.
Theorem 4.3. Under the assumptions of statistical independence across dimensions, no duplicate values over each dimension, and totally ordered domains, the expected number of the skyband elements when monitoring a (d+1)-dimensional k-skyband over a sliding window which contains n live elements equals the expected number of the elements stored by the skyband sketch when monitoring a d-dimensional k-skyband over a sliding window which contains n live elements.
Proof. By Lemma 4.1, the expected number of the k-skyband elements among n elements in a (d+1)-dimensional space is n times the probability given by the lemma with d replaced by d+1. To see why the theorem holds, suppose $e_1, e_2, \ldots, e_n$ are the live elements in the sliding window, ordered ascendingly by the element sequence number, and let the elements stored by the skyband sketch for monitoring a d-dimensional k-skyband over the sliding window be a subsequence of them. We map each live element into a (d+1)-dimensional element whose additional coordinate is determined by its sequence number, such that a later element has a smaller value; an element is then dominated in the (d+1)-dimensional space exactly by the succeeding elements which dominate it in the d-dimensional space, so the elements stored by the sketch are just the k-skyband elements in the (d+1)-dimensional space.
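The mapping used in the proof can be checked on concrete data. The sketch below assumes the additional coordinate is the negated sequence number, so that later arrivals are smaller in that coordinate; this particular encoding and the helper names are assumptions made for illustration. Each dimension is drawn as a random permutation so that no duplicate values occur.

```python
import random


def dominates(a, b):
    """True if a is <= b in every dimension and strictly smaller in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def sketch_contents(window, k):
    """Elements dominated by no more than k succeeding elements (window in arrival order)."""
    return [p for i, p in enumerate(window)
            if sum(1 for q in window[i + 1:] if dominates(q, p)) <= k]


def lifted_skyband(window, k):
    """k-skyband of the window after appending the negated sequence number."""
    lifted = [p + (-seq,) for seq, p in enumerate(window)]
    keep = []
    for i, lp in enumerate(lifted):
        doms = sum(1 for j, lq in enumerate(lifted) if j != i and dominates(lq, lp))
        if doms <= k:
            keep.append(lp[:-1])  # drop the extra coordinate again
    return keep


def random_window(n, d, seed):
    """n elements in d dimensions with no duplicate values over any dimension."""
    rng = random.Random(seed)
    columns = [rng.sample(range(n), n) for _ in range(d)]
    return [tuple(col[i] for col in columns) for i in range(n)]


window = random_window(40, 3, seed=7)
same = all(sketch_contents(window, k) == lifted_skyband(window, k) for k in range(4))
```

The two sets coincide element for element, not only in expectation, which is why the expected sketch size can be read off from the skyband estimate with one more dimension.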
4.2. A Dynamic Programming Algorithm
In this subsection, based on the theoretical analysis in the above subsection, we propose an efficient dynamic programming algorithm to estimate the space usage. Since there are inherent correlations among the expected number of the skyband elements, the expected number of the potential-skyband elements, and the expected number of the elements stored by the skyband sketch, we only consider how to estimate the number of the skyband elements.
Estimating the number of the skyband elements using (4.5) is infeasible in most cases because combination numbers are used to characterize the expected number of the skyband elements; for example, the number of different ways of selecting 50 elements from 100 different elements cannot be stored in a 64-bit integer. Based on (4.6), we can design a recursive algorithm to estimate the number of the skyband elements, which does not encounter integer overflow. The recursive algorithm can be characterized by a binary tree whose depth grows with the parameters of Theorem 4.2. Therefore, estimating the number of the skyband elements using the recursive algorithm has exponential computational complexity, which is unacceptable in most cases. However, there is a large amount of duplicate computation in the binary tree; if the duplicate computations are eliminated, the computational complexity can be reduced. Algorithm 1 is a nonrecursive algorithm for estimating the number of the skyband elements, which is based on (4.6) and eliminates all the duplicate computations. The algorithm is a dynamic programming algorithm [33]: although it is based on a recurrence, it is nonrecursive, and each step of the algorithm gives an exact answer for the corresponding subproblem.
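The overflow remark can be verified directly. The snippet below relies on Python's arbitrary-precision integers to compare the binomial coefficient with the largest unsigned 64-bit value; the constant name is ours.

```python
from math import comb

UINT64_MAX = 2 ** 64 - 1

# Number of ways of selecting 50 elements from 100 different elements.
c_100_50 = comb(100, 50)

overflows_64_bit = c_100_50 > UINT64_MAX
bits_needed = c_100_50.bit_length()
```

In a language with fixed-width integers, such as C++, the same coefficient would have to be held in a wider or floating-point type, which is one reason to prefer a recurrence that never forms the binomial coefficients explicitly.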

Algorithm 1 functions as follows. First, two vectors are created, and the first vector is initialized with the values given by the initial conditions. Then the entries of the second vector are evaluated and stored one after another: the first entry is set from the initial conditions, and each subsequent entry is obtained from the recurrence using the preceding entry of the same vector and the corresponding entry of the first vector. The evaluation continues in the same way, with the two vectors alternating roles, until the target value has been evaluated and stored in one of the two vectors. At last, that value is returned as the expected number of the skyband elements. The algorithm is space and time efficient, because it keeps only two vectors at any time and evaluates each subproblem exactly once.
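A two-vector dynamic program of this kind can be sketched as follows. The recurrence used here is an assumption on our part and may differ in form from (4.6): conditioning on an element's rank in the last dimension gives E_d(n) = E_d(n-1) + E_{d-1}(n)/n for a fixed k, with E_1(n) = min(n, k+1) and E_d(0) = 0. The function names are hypothetical.

```python
def expected_skyband_size(n, d, k):
    """Expected k-skyband size of n elements in d dimensions.

    Assumed recurrence (for fixed k):
        E_1(i) = min(i, k + 1),  E_d(0) = 0,
        E_d(i) = E_d(i - 1) + E_{d-1}(i) / i.
    Only two vectors of length n + 1 are alive at any time.
    """
    # prev[i] holds the value for the previous dimensionality and window size i.
    prev = [float(min(i, k + 1)) for i in range(n + 1)]
    for _ in range(2, d + 1):
        cur = [0.0] * (n + 1)
        for i in range(1, n + 1):
            cur[i] = cur[i - 1] + prev[i] / i
        prev = cur
    return prev[n]


def expected_sketch_size(n, d, k):
    """Expected number of stored elements, using the (d+1)-dimensional equivalence."""
    return expected_skyband_size(n, d + 1, k)


def expected_potential_size(n, d, k):
    """Expected number of potential-skyband elements (stored but not in the skyband)."""
    return expected_sketch_size(n, d, k) - expected_skyband_size(n, d, k)


estimate = expected_skyband_size(1000, 4, 2)
```

Under this assumed recurrence the running time is proportional to the product of the dimensionality and the window size, and no binomial coefficient is ever formed, so the overflow problem of the direct formula does not arise.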
5. Experiments
In this section, we verify our theoretical results on space usage estimation of the k-skyband operator monitoring skybands over sliding windows in the stream environment by extensive experiments. The algorithms have been implemented in the C++ programming language and run on a 2.0 GHz Intel CPU with 2 GB of memory, and the data over each dimension is generated with the GNU Scientific Library (GSL: http://www.gnu.org/software/gsl). We test the space performance in a lower-dimensional (4-dimensional) and a higher-dimensional (8-dimensional) space, respectively. According to probability theory, if the data over a dimension is continuously distributed, the probability that there are duplicate values over the dimension is zero. Therefore, for each space, we generate a dataset in which the data over the first dimension is normally distributed, and the data over the other dimensions is normally distributed with its own parameters. The sliding-window size increases from 500 to 1000 in steps of 50; for each step, we compute the maximal, average, and minimal skyband sketch size, number of the skyband elements, and number of the potential-skyband elements while the sliding window moves over one million elements. Since there is no previous work that evaluates the space usage over continuous data, we compare our theoretical results with the corresponding experimental results.
Figures 3 and 4 compare the experimental results with the theoretical results for the 4-dimensional space and the 8-dimensional space. We can see that the experimental results are almost the same as the theory predicts. Moreover, the maximal values are less than twice the minimal values, and they are all close to the theoretical results. For a given parameter k (k-skyband) and dimensionality d, both the actual space usage and the estimated space usage increase with the window size, as more objects need to be evaluated. At the same time, the skyband cardinality also increases when the value of the parameter k increases. The comparison between the 4-dimensional space and the 8-dimensional space, as Figures 3(a) and 4(e) show, illustrates that the skyband sketch size in the high-dimensional space is much larger than that in the low-dimensional space when the window size and the parameter k are given. This is because fewer elements are likely to be dominated by other objects in a high-dimensional space than in a low-dimensional space. As there are sufficient skyline elements for users to make a decision in the higher-dimensional space, the skyband query is most useful in low-dimensional spaces.
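A scaled-down version of such an experiment can be sketched as follows. This is not the authors' C++/GSL setup: it uses Python's standard `random.gauss`, recomputes the sketch by brute force for every window position, and assumes a mean of 0 and a standard deviation of 1 for every dimension as well as much smaller window and stream sizes.

```python
import random


def dominates(a, b):
    """True if a is <= b in every dimension and strictly smaller in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def simulate(window_size, d, k, stream_len, seed=0):
    """Average skyband size and sketch size while a window slides over a random stream."""
    rng = random.Random(seed)
    # Assumed distribution: standard normal data, independent across dimensions.
    stream = [tuple(rng.gauss(0.0, 1.0) for _ in range(d)) for _ in range(stream_len)]
    skyband_sizes, sketch_sizes = [], []
    for end in range(window_size, stream_len + 1):
        window = stream[end - window_size:end]  # oldest element first
        skyband = sketch = 0
        for i, p in enumerate(window):
            live = sum(1 for j, q in enumerate(window) if j != i and dominates(q, p))
            later = sum(1 for q in window[i + 1:] if dominates(q, p))
            skyband += live <= k   # k-skyband element
            sketch += later <= k   # stored by the sketch
        skyband_sizes.append(skyband)
        sketch_sizes.append(sketch)
    positions = len(skyband_sizes)
    return sum(skyband_sizes) / positions, sum(sketch_sizes) / positions


avg_skyband, avg_sketch = simulate(window_size=20, d=2, k=1, stream_len=100)
```

The averages returned by `simulate` are the quantities that the theoretical estimates are meant to predict; with a window this small they fluctuate noticeably from seed to seed.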
Figure 3: Theoretical and experimental results in the 4-dimensional space. Panels: (a) theoretical results; (b) to (d) experimental results; (e) theoretical results; (f) to (h) experimental results.
Figure 4: Theoretical and experimental results in the 8-dimensional space. Panels: (a) theoretical results; (b) to (d) experimental results; (e) theoretical results; (f) to (h) experimental results.
6. Conclusions and Discussions
The skyband query is of great importance for multicriteria decision-making applications. To support the skyband query in the stream engine, the problem of effective space usage estimation must be solved, which is important for extending the query optimizer's cost model. In this paper, under the assumptions of statistical independence [34, 35] across dimensions, no duplicate values over each dimension, and totally ordered dimension domains, we propose effective methods to address this issue; since the skyline query is just a special case of skyband queries, our approaches apply to sliding-window skyline queries as well. We also put forward a dynamic programming algorithm to estimate the space usage, which is space and time efficient. In addition, provided that the distribution function is given, a similar approach can be used to evaluate the skyband cardinality over a space in which there are duplicate values over some dimensions. Finally, we carried out extensive experiments which verified that our proposed approaches can estimate the space usage accurately and hence can be used to extend the optimizer's cost model for incorporating the skyband operator.
Acknowledgments
This work is partially supported by the China “863” Hi-tech Program (Grant no. 2007AA01Z153), Zhejiang Provincial NSF (Grant no. Y1090096), and the National Natural Science Foundation of China (NSFC) under Grants no. 60573125 and 60873264.
References
[1] S. Börzsönyi, D. Kossmann, and K. Stocker, “The skyline operator,” in Proceedings of the International Conference on Data Engineering (ICDE '01), pp. 421–430, 2001.
[2] D. Papadias, Y. Tao, G. Fu, and B. Seeger, “Progressive skyline computation in database systems,” ACM Transactions on Database Systems, vol. 30, no. 1, pp. 41–82, 2005.
[3] C. Buchta, “On the average number of maxima in a set of vectors,” Information Processing Letters, vol. 33, no. 2, pp. 63–65, 1989.
[4] X. Lin, Y. Yuan, W. Wang, and H. Lu, “Stabbing the sky: efficient skyline computation over sliding windows,” in Proceedings of the International Conference on Data Engineering (ICDE '05), pp. 502–513, 2005.
[5] Y. Tao and D. Papadias, “Maintaining sliding window skylines on data streams,” IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 3, pp. 377–391, 2006.
[6] J. Chomicki, P. Godfrey, J. Gryz, and D. Liang, “Skyline with presorting,” in Proceedings of the International Conference on Data Engineering (ICDE '03), pp. 717–719, 2003.
[7] P. Godfrey, R. Shipley, and J. Gryz, “Maximal vector computation in large data sets,” in Proceedings of the 31st International Conference on Very Large Data Bases (VLDB '05), vol. 1, pp. 229–240, 2005.
[8] D. Kossmann, F. Ramsak, and S. Rost, “Shooting stars in the sky: an online algorithm for skyline queries,” in Proceedings of the International Conference on Very Large Data Bases (VLDB '02), pp. 275–286, 2002.
[9] D. Papadias, Y. Tao, G. Fu, and B. Seeger, “An optimal and progressive algorithm for skyline queries,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '03), pp. 467–478, 2003.
[10] K.-L. Tan, P.-K. Eng, and B. C. Ooi, “Efficient progressive skyline computation,” in Proceedings of the International Conference on Very Large Data Bases (VLDB '01), pp. 301–310, 2001.
[11] C.-Y. Chan, P.-K. Eng, and K.-L. Tan, “Efficient processing of skyline queries with partially-ordered domains,” in Proceedings of the International Conference on Data Engineering (ICDE '05), pp. 190–191, 2005.
[12] M. Morse, J. M. Patel, and H. V. Jagadish, “Efficient skyline computation over low-cardinality domains,” in Proceedings of the International Conference on Very Large Data Bases (VLDB '07), pp. 267–278, 2007.
[13] Y. Tao, K. Xiao, and J. Pei, “SUBSKY: efficient computation of skylines in subspaces,” in Proceedings of the International Conference on Data Engineering (ICDE '06), p. 65, 2006.
[14] A. Vlachou, C. Doulkeridis, Y. Kotidis, and M. Vazirgiannis, “SKYPEER: efficient subspace skyline computation over distributed data,” in Proceedings of the International Conference on Data Engineering (ICDE '07), pp. 416–425, 2007.
[15] C. Li, B. C. Ooi, A. K. H. Tung, and S. Wang, “DADA: a data cube for dominant relationship analysis,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '06), pp. 659–670, 2006.
[16] J. Pei, A. W.-C. Fu, X. Lin, and H. Wang, “Computing compressed multidimensional skyline cubes efficiently,” in Proceedings of the International Conference on Data Engineering (ICDE '07), pp. 96–105, 2007.
[17] J. Pei, W. Jin, M. Ester, and Y. Tao, “Catching the best views of skyline: a semantic approach based on decisive subspaces,” in Proceedings of the 31st International Conference on Very Large Data Bases (VLDB '05), vol. 1, pp. 253–264, 2005.
[18] T. Xia and D. Zhang, “Refreshing the sky: the compressed skycube with efficient support for frequent updates,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '06), pp. 491–502, 2006.
[19] Y. Yuan, X. Lin, Q. Liu, W. Wang, J. X. Yu, and Q. Zhang, “Efficient computation of the skyline cube,” in Proceedings of the 31st International Conference on Very Large Data Bases (VLDB '05), vol. 1, pp. 241–252, 2005.
[20] W.-T. Balke, U. Güntzer, and J. X. Zheng, “Efficient distributed skylining for web information systems,” in Proceedings of the 9th International Conference on Extending Database Technology (EDBT '04), pp. 256–273, 2004.
[21] E. Lo, K. Y. Yip, K.-I. Lin, and D. W. Cheung, “Progressive skylining over Web-accessible databases,” Data and Knowledge Engineering, vol. 57, no. 2, pp. 122–147, 2006.
[22] S. Wang, B. C. Ooi, A. K. H. Tung, and L. Xu, “Efficient skyline query processing on peer-to-peer networks,” in Proceedings of the International Conference on Data Engineering (ICDE '07), pp. 1126–1135, 2007.
[23] P. Wu, C. Zhang, Y. Feng, B. Y. Zhao, D. Agrawal, and A. El Abbadi, “Parallelizing skyline queries for scalable distribution,” in Proceedings of the 10th International Conference on Extending Database Technology (EDBT '06), pp. 112–130, 2006.
[24] C.-Y. Chan, H. V. Jagadish, K.-L. Tan, A. K. H. Tung, and Z. Zhang, “Finding k-dominant skylines in high dimensional space,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '06), pp. 503–514, 2006.
[25] C.-Y. Chan, H. V. Jagadish, K.-L. Tan, A. K. H. Tung, and Z. Zhang, “On high dimensional skylines,” in Proceedings of the 10th International Conference on Extending Database Technology (EDBT '06), pp. 478–495, 2006.
[26] E. Dellis and B. Seeger, “Efficient computation of reverse skyline queries,” in Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB '07), pp. 291–302, 2007.
[27] X. Lin, Y. Yuan, Q. Zhang, and Y. Zhang, “Selecting stars: the k most representative skyline operator,” in Proceedings of the International Conference on Data Engineering (ICDE '07), pp. 86–95, 2007.
[28] J. Pei, B. Jiang, X. Lin, and Y. Yuan, “Probabilistic skylines on uncertain data,” in Proceedings of the International Conference on Very Large Data Bases (VLDB '07), pp. 15–26, 2007.
[29] J. L. Bentley, H. T. Kung, M. Schkolnick, and C. D. Thompson, “On the average number of maxima in a set of vectors and applications,” Journal of the Association for Computing Machinery, vol. 25, no. 4, pp. 536–543, 1978.
[30] P. Godfrey, “Skyline cardinality for relational processing: how many vectors are maximal?” in Proceedings of the 3rd International Symposium on Foundations of Information and Knowledge Systems (FoIKS '04), pp. 78–97, 2004.
[31] S. Chaudhuri, N. Dalvi, and R. Kaushik, “Robust cardinality and cost estimation for skyline operator,” in Proceedings of the International Conference on Data Engineering (ICDE '06), p. 64, 2006.
[32] K. H. Rosen, Discrete Mathematics and Its Applications, WCB/McGraw-Hill, Boston, Mass, USA, 4th edition, 1999.
[33] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, MIT Press, Cambridge, Mass, USA, 2nd edition, 2001.
[34] M. Li, “Fractal time series: a tutorial review,” Mathematical Problems in Engineering, vol. 2010, Article ID 157264, 26 pages, 2010.
[35] M. Li and W. Zhao, “Representation of a stochastic traffic bound,” IEEE Transactions on Parallel and Distributed Systems, 2009.
Copyright
Copyright © 2010 Lijun Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.