Abstract

As an important component of China's transportation data sharing system, high-speed railway data sharing is a typical application of data-intensive computing. Currently, most high-speed railway data is shared in a cloud computing environment, so there is an urgent need for an effective cloud-computing-based data placement strategy for high-speed railway data. In this paper, a new data placement strategy with a hierarchical structure is proposed. The proposed method combines a semidefinite programming algorithm with a dynamic interval mapping algorithm. The semidefinite programming algorithm is suitable for placing files with various numbers of replicas, ensuring that different replicas of a file are placed on different storage devices, while the dynamic interval mapping algorithm ensures better self-adaptability of the data storage system. The hierarchical placement strategy is designed for large-scale networks. A theoretical analysis is provided, and experiments comparing the proposed strategy with several previous data placement approaches demonstrate its efficacy.

1. Introduction

With the development and popularity of information technology, the Internet is gradually growing into a variety of computing platforms. Cloud computing is a typical network computing mode that emphasizes the scalability and availability of running large-scale applications in a virtualized computing environment [1]. Large-scale network applications based on cloud computing exhibit distribution, heterogeneity, and a trend toward data intensity [2]. In the cloud computing environment, data storage and operation are provided as services [3]. There are various types of data, including common files, large binary files such as virtual machine images, formatted data such as XML, and relational data in databases. Thus, a distributed storage service for cloud computing has to take into account large-scale storage mechanisms for these various data types, as well as the performance, reliability, security, and simplicity of data operations. As an important component of China's transportation science data sharing system, high-speed railway data is the key to optimizing the operating organization. The high-speed railway data sharing system has the characteristics of a typical data-intensive application [4–6], for which the management of a large amount of distributed data is crucial. This is mainly reflected in the fact that the data it processes often reaches the TB or even PB level, including both existing input data sources and the intermediate and final results produced during processing.

When implementing and executing data-intensive applications in the cloud computing environment, and when establishing a large-scale storage system to meet the demand of rapidly growing data volumes, the main challenge is how to effectively distribute petabyte-scale data across hundreds of thousands of storage devices. Thus, an effective data placement algorithm is needed.

2. Goal of Designing High-Speed Railway Data Placement Strategy

The network storage system in the cloud computing environment consists of thousands or even tens of thousands of storage devices. Different systems have different underlying devices; for example, the storage device set could be block devices (disks) for SAN and GPFS, object storage devices (OSDs) for object storage systems such as Lustre and ActiveScale, or PCs for PVFS and P2P systems [7]. A data placement strategy mainly solves the problem of selecting storage devices for data storage. An effective mechanism must be adopted to establish the mapping relationship between data sets and storage device sets, after which the data sets generated by applications are placed on different storage device sets. Meanwhile, certain particular goals need to be met, and different data placement strategies are designed for different purposes. For example, the striping technology in RAID is mainly designed to acquire aggregated I/O bandwidth. The strategy of placing several replicas of the data onto different devices mainly serves fault tolerance and data reliability. Distributing data evenly achieves a more balanced I/O load.

The high-speed railway data placement strategy under cloud computing environment is designed to meet the following goals.

2.1. Fairness

The amount of data stored on each device is proportional to that device's storage volume [8].

2.2. Self-Adaptability

Over time, the set of storage devices and their volumes change dynamically; consider, for example, adding a new device or deleting an existing one. When the scale of the storage system changes, the data placement strategy must reorganize the data so that the distribution across the device sets satisfies the fairness criterion again. Furthermore, it must be ensured that the migrated data volume is close to the optimal migration volume, which reduces the overhead of data migration. The optimal volume to migrate is equal to the data volume that should be acquired by the added device, or to the data volume on the deleted device. The self-adaptability of a data placement strategy is measured by the ratio of its actual migrated data volume to the optimal migration volume, so a ratio of 1.0 represents the optimal condition.
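As an illustrative calculation (the figures are hypothetical, not from the paper's experiments): if a newly added device should ideally absorb 1.0 TB of data but the strategy actually moves 1.2 TB, then

$$\text{self-adaptability ratio} = \frac{V_{\text{migrated}}}{V_{\text{optimal}}} = \frac{1.2\ \text{TB}}{1.0\ \text{TB}} = 1.2,$$

whereas an optimally adaptive strategy would achieve a ratio of 1.0.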

2.3. Redundancy

Several replicas of the data are kept, or erasure codes are used, so that the data remains accessible when one replica is lost. In this way, fairness balances the I/O load, self-adaptability restores fairness when the storage scale changes, the volume of migrated data and the I/O bandwidth occupied are reduced, and, finally, data reliability is improved.

2.4. Availability

It is crucial that the system can be accessed normally in all cases; once the system is unavailable, none of its functions can perform normally. To improve system availability, it is necessary to regularly adjust data locations according to the availability of storage devices, thus maximizing the system's availability [9].

2.5. Reliability

Reliability indicates whether the system can be accessed normally over a given period of time. As a large-scale storage system contains thousands of storage devices, the probability of disk failure is rather high. When applying a data placement strategy, reliability indicators such as data size need to be incorporated into the design of the strategy's parameters, so that a storage system with higher reliability is obtained.

2.6. Space-Time Effectiveness

The data placement strategy should use little time and space to calculate data locations.

When designing the data placement strategy for large-scale network storage system, certain particular goals need to be met depending on different application demands. However, it is impossible to meet all goals at the same time.

3. Related Work

Some data management systems for the cloud computing environment have already emerged, for example, Google File System [10] and Hadoop [11, 12], both of which hide the infrastructure used to store application data from the user. Google File System is mainly used for the Web search application rather than for process applications in the cloud computing environment. Hadoop is a more widely used distributed file system, adopted by many companies including Amazon [13] and Facebook. When the Hadoop file system receives a file, it automatically splits the file into several chunks, each of which is placed randomly within a cluster. The Cumulus project [14] has proposed a cloud architecture for a single-data-center environment. However, the above-mentioned cloud data management systems did not study the data placement problem of data-intensive process applications in the cloud environment. Finally, let us look at several examples of existing popular large-scale data storage systems. Commercial Parallel File System (CPFS) [15, 16] divides a file into data chunks of the same size, which are stored on different disks of the file system in a round-robin fashion. The open-source Parallel Virtual File System (PVFS) [17] divides a file into strips and chunks and places the sliced data on multiple I/O nodes in rotation; the data slice size in PVFS is constant, and PVFS provides no fault tolerance for data. Panasas [18] is an object-oriented file system in which data is allocated to underlying smart object storage devices (OSDs) in units of objects [19]. A file is divided into strips, and each strip unit is stored on multiple OSDs in the form of objects. Upon initial placement, objects are distributed fairly across OSD devices using a random method.

PanFS, developed by the Panasas Company, is a Linux cluster file system based on object storage [20] and is the core part of the ActiveScale storage system. These file systems first divide a file into strips and then allocate each strip to the underlying smart OSDs in units of objects. The distribution of files across multiple OSDs is realized with a round-robin algorithm. The size of a data object is not fixed and can grow with the file size without modifying the metadata mapping table on the metadata server.

The object-oriented file system Lustre is a transparent global file system. Lustre treats a file as an object that is located by the metadata server, which then directs the actual file I/O request to the corresponding object storage targets (OSTs). Since metadata is separated from the stored data, computing resources can be fully separated from storage resources [21]. Thus, the client can focus on I/O requests from users and applications, while the OSTs and the metadata server focus on reading, transmitting, and writing data.

All storage nodes in the COSMOS parallel file system [22] are divided into several stripes, and each COSMOS file is stored within one stripe. The stripe length and the logical chunk length are related to the disk speed and to the applications' file access patterns. This type of data placement strategy offers high performance, suitability for large files, and a high degree of parallelism. Striped subfiles are saved directly on local disks as ordinary JFS files, so direct management of disks is avoided, at the cost of the extra overhead of entering the VFS/Vnode layer a second time.

4. Study and Analysis of Existing Data Placement Strategy

Here are some currently popular data placement algorithms. Standard hashing is the simplest homogeneous placement algorithm (homogeneous meaning that all storage devices have the same volume); it ensures fairness, but when the storage scale changes, the locations of almost all the data have to change as well.
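To make the rehashing problem concrete, here is a minimal Python sketch (illustrative only; the key names, hash choice, and device counts are not from the paper) showing how many objects change location under standard modulo hashing when one device is added:

import hashlib

def device_for(key: str, num_devices: int) -> int:
    # Standard (modulo) hashing: map an object key to a device index.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % num_devices

keys = [f"chunk-{i}" for i in range(10000)]
before = {k: device_for(k, 10) for k in keys}   # placement on 10 devices
after = {k: device_for(k, 11) for k in keys}    # after adding an 11th device
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.1%} of objects change location")  # roughly 10/11 of them

By contrast, an optimally adaptive strategy would move only about 1/11 of the objects, namely those destined for the new device.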

Consistent hashing [23] uses a hash function to map each device to a point on a continuum, and a second hash function to map the data evenly onto the same continuum. A data item is then allocated to the device represented by the node nearest to it. Since devices are not evenly distributed on the continuum, each device is virtualized into $k$ virtual nodes (where $k$ is a constant) to ensure the fairness of data allocation; the data volume of a device equals the total volume of data allocated to its virtual nodes. When a storage device is added to the system, only part of the data located on its neighboring nodes is migrated to that device. Consistent hashing therefore has a high degree of self-adaptability, and the mechanism occupies space proportional to the total number of virtual nodes.
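The following minimal Python sketch illustrates consistent hashing with virtual nodes as described above; the use of MD5, the number of virtual nodes per device, and the device names are illustrative assumptions, not details from [23]:

import bisect
import hashlib

def ring_hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, devices, vnodes_per_device=100):
        # Each physical device is virtualized into vnodes_per_device ring points.
        self.ring = sorted((ring_hash(f"{dev}#{i}"), dev)
                           for dev in devices
                           for i in range(vnodes_per_device))
        self.points = [p for p, _ in self.ring]

    def lookup(self, key: str):
        # A data item goes to the device owning the nearest clockwise virtual node.
        idx = bisect.bisect(self.points, ring_hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing([f"dev{i}" for i in range(10)])
print(ring.lookup("chunk-42"))

Adding a device only inserts its virtual nodes into the ring, so only the keys that fall just before those new points migrate to it.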

As a matter of fact, data storage in the cloud computing environment is heterogeneous, which means that there are great volume discrepancies between storage devices. Therefore, the consistent hashing algorithm is improved as follows: the virtual nodes on the continuum are allocated based on the weight of each device, and a device with greater weight covers more virtual nodes on the continuum. However, this approach introduces a large number of virtual nodes in a heterogeneous storage system with extremely significant weight discrepancies, which increases the space complexity of the algorithm.

To address the space overhead of consistent hashing, a segmentation method based on the unit interval has been proposed. In this method, the unit interval is divided into subintervals of identical length, and each device occupies one subinterval. When a device is added, part of the data on the other devices is migrated to the new device. When a device is deleted, the data on the last device is migrated equally to the remaining devices, the data on the device to be deleted is migrated to the last device's position, and the deleted device is then removed. In this way, fairness is preserved. During device addition, the migrated data volume equals the optimal migration volume; during device deletion, it is twice the optimal volume. Locating a specific data item requires more steps than with consistent hashing, but far less space is occupied. Compared with consistent hashing, this algorithm trades time for space; it is not suitable for storage systems with demanding requirements on data lookup speed, and its self-adaptability is not as high as that of consistent hashing.

To avoid the space waste caused by consistent hashing's introduction of virtual nodes, the linear method and the logarithm method have been proposed. In the linear method, a weight is likewise introduced for each device. Suppose $w_i$ denotes the weight of device $d_i$, and $dist(d_i, b)$ denotes the distance between the hash values of device $d_i$ and data item $b$. The linear method selects the device with the lowest value of $dist(d_i, b)/w_i$ to store data item $b$.

As the storage scale changes, the linear method guarantees that data is migrated only between the added or deleted device and the other devices; there is no data migration among the other devices themselves. The logarithm method instead seeks the device that minimizes a function of the form $-\ln h(d_i, b)/w_i$. In the absence of virtual nodes, the logarithm method achieves better fairness than the linear method, but it takes longer to locate data.
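A small Python sketch of the two selection rules as reconstructed above (the exact weighting functions of the original methods may differ; the hash construction and the weights are illustrative):

import hashlib
import math

def unit_hash(s: str) -> float:
    # Hash a string to a pseudo-random value strictly inside (0, 1).
    v = int(hashlib.md5(s.encode()).hexdigest(), 16)
    return (v % (2**53) + 1) / (2**53 + 2)

def ring_distance(a: float, b: float) -> float:
    d = abs(a - b)
    return min(d, 1.0 - d)          # distance on the unit circle

def linear_choice(weights: dict, key: str) -> str:
    # Smallest hash distance scaled by 1/weight wins.
    hk = unit_hash(key)
    return min(weights, key=lambda d: ring_distance(unit_hash(d), hk) / weights[d])

def log_choice(weights: dict, key: str) -> str:
    # Smallest -ln(hash)/weight wins (rendezvous-style weighting).
    return min(weights, key=lambda d: -math.log(unit_hash(f"{d}|{key}")) / weights[d])

weights = {"dev0": 1.0, "dev1": 2.0, "dev2": 4.0}
print(linear_choice(weights, "chunk-7"), log_choice(weights, "chunk-7"))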

As a result, a data object placement algorithm based on dynamic interval mapping has been proposed [22]. The unit interval is divided into multiple subintervals according to the devices' weights, and a mapping relationship between devices and subintervals is established. A data item is allocated to the device corresponding to the subinterval into which it falls. This approach offers good fairness and self-adaptability. However, if the number of storage devices is extremely large, then when a device is added or deleted the system must communicate with all other storage devices for data migration, which incurs tremendous overhead; moreover, the time needed to locate data also grows with the number of storage devices.

5. A Hierarchical Structure Based on Cloud Computing

With the expansion of network scale, the number of data storage devices keeps increasing, and existing data placement algorithms are insufficient to ensure the system's self-adaptability. Adding new devices or removing existing ones forces data to be placed anew, which increases data migration overhead and inevitably occupies I/O bandwidth [24, 25]. As a result, data reliability cannot be guaranteed, and the overhead of maintaining duplicate copies for reliability assurance becomes too large [26]. Thus, a data placement strategy based on a hierarchical structure is proposed in this paper to make up for the shortcomings of existing data placement algorithms, address the system's self-adaptability, guarantee data reliability, and improve the efficiency of data access.

In the proposed approach, each individual storage device is directly managed through a common data placement strategy, as shown in Figure 1.

The hierarchical structure reduces the time spent on data query and location. As a result, a hierarchical data placement strategy is more suitable for data management in the cloud computing environment, as shown in Figure 2.

It is assumed in this paper that the large number of storage devices in the cloud storage system are heterogeneous, that is, the storage volumes of individual devices differ from one another. These storage devices are grouped into a relatively small number of device sets. When file data is stored, a device set is located first, and the file data is then stored inside that device set. This ensures the locality of file data within the device set and helps improve read and write speed.

When placing a file with several replicas, the different replicas of the same file should be placed onto different device sets as far as possible, so that when a storage device within one device set fails, the client can still obtain the target file data from other device sets as usual. This improves the availability and reliability of files.

In the hierarchical data placement strategy, a newly added storage device is allocated to one device set; when a storage device is deleted from a device set, the migrated data can be constrained to the other storage devices within that device set. This reduces the overhead of communicating with the large number of storage devices in other device sets, and less I/O bandwidth is occupied during data migration. When an aged storage device needs to be replaced with a new one, the data on the original device is first transferred to the new device. Since the replacement device typically outperforms the original one in both storage volume and read/write performance, the fairness of data storage among the devices within the set is disrupted. Therefore, data is migrated between the new storage device and the other devices within that set in order to restore fairness among them.

6. Algorithm Description

We group the large number of heterogeneous storage devices into a smaller number of device sets. The number of sets, once formed, is kept unchanged, and the total storage volumes of the different device sets should be kept the same. Files and their various numbers of duplicate copies are mapped to different device sets for storage using an algorithm based on semidefinite programming. Within a device set, files are sliced, and the data slices are then mapped to the devices of different volumes in the set using a dynamic interval mapping method.

6.1. The Semidefinite Programming Algorithm

The problem of placing data copies is converted into a semidefinite programming problem, so that different copies of a file are placed on different storage device sets. Meanwhile, according to the algorithm, a file is located on one device set and stored strip by strip on the various devices within that set; thus, file locality is ensured. The file data can be accessed immediately after a single location step, so the file access speed is improved.

The function $f(i, j) = 1$ only when $i$ and $j$ are two different copies of the same file, that is, when $j$ is a copy of $i$; otherwise $f(i, j) = 0$. In addition, $f(i, j) = 0$ when $i$ equals $j$. An associated matrix $F$ is constructed from $f$; $F$ represents the relationship among all files, that is, which copies belong to which file. Algorithm 1 converts the problem of data copy placement into the formal description of a semidefinite programming problem.

Formal description of the semidefinite programming problem
Solution:
  unit vectors $v_1, v_2, \ldots, v_m$, one per file copy, minimizing $\sum_{i<j} f(i,j)\,(v_i \cdot v_j)$;
Satisfying:
  $v_1, \ldots, v_m$ are unit vectors;
  the inner products $v_i \cdot v_j$ form the matrix $M = (v_i \cdot v_j)_{m \times m}$, and all its eigenvalues are greater than or equal to 0,
  that is, matrix $M$ is semidefinite.

Solving the semidefinite programming problem produces a semidefinite matrix $M$, and further processing of this matrix yields the device set on which each file copy is stored in the storage system.
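The authors solve the SDP with a Matlab toolbox; purely as an illustration, the following Python sketch sets up a comparable relaxation with the cvxpy library and rounds the result to k device sets using random directions. The conflict matrix F, the bound on the inner products, and the rounding scheme are assumptions made for this sketch, not details taken from the paper.

import cvxpy as cp
import numpy as np

def place_replicas(F: np.ndarray, k: int, seed: int = 0) -> np.ndarray:
    # F[i, j] = 1 if items i and j are different replicas of the same file.
    # Returns a device-set index in {0, ..., k-1} for every item.
    m = F.shape[0]

    # SDP relaxation: M[i, j] plays the role of v_i . v_j for unit vectors v_i.
    M = cp.Variable((m, m), PSD=True)
    constraints = [cp.diag(M) == 1, M >= -1.0 / (k - 1)]
    # Push replicas of the same file apart (small inner products).
    problem = cp.Problem(cp.Minimize(cp.sum(cp.multiply(F, M))), constraints)
    problem.solve()

    # Recover approximate unit vectors v_i as rows of V * sqrt(eigenvalues).
    w, V = np.linalg.eigh(M.value)
    vectors = V * np.sqrt(np.clip(w, 0.0, None))

    # Round: assign each item to the closest of k random directions.
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(k, m))
    return np.argmax(vectors @ directions.T, axis=1)

# Hypothetical usage: 3 files with 5 replicas each, placed into k = 10 device sets.
m, k = 15, 10
F = np.zeros((m, m))
for f in range(3):
    for i in range(5 * f, 5 * f + 5):
        for j in range(5 * f, 5 * f + 5):
            if i != j:
                F[i, j] = 1.0
print(place_replicas(F, k))

In practice one would also verify, file by file, that the rounded assignment separates the replicas, and re-round with a different seed if it does not.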

6.2. Dynamic Interval Mapping Algorithm

Suppose a device set contains $n$ devices, that is, $D = \{d_1, d_2, \ldots, d_n\}$. These $n$ devices have different volumes $c_1, c_2, \ldots, c_n$, so the weight of each device, namely the ratio of its volume to the total volume of the device set, is $w_i = c_i / \sum_{j=1}^{n} c_j$, where $i = 1, 2, \ldots, n$ and $0 < w_i < 1$. It is known that $\sum_{i=1}^{n} w_i = 1$. We then segment, within the interval $[0, 1)$, a subinterval of length $w_i$ for each device $d_i$. When a file is allocated to a device set, it is divided into data chunks of the same size, and the data chunks are then mapped to the devices with different weights in the set (Algorithm 2).

Pseudocode of the dynamic interval mapping algorithm
Initialization:
  device set $D = \{d_1, \ldots, d_n\}$, data chunk set $B = \{b_1, \ldots, b_m\}$, and subinterval set $I = \{I_1, \ldots, I_n\}$ with $|I_i| = w_i$;
Input: data chunk $b_j$
Program main:
  $x \leftarrow \mathrm{hash}(b_j)$;
  find the subinterval $I_i$ with $x \in I_i$;
  place the data chunk $b_j$ on the device $d_i$;
Output: data volume stored on the device

The hash function maps each data chunk $b_j$ to a value $x$ in the interval $[0, 1)$. If $x \in I_i$, then the data chunk $b_j$ is allocated to the device $d_i$ mapped by the subinterval $I_i$.
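A minimal Python sketch of this lookup (the MD5-based hash and the capacity values are illustrative assumptions, not the paper's implementation):

import hashlib

def chunk_hash(s: str) -> float:
    # Map a chunk identifier to a pseudo-random value in [0, 1).
    return (int(hashlib.md5(s.encode()).hexdigest(), 16) % 10**9) / 10**9

def build_intervals(capacities: dict) -> dict:
    # Give each device a subinterval of [0, 1) whose length is its weight
    # w_i = c_i / sum(c_j); a device may later own several pieces.
    total = sum(capacities.values())
    intervals, lo = {}, 0.0
    for dev, cap in capacities.items():
        hi = lo + cap / total
        intervals[dev] = [(lo, hi)]
        lo = hi
    return intervals

def lookup(intervals: dict, chunk_id: str) -> str:
    x = chunk_hash(chunk_id)
    for dev, pieces in intervals.items():
        if any(lo <= x < hi for lo, hi in pieces):
            return dev
    return next(reversed(intervals))    # guard against floating-point drift near 1.0

capacities = {f"d{i}": c for i, c in enumerate((50, 80, 100, 120, 60, 90, 110, 70, 150, 170), start=1)}
intervals = build_intervals(capacities)
print(lookup(intervals, "file-12:strip-3"))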

7. Experiment and Analysis

In this paper, the two key algorithms of hierarchical data placement, namely the semidefinite programming (SDP) algorithm and the dynamic interval mapping algorithm, are implemented on the Matlab platform. The matrix is the basic data unit of the Matlab language and can be used directly in matrix calculations, so Matlab can be applied directly to complicated problems such as optimization and linear programming. The semidefinite programming problem to be solved in this paper is described as a formalized mathematical matrix, and the dynamic interval mapping problem is also easily formalized as a matrix, which makes both suitable for implementation in the Matlab environment. Meanwhile, Matlab features an abundant set of toolboxes and modules. To solve the semidefinite programming problem, a toolbox that adds SDP support to Matlab must be installed.

7.1. Fairness Analysis on Semidefinite Programming Algorithm

Suppose that each file has 5 copies. We distribute 100, 200, 300, and 400 files into 10 device sets and into 20 device sets, respectively, using the semidefinite programming method; the resulting deployments are shown in Figures 3 and 4. The experiment shows that files can be distributed fairly evenly across multiple device sets using semidefinite programming, illustrating that this approach ensures the fairness of the file data layout.

7.2. Reliability Analysis on Semidefinite Programming Algorithm

Now let us further discuss the placement of the 5 copies of the same file, that is, the question of whether all 5 copies of a file are placed on different device sets. As shown in Table 1, when 400 files (2,000 copies) are distributed to 10 and to 20 device sets, all 5 copies are placed on 5 different device sets for 299 and 372 files, respectively. The remaining files do not achieve this: for each of them, 2 of the 5 copies are allocated to the same device set. Overall, the semidefinite programming algorithm performs well in allocating different copies of a file to different storage device sets. Thus, the probability of data loss due to device failure is reduced, and data reliability is improved.

Based on the properties of the random hash function, it can be inferred that the probability of data being allocated to each subinterval by the dynamic interval mapping algorithm is proportional to the length of that subinterval; correspondingly, the data volume on each device inside the device set is proportional to its weight. It has been shown that when the storage nodes inside a device set change, the dynamic interval mapping method minimizes the data migration overhead, provided the number of storage nodes is not extremely large. This avoids the communication and data migration overhead, caused by changes in the number of storage nodes, that would arise when directly managing a very large number of storage devices. When a new device is added, the subinterval occupied by each device within the device set changes correspondingly: part of the intervals occupied by the existing devices, together with the corresponding data chunks, is reallocated to the new device so that fairness is restored. The communication and data transfer overheads are thus constrained to the few devices inside that device set.

7.3. Fairness Analysis on Dynamic Interval Mapping Algorithm

Firstly, the fairness of the dynamic interval mapping algorithm is tested by examining the file data volume stored on each storage device within a device set. When 1,000 files are stored in 10 device sets, 100 files are stored in device set no. 5, as indicated in Figure 3. We then assume that there are 10 storage devices inside device set no. 5, and the 100 files are striped into 1,500 data strips, which are stored on the 10 storage devices by means of the dynamic interval mapping algorithm. For these 10 storage devices, the percentage of each device's storage volume in the total storage volume, as well as the interval length $w_i$ corresponding to that percentage, are shown in Table 2.

Based on the dynamic interval mapping algorithm and the above-mentioned volume of each storage device, the 1,500 data strips should be stored fairly across these 10 storage devices. The theoretical allocation is shown in Table 3.

When implementing the dynamic interval mapping algorithm, the hash function maps each data chunk to a random number $x$ between 0 and 1. If $x \in I_i$, the data chunk is placed on storage device $d_i$. In this way, all 1,500 data strips are stored on the 10 storage devices. The comparison between the actual allocation and the theoretical allocation is shown in Figure 5.

7.4. Self-Adaptability Analysis on Dynamic Interval Mapping Algorithm

Let us test the self-adaptability of the dynamic interval mapping algorithm. The cases of removing a storage device and adding a new storage device are considered in turn.

7.4.1. Removing a Storage Device

Let us examine the file data volume migrated between the other storage devices when a storage device is deleted from a device set. For example, Table 4 shows the situation when device no. 7 is deleted from the device set: the percentage of each remaining device's storage volume and the interval length corresponding to that percentage are listed there.

When device no. 7 is removed, the data on that device is migrated to the remaining 9 storage devices. The actual data migration is shown in Figure 6.

From Figure 7, we can see that after the deletion of device no. 7, the remaining storage devices in the device set still store data in proportion to each device's share of the total remaining storage volume.
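As a sketch of the deletion case, the following Python function (reusing the interval representation from the sketch in Section 6.2; the bookkeeping is an illustration, not the authors' exact implementation) hands the removed device's subintervals over to the surviving devices in proportion to their volumes, so that only the data that resided on the removed device migrates:

def remove_device(intervals: dict, capacities: dict, gone: str) -> dict:
    # Pieces of [0, 1) that must find new owners, and the volumes involved.
    freed = intervals.pop(gone)
    gone_cap = capacities.pop(gone)
    new_total = sum(capacities.values())
    old_total = new_total + gone_cap

    pieces = list(freed)
    for dev, cap in capacities.items():
        # Each survivor's share grows from cap/old_total to cap/new_total.
        need = cap / new_total - cap / old_total
        while need > 1e-12 and pieces:
            lo, hi = pieces.pop()
            if hi - lo <= need:
                intervals[dev].append((lo, hi))        # take the whole piece
                need -= hi - lo
            else:
                intervals[dev].append((lo, lo + need)) # split the piece
                pieces.append((lo + need, hi))
                need = 0.0
    return intervals

Only chunks whose hashes fall in the reassigned pieces move, and these are exactly the chunks that were stored on the removed device, matching the migration behaviour described above.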

7.4.2. Adding a New Storage Device

Now let us examine the situation when a storage device is added to the device set. The case is similar to the removal case described above, and we follow the steps below (see the sketch after the list).
(1) First, when a new storage device is added to the device set, the percentage of each device's volume relative to the new total storage volume is recalculated, and the corresponding interval length $w_i$ is redefined accordingly.
(2) Then the difference between each device's original interval length and its revised length after the addition is calculated; the data corresponding to this length difference is what must be migrated to the new storage device.
(3) The dynamic interval mapping algorithm is used to place the migrated data on the newly added storage device: the hash function maps each data strip to a random number in $(0, 1)$, and if this number falls into the subinterval assigned to the newly added device, the data chunk is placed on that device.
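A companion sketch for the addition case, following steps (1)-(3) above with the same interval representation as the earlier sketches (again illustrative only):

def add_device(intervals: dict, capacities: dict, new_dev: str, new_cap: float) -> dict:
    old_total = sum(capacities.values())
    new_total = old_total + new_cap
    capacities[new_dev] = new_cap

    ceded = []
    for dev in list(intervals):
        # Steps (1)-(2): each existing device's share shrinks from
        # cap/old_total to cap/new_total; the difference is handed over.
        give = capacities[dev] / old_total - capacities[dev] / new_total
        kept = []
        for lo, hi in intervals[dev]:
            if give <= 1e-12:
                kept.append((lo, hi))
            elif hi - lo <= give:
                ceded.append((lo, hi))            # whole piece migrates
                give -= hi - lo
            else:
                kept.append((lo, hi - give))      # split the piece
                ceded.append((hi - give, hi))
                give = 0.0
        intervals[dev] = kept

    # Step (3): the new device owns exactly the ceded subintervals; any chunk
    # whose hash falls inside them is migrated to the new device.
    intervals[new_dev] = ceded
    return intervals

The total length ceded equals the new device's weight, so the migrated data volume matches the optimal migration volume and the self-adaptability ratio stays close to 1.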

8. Conclusion

A hierarchical data placement algorithm for the cloud computing environment is proposed in this paper. The proposed algorithm combines the semidefinite programming algorithm with the dynamic interval mapping method. The semidefinite programming method distributes the replicas of a file across the grouped device sets; experiments demonstrate that this method guarantees data reliability and high-speed file access. The dynamic interval mapping method distributes data fairly across the devices of different volumes inside a device set, and its self-adaptability is proved theoretically.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (General Projects) (Grant no. 61272029), the National Key Technology R&D Program (Grant no. 2009BAG12A10), the China Railway Ministry Major Program (2008G017-A), and the State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, China (Contract no. RCS2009ZT007).