Abstract

With the development of mobile technology, mobile virtual worlds have attracted massive numbers of users. To improve scalability, peer-to-peer (P2P) architectures allow a virtual world to accommodate more users without increasing hardware investment. In mobile settings, however, existing P2P solutions are not applicable due to the unreliability of mobile devices and the instability of mobile networks. To address this issue, a novel infrastructure model, called Virtual Net, is proposed to provide fault-tolerance in managing user content and object state. In this paper, the key problem, namely, object state update, is resolved to maintain state consistency and high interaction responsiveness. This work is an important step toward implementing a scalable mobile virtual world.

1. Introduction

Virtual worlds, including multiplayer online games and virtual social worlds, allow users to inhabit virtual environments, create their own content, and interact with each other. Mobile virtual worlds allow users to access these simulated environments through mobile devices, making it possible to play anywhere. Driven by the development of mobile devices, mobile virtual worlds have attracted a large number of users and become an important market and revenue source for the game industry. For example, in 2018, Fortnite earned $1,996,917 in gross daily revenue [1] and reported 3.4 million concurrent players [2]. The success and expansion of mobile virtual worlds raise new challenges in infrastructure development; one of them is the scalability problem. In virtual worlds, interaction is implemented by sending events to servers for processing and receiving updates from the servers for rendering and state synchronization. As the number of concurrent online users increases, more computing load is imposed on game infrastructures. Servers have to process and respond to more client requests within a short period to maintain high responsiveness. Network bandwidth consumption also increases, since multiple game states must be packed into each update. Scaling therefore requires investing more computing resources; otherwise, user experience will suffer.

Peer-to-peer (P2P) virtual worlds, first introduced in [3], explore the possibility of running a virtual world without a central server. In P2P virtual worlds, user devices run both the client program and the server program for event handling and state update. Thus, computing resources naturally scale with the user population. Mobile applications, however, differ from their desktop counterparts in important ways. One outstanding issue is client failure. Compared to desktop PCs, mobile devices are more prone to failure, due to, for example, battery depletion or application conflicts. Moreover, access to mobile networks, such as MANETs and VANETs, is also unstable. Client unreliability may cause content loss or state inconsistency if user content and object state are not properly saved or backed up before a failure. Yet, existing P2P virtual worlds do not address the peer device unreliability problem [4]. Thus, they cannot be directly applied in mobile settings.

In this paper, a Virtual Net model is proposed to address the client unreliability problem for mobile P2P virtual worlds. The model adopts the cloud-fog structure but is totally decentralized. To avoid content loss, the cloud layer stores user content for content persistency. The fog layer caches object states for client recovery and maintains state consistency. The separation of content storage and state caching improves responsiveness, since operations performed directly on P2P storage incur more communication overhead [5]. On top of the P2P content storage, a content addressing scheme is devised to facilitate content integrity checks.

To avoid reinventing the wheel, this paper mainly focuses on the state update problem to maintain object state consistency. At the fog layer, object states are replicated on several nodes for fault-tolerance. Thus, all replicas must maintain the same state in event handling so that interaction can be performed within a consistent shared environment. Yet, the requirement of high responsiveness in virtual world interaction makes the problem difficult. To attack this difficulty, an opportunistic approach, called fast event delivery, is proposed. Based on this approach, a virtual world interaction model is then designed. In short, the main contributions of the paper are as follows.
(1) A new P2P cloud-fog structure, called the Virtual Net model, is proposed to resolve the client unreliability problem, providing fault-tolerance in playing a mobile virtual world.
(2) A fast event delivery approach is proposed to maintain both replica state consistency and high responsiveness in the process of handling user events.
(3) A new virtual world interaction model is designed to achieve game state consistency and high responsiveness when interacting with different neighbors.

The remainder of the paper is organized as follows. The related works are introduced in Section 2. The overall Virtual Net model is described in Section 3. Section 4 studies the state update problem in detail. Based on the solution of the problem, the virtual world interaction model is provided in Section 5 with neighbor change management. The correctness of the solution is proved in Section 6. Sections 7 and 8 evaluate the performance through theoretical analysis and experiments. Section 9 concludes the paper.

2. Related Work

Mobile P2P virtual worlds combine the characteristics of the mobile virtual world and P2P virtual world problems. Due to the lack of study in this combined field, the related work in P2P virtual worlds and cloud-fog mobile applications is surveyed to delineate the distinct characteristics of the combined problem.

2.1. P2P Virtual Worlds

P2P MMORPGs and P2P virtual environments have been amply surveyed in [4, 6]. Previous works mainly focus on inter-player consistency management, including peer connectivity, interest management, event dissemination, and cheat prevention. Peer connectivity [7] studies the connection of all user devices within an overlay network such that any peer can be reached from another peer. Interest management [8] restricts the range of message receipt to reduce communication overhead in state update. Event dissemination [9] reduces the number of communication channels on event senders to avoid overwhelming them in hotspot areas. Cheat prevention [10] is needed to achieve fairness without arbitration from a central server. All these works assume a desktop environment in which a client is always reliable in storage and connection. In contrast, Virtual Net targets mobile environments in which both devices and connections are unreliable, a new problem orthogonal to the above studies. Thus, a complete implementation of Virtual Net can employ existing P2P solutions for inter-player consistency management, such as peer connectivity and interest management, to avoid reinventing the wheel.

Early work on P2P state persistency is related to the content storage in this work. State persistency studies the reliable storage and efficient retrieval of user state [11]. Each time a state is updated, it has to be persisted in the overlay network, and the state has to be queried from the overlay network when the client recovers from a failure. As argued above, the work in [11] assumes a reliable client, which is not applicable in a mobile setting. The Virtual Net model not only solves the unreliable client problem but also reduces storage and retrieval overhead through content caching. Moreover, Virtual Net includes a content integrity check, which is not considered in previous works.

2.2. Fog Computing

First introduced in [12], cloud gaming moves game engine functions to the cloud to simplify development, distribution, access, and update [13]. However, the measurement study in [14] shows that the current cloud gaming infrastructure is unable to meet the latency requirements of end-users distant from data centers. To improve latency, fog computing [15] has been introduced to move time-critical functions to locations near clients. Fog computing has been widely discussed in both the Internet of Things (IoT) [16] and mobile computing [17] to offload server burden [18], enable location awareness, and provide real-time interaction. Among its many applications, mobile gaming [19] and mobile reality [20] are two important examples. Similar to the cloud-fog structure, the Virtual Net solution also employs the cloud layer for content storage and the fog layer for latency improvement. Virtual Net, however, explores a totally decentralized solution, with no central control at the cloud layer.

3. Virtual Net Model

The proposed Virtual Net structure is based on the commonly used three-layer structure shown in Figure 1. Similar to some existing cloud-fog structures, it is divided into three layers: the cloud layer (L1), the fog layer (L2), and the client layer (L3). The cloud layer provides the persistency service, which stores the files of user content and the states of virtual objects (avatars, accessories, achievements, etc.). The fog layer caches the object states in play and provides state recovery for clients in case of short-term failure. It also periodically checks object states and saves them to the cloud layer for state persistency, asynchronously to event handling. When a user leaves a game, the cached state of the user's objects is saved to the cloud layer. The client layer provides user interfaces for receiving user operations and displaying updated states for user interaction. Virtual worlds are latency-sensitive applications: on one hand, clients require fast state updates in user interaction [21]; on the other hand, the complexity of peer-to-peer routing slows down content storage and retrieval [22]. Thus, the fog layer is placed between L1 and L3 to improve responsiveness while providing fault-tolerance.

The three-layer architecture is resilient. First, L1 and L2 can be scaled individually without affecting each other, since they are built for different purposes. The cloud layer focuses on the long-term storage of user content, which is only accessed at user login, user logout, and the periodic state checkpoints performed by the fog nodes. The fog layer, on the other hand, maintains the latest state of user content and provides state recovery from intermittent client failures. Except for state initialization and checkpointing, L1 and L2 do not need to interact with each other. Besides, the model provides a degree of failure isolation: the failure of one layer can be recovered by another layer, since each layer has a separate copy of the content.

Different from the existing cloud-fog computing paradigm, the computing resources in the cloud and fog layers are P2P nodes, as in BitTorrent or eDonkey. Specifically, users contribute part of the computing resources of their devices, which can be smartphones, laptops, desktop PCs, or even servers. A device is divided into one or several virtual nodes [23] for fine-grained load balancing. All virtual nodes are managed in a node pool. For different computing purposes, there are two types of virtual nodes: storage nodes and cache nodes. The storage nodes construct the cloud layer, and the cache nodes construct the fog layer. Thus, Virtual Net is a decentralized computing paradigm. A client can reside on the same device as a virtual node, as in BitTorrent, or on a separate lightweight device.

3.1. P2P Cloud Layer

Object files are stored on the cloud layer through P2P file storage. Based on the file storage system, a content addressing scheme is devised, which not only provides flexibility in content identification and addressing but also supports integrity checking in object management.

3.1.1. File Storage

The TotalRecall [5] storage architecture is applied to manage the storage nodes for file storage. The details of the design and performance can be found in [5]. Here, only the overall mechanism is introduced. In TotalRecall, each node is assigned a unique hash code as the node ID. Also, each file has a file ID which is the hash checksum of the file. When a new file is created, the file is associated with a storage node, called the master node whose ID is closest to the file ID. Other nodes hosting the data of the file are called host nodes. Master nodes manage the location of host nodes and the version control for the associated files. Each storage node can be the master node for some files and the host node for other files. Thus, the entire storage node network forms a distributed hash table (DHT) for file lookup. To request a file, its master node is found first with the file ID. Then, based on the reply from the master node, the host nodes are located and the file can be retrieved (or reconstructed).
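As a rough illustration of this lookup scheme, the following Python sketch locates a master node by ID closeness. The hash function, the flat node list, and the absolute-distance metric are simplifying assumptions for illustration; the real system routes over a DHT overlay.

import hashlib

def hash_id(data: bytes) -> int:
    # A 160-bit identifier, used for both node IDs and file IDs.
    return int(hashlib.sha1(data).hexdigest(), 16)

def master_node(file_id: int, node_ids: list) -> int:
    # The master node is the node whose ID is closest to the file ID.
    # (A real DHT measures distance on the overlay's ID ring; absolute
    # distance over a flat node list is a simplification.)
    return min(node_ids, key=lambda nid: abs(nid - file_id))

nodes = [hash_id(("node-%d" % i).encode()) for i in range(8)]
file_id = hash_id(b"avatar-texture-v1")
print(hex(master_node(file_id, nodes)))  # the master then points to the host nodes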

3.1.2. Content Addressing

To retrieve the objects from the cloud layer, object content needs to be identified and addressed. A hierarchical content addressing scheme is devised, which can facilitate content integrity check. The devised content addressing scheme has four hierarchies: inventory, objects, components, and files, as illustrated in Figure 2.

Inventory-level: each user has an inventory file, identified by the inventory ID, which is the hash code of the user ID. An inventory contains all the object descriptions, consistently managing content identification and modification. Thus, to retrieve object contents, the inventory file needs to be retrieved first.
Object-level: an object is identified by the object hash code and composed of one or multiple components.
Component-level: each component is identified by the component hash code. Object components are the categories of object resource files, which are classified into animation, sound, texture, script, etc.
File-level: the actual files of objects are addressed by file IDs in object descriptions. Through file IDs, the actual file can be retrieved either from the local cache or from the DHT of the cloud storage.

Based on the structure of the content addressing scheme, a Merkle tree [24] (Figure 2) can be hierarchically constructed from the file hash codes, component hash codes, object hash codes, and inventory hash code. With the Merkle tree, the integrity of user content can be checked recursively, and the number of hash comparisons can be largely reduced [25]. Typically, a client caches more than 500,000 files of user-created content [26]. Thus, an exhaustive search for updated files would be inefficient. We conducted an experiment with 200 objects and more than 5,000 files. Compared with the file-level and object-level content integrity checks [26], Figure 3 shows that the proposed four-level content integrity verification requires fewer hash comparisons, especially when only a small number of files change.
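The following sketch illustrates the top-down pruning idea behind the hierarchical check. The dict-based tree layout and hash choices are illustrative assumptions, not the paper's data structures, and both trees are assumed to have identical key structure.

import hashlib

def node_hash(node) -> str:
    # Leaves are file hash strings; inner nodes hash their children's hashes,
    # mirroring the file/component/object/inventory hierarchy.
    if isinstance(node, str):
        return node
    children = "".join(node_hash(v) for _, v in sorted(node.items()))
    return hashlib.sha256(children.encode()).hexdigest()

def changed_files(local, remote, path=""):
    # Compare top-down and prune: an unchanged subtree costs one comparison,
    # so only the branches that actually changed are descended into.
    if node_hash(local) == node_hash(remote):
        return []
    if isinstance(local, str):
        return [path]
    return [p for k in local
              for p in changed_files(local[k], remote[k], path + "/" + k)]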

3.2. P2P Fog Layer

The fog layer is added between the cloud layer and the client layer to mask the latency of content storage and meanwhile provide fault-tolerance. From the user perspective, each user is allocated some cache nodes while he/she is playing in a virtual world. These cache nodes provide the user with computing resources, forming a logical computing unit. We call it a mesh computer, as illustrated in Figure 4. When a user logs into the system, his/her client first initializes the mesh computer by requesting cache nodes from the node pool. The cache nodes then retrieve the content from the cloud layer. The client also retrieves the saved content from the cloud layer for content rendering and state synchronization. When the mesh computer receives a quit instruction from the client or the client experiences a long-term failure, the mesh computer releases its cache nodes back to the node pool. Optimal resource allocation and cost minimization have been studied in [23] and are out of the scope of this paper.

Due to the unreliability of P2P nodes, they are subject to either temporary or permanent failure. Thus, for reliability purposes, a mesh computer maintains multiple cache nodes holding replicas of the same user content, called a replica group. Content is transferred from failed nodes to live nodes. Replica group management has been studied in our previous work [27]. This paper focuses on replica state management in the following sections.

4. Object State Update

At the fog layer, it is important that all replicas of the same group maintain the same state of user objects so that the failure of any replica will not invalidate a user's current state. The problem is challenging, since replicas may receive different sets of concurrent events from different senders, and events may be received in different orders. The state machine replication (SMR) [28] approach is adopted to manage object state. SMR is a fault-tolerance model that replicates a deterministic finite state machine on a set of distributed nodes, each of which has the same input, output, and state transitions. In an asynchronous cycle, clients first send requests to all the nodes. On receiving the requests, a consensus protocol is triggered to determine the sequence of requests. Then, all nodes process the requests in the decided sequence so that they reach the same new state. To reduce communication overhead, one replica is elected as the leader, coordinating membership reconfiguration and request ordering.
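A minimal illustration of the SMR principle is sketched below; the position-update state machine is an invented example, not the paper's event format. It shows that replicas applying the same totally ordered event sequence with a deterministic handler reach the same state.

class Replica:
    def __init__(self):
        self.state = {"x": 0, "y": 0}  # deterministic object state

    def apply(self, event):
        dx, dy = event                 # same input, same state transition
        self.state["x"] += dx
        self.state["y"] += dy

ordered_events = [(1, 0), (0, 2), (-1, 1)]  # order agreed on via consensus
replicas = [Replica() for _ in range(3)]
for r in replicas:
    for e in ordered_events:
        r.apply(e)
assert all(r.state == replicas[0].state for r in replicas)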

In virtual worlds, however, the consensus process adds a large delay to user interaction, because a requested event must be agreed on by all replicas, requiring at least two communication rounds (i.e., four communication steps) [29], before it can be handled and replied to clients. The interaction delay issue in event handling is addressed based on the following observation: due to users' limited perception range and motion speed, the set of event senders within a small period is fixed. Thus, the number of concurrent event senders within the period can be known a priori. Based on this observation, we propose a fast event delivery approach.

4.1. Fast Event Delivery

Fast event delivery allows a replica to directly deliver a received event through a cycle-event mapping if it can ensure that the same event will eventually be delivered by all replicas. Specifically, the timeline is divided into an infinite sequence of cycles of length ∆t. From cycle c0, an event sender s periodically broadcasts an event to the replicas in each cycle. Each event is identified by the sender ID and a sequence number. The sequence number of the first event, sent at cycle c0, is 0. If there is no operation, s just broadcasts a no-op event. At the receiving end, c0 is also known to all replicas. For cycle c, each replica delivers the event with sequence number c − c0 from s, which is called the event of cycle c. Events for cycle c from different senders are ordered by sender ID. If a replica does not receive the event for cycle c from s, it starts an instance of consensus for the cycle. In the consensus, if a replica has received the event for cycle c, that event will be decided by the leader and delivered by all replicas. Otherwise, they will decide and deliver an empty event for cycle c. Events are first delivered to a queue Qd and then sent from the queue to the application for handling in sequence.
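The receiving-side mapping can be sketched as follows. This is a minimal sketch; the data layout and function names are illustrative assumptions, and the consensus fallback is only marked, not implemented.

def expected_seq(c: int, c0: int) -> int:
    # The event of cycle c from a sender that started at cycle c0
    # carries sequence number c - c0.
    return c - c0

def deliver_cycle(c, c0, received, q_d):
    # received: {sender_id: {seq: event}}; same-cycle events ordered by sender ID.
    for sender in sorted(received):
        seq = expected_seq(c, c0[sender])
        if seq in received[sender]:
            q_d.append((c, sender, received[sender][seq]))
        else:
            return False  # missing cycle event: fall back to consensus for c
    return True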

The relations among cycle (ci), sender ID, event sequence number, delivery queue (Qd), and delivery sequence (λ) are illustrated in Figure 5. Specifically, in Qd, the subscript of an event e denotes the event sequence number, which is equal to the event sending cycle. For the same cycle, the events in Qd are sorted by sender ID and mapped to local index numbers in Qd (i.e., the second member in the tuples of Qd). λ represents the global index of events delivered to the application, which will be introduced in Section 4.1.4.

Figure 6 illustrates the fast event delivery process with one sender s and three replicas r1, r2, and r3 in the replica group g. s broadcast four events, e1, e2, e3, and e4, at cycles c1, c2, c3, and c4 to g. All replicas received e1 at cycle c2 and e4 at c5. Only replica r1 received e2 at c3. No replica received e3 at c4, but r1 and r2 received e3 at c5. r1, r2, and r3 deliver e1 for c1. They then collectively decide e2 for c3 and an empty event for c4 through consensus. The problem is then how to handle e3 and e4 at c5. According to the cycle-event mapping principle (i.e., one cycle, one event), the replicas should only deliver e4 for c5 and discard e3, leading to event loss. Before discussing the late event handling problem in detail, the settings and assumptions of the system are introduced first.

4.1.1. Settings and Assumptions

For fault-tolerance, a replica group contains at least n replicas. The minimal group size n is determined by the content availability requirement [27] and the replica failure rate. To reduce replication overhead, each group also has an extra number e of nodes for lazy repair [5]. Once e + 1 replicas fail, new replicas will be added to recover the group size to n + e.

In each replica group, there is one non-replica node monitoring the state of all replicas, called the Rendezvous [30]. A Rendezvous uses timeouts to determine the state of replicas and then broadcasts their states to all replicas. Replica state monitoring is implemented by exchanging heartbeat messages between a replica and the Rendezvous. If the Rendezvous does not receive a heartbeat message within a cycle, the replica is treated as failed and removed from the group. New replicas are also added by the Rendezvous once the group size is smaller than n. Rendezvouses are reliable nodes, also called super-peers [19], since the existence of a group is determined by its Rendezvous. Once a Rendezvous fails, a new Rendezvous must be assigned to the replica group, which then rebuilds the replica group and recovers the object states from the cloud storage. By exchanging heartbeat messages, each replica learns the current membership of its group g, denoted by G, which contains all live replicas of group g. When the Rendezvous tells a replica that a member has failed or a new member has been added, the replica removes the member from G or adds the member into G.

The system is assumed to be live. The SMR model contains three types of group-wide activities: leader election, group reconfiguration, and consensus. The liveness assumption ensures that when an activity is needed, it will eventually succeed after a finite number of failures.

Each replica group maintains a set of event senders. It is assumed that each event sender has a globally unique ID and that sender IDs are comparable. Moreover, a replica group appends the join timestamp to sender IDs to distinguish two different joins of the same sender in the sender set.

Let s be the ID of an event sender, c be a cycle number of a replica group g, and ri be the ith replica in group g. The important relations among events, event senders, event sequence numbers, and cycle numbers, together with the other notations used throughout the subsequent sections, are listed in Table 1. Besides, event names are capitalized.

4.1.2. Late Event Handling

An event is late if the event of cycle c is received after c on all replicas. Formally, a late event e satisfies Seq(e) = Seq(s, c) ∧ ReceiveCycle(e) > c on some replica ri ∈ G, and Seq(e) = Seq(s, c) ∧ (ReceiveCycle(e) > c ∨ ReceiveCycle(e) = ⊥) on every other replica rj ∈ G. For example, e3 in Figure 6 is a late event.

To ensure agreement on cycle event delivery among all replicas, a late event can simply be discarded, since any event can be re-sent by a client with a new sequence number if the client does not receive the reply for the event within a period. However, if a sender's clock is temporarily out-of-sync with the replicas' clocks or a large sending delay is experienced, a large number of events could be discarded and need to be re-sent, as shown in Figure 7.

To address the late event handling problem, a dynamic cycle event delivery approach is proposed, which includes two conditions for late event delivery. The purpose of the approach is to minimize the number of event discards while allowing each replica to decide the delivery of late events with only local information. Below are the conditions of late event delivery; in short, only late and out-of-order events are discarded.
(1) At cycle c, all undelivered events from sender s with sequence number no greater than Seq(s, c) are deliverable. Formally, ∀e(s, j), MinSeq(s, c) ≤ j ≤ Seq(s, c) ⇒ e(s, j) is deliverable at c.
(2) At cycle c, an event is nondeliverable if one of its subsequent events has been delivered before c. Such an event is a late and out-of-order event. Formally, ∀e(s, j), j < MinSeq(s, c) ⇒ e(s, j) is nondeliverable at c.

To implement the dynamic cycle event delivery approach, the lowest deliverable sequence number from any sender needs to be determined for each cycle. Specifically, at cycle c, let MinSeq(s, c) be the lowest sequence number of all undelivered events from sender s, with MinSeq(s, c) ≤ Seq(s, c). Also, let MaxSeq(s, c) be the sequence number of the last delivered nonempty event from s in cycle c. Then, MinSeq(s, c) = MaxSeq(s, c − 1) + 1, where MaxSeq(s, c − 1) is determined by the event delivery for cycle c − 1. Define the set of expected deliverable events from s at cycle c by Ω(s, c) = [MinSeq(s, c), Seq(s, c)] and the set of actually received events by Π(s, c). The actually deliverable events from s at cycle c can then be filtered by Ω(s, c) ∩ Π(s, c), which excludes the late and out-of-order events.
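The window-based filtering can be sketched as follows; the names mirror the notation above, and the dict layout is an assumption.

def deliverable(max_seq_prev: int, seq_c: int, received: dict) -> dict:
    # received maps sequence number -> event from sender s, i.e., Π(s, c).
    min_seq = max_seq_prev + 1           # MinSeq(s, c) = MaxSeq(s, c - 1) + 1
    window = range(min_seq, seq_c + 1)   # Ω(s, c) = [MinSeq(s, c), Seq(s, c)]
    # Keep Π ∩ Ω: late, out-of-order events (j < MinSeq) are dropped.
    return {j: e for j, e in received.items() if j in window}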

4.1.3. Total-Order Event Delivery

Total-order event delivery is the key mechanism in object state update: it ensures that all replicas in the same group reach the same state along the same path of state transitions, provided no more events are received. Applying the dynamic cycle event delivery approach, the event delivery for one cycle is described in Algorithm 1, where E(c) stores the events from the consensus for cycle c and γ is the event delivery index within cycle c. γ is incremented for each event delivered in cycle c, with events ordered first by sender index and then by event sequence number. The pair (c, γ) and the calculation of γ ensure that the events in Qd are sorted first by cycle number, then by sender index, and lastly by event sequence number, which sorts all events in the same order on all replicas.

1. For cycle c,
2. If E(c) ≠ ∅, then
3.  Qd ← Qd ∪ E(c)
4.  c ← c + 1
5. Else,
6.  For ∀s ∈ S,
7.   If Π(s, c) = Ω(s, c), then
8.    Qt ← Qt ∪ Π(s, c)
9.   Else,
10.    Qt ← ∅
11.    Consensus for (c, {Ω(s, c) | s ∈ S})
12.    End the loop
13.  If Qt ≠ ∅, then
14.   Qd ← Qd ∪ Qt
15.   c ← c + 1

In a run for cycle c, each replica first checks whether any events have been decided for the cycle by a consensus instance. If so, these events are directly moved to Qd for event handling by the application. Otherwise, Algorithm 1 checks the condition Π(s, c) = Ω(s, c) for each sender s to determine whether all expected deliverable events have been received. If an expected deliverable event has not been received in this cycle, the replica triggers a consensus instance to determine the event delivery for cycle c. Note that the cycle number c is only increased after the events of the cycle have been delivered.

The proposed consensus algorithm is described in Algorithm 2 and illustrated in Figure 8. The consensus request is composed of the cycle number c and the sequence numbers of all the expected deliverable events in c from all senders. On receiving the consensus request, each replica proposes its actually deliverable events to the leader. If a replica has not received an expected event, it proposes ⊥ for the event. On receiving all proposals, the leader decides the event for each sender and each expected sequence number. If at least one replica proposes a non-⊥ and nonempty event e(s, j) for j ∈ Ω(s, c), then e(s, j) is decided for sequence number j from s. Otherwise, an empty event is decided for the slot. After the events of all slots have been decided, the leader broadcasts the decision to all replicas, and they move the decided events to E(c) for event delivery after receiving the decision. It is assumed that reliable point-to-point and multicast communication channels [29] are used in the consensus protocol.

1. Given (c, {Ω(s, c) | s ∈ S}),
2. ri proposes:
3.   {e(s, j) if e(s, j) ∈ Π(s, c), else ⊥ | s ∈ S, j ∈ Ω(s, c)}
4. rL decides:
5.   For each s ∈ S and j ∈ Ω(s, c),
6.   // Proposals: the set of proposed events for c from all replicas;
7.     If ∄e(s, j) ≠ ⊥ ∧ e(s, j) ∈ Proposals, then
8.      e(s, j) ← Empty
9.     D(c) ← D(c) ∪ {e(s, j)}
10.  Broadcast D(c) to all replicas
11. rn applies the decision: E(c) ← D(c)

The complete algorithms of total-order event delivery, including event collection, event delivery, and consensus, can be found in Appendix A.

4.1.4. Garbage Collection

Events that have been handled by the application need to be removed from Qd to prevent its unlimited growth or even overflow. Due to asynchronous event handling, however, a replica ri cannot safely remove a delivered event using only local information, because another replica rj may later request the removed event for a given cycle if rj does not receive it. Thus, determining the removable events is a challenging problem.

As seen in Algorithm 1, since all events in Qd can be uniquely identified by (c, γ), there exists a relation that maps each (c, γ) to a unique integer number λ, called the delivery sequence. Thus, events in Qd can also be identified by (λ, e), and each (c, γ, e) and (λ, e) have a one-to-one mapping for the same e. Let Qd be a sequence ei ei+1 ... ej ... ek mapped to λi λi+1 ... λj ... λk. It can be observed that the events which can be safely pruned satisfy the following characteristics.
(1) If ej can be pruned from Qd, then all events before ej can also be pruned from Qd.
(2) Let L = ei ei+1 ... ej be a prefix subsequence of Qd. If L can be pruned from Qd on one replica, then L can be pruned from Qd on all the replicas in group g.
(3) If ej can be pruned from Qd, then λj ≤ λc, where λc is the sequence of the last event handled by the application, returned by the function LastApplied(Qd).
(4) Trailing rounds with empty events cannot be removed from Qd, since the events of these rounds may be used in consensus for deciding the value of late events.

Characteristics (1) and (2) imply that there is a common latest applied event ecle such that all the events delivered before ecle (including ecle) have been applied on all the replicas, whereas the events after ecle are undecidable. The delivery sequence of ecle is denoted by λcle. Characteristic (3) implies that λcle cannot exceed λc, and characteristic (4) restricts the range of garbage collection. Thus, all the events before λcle, except trailing empty events, can be safely removed from Qd.

Based on the above observations, a gossip protocol is devised to learn λcle by exchanging the λc of all replicas, safely estimating the earliest removable event; it is described in Algorithm 3. In the protocol, each replica periodically sends its λc to the other replicas. A replica also caches the received λc values. Based on the latest λc received from all replicas, λcle is determined as the minimal λc. Then, λcle is adjusted to exclude the trailing empty events (Lines 12-16). Lastly, all the events before λcle are removed from Qd. In Algorithm 3, Λ caches the received λc's from all replicas in G. For a λc from ri, if λc is greater than the cached value for ri in Λ, the new λc can safely replace the existing one, since the λc values from the same replica are monotonically nondecreasing.

1. On replica ri:
2. Upon Timer TIMEOUT
3.  λc ← LastApplied(Qd)
4.  Broadcast λc to all r ∈ G
5.  Reset Timer
6.
7. On replica rj:
8. Upon λc from ri ∈ G
9.  If λc > Λ[ri], then
10.   Λ[ri] ← λc
11.  λcle ← min(Λ)
12.  If e(λcle) = Qd.last // last event in Qd
13.   Map λcle to (c, γ)
14.   While all events delivered for cycle c are empty
15.    λcle ← λcle − |{events of cycle c}|
16.    c ← c − 1
17.  Qd ← Qd ∖ {e | λ(e) < λcle}

4.1.5. Time Synchronization

In the fast event delivery approach, another key component is the synchronization of the start and end times of a cycle on event senders and recipients (all the replicas in group g) to minimize the chance of handling late events through consensus. Specifically, let ∆tn be the network latency. Assume that a lower bound and an upper bound of ∆tn, denoted by ∆TnL and ∆TnH, can be estimated such that most ∆tn falls within the range [∆TnL, ∆TnH]. Then, the cycle length ∆t can be determined by ∆t = ∆TnH − ∆TnL.

Let tstart,s be the start time of the first cycle for event sender s designated by the replicas. Also, let tsend,s(n) be the send time of the nth event from s to the replicas. s first calculates tsend,s(1) = tstart,s − ∆TnL. Then, in the nth cycle, it calculates the sending time of the nth event by tsend,s(n) = tsend,s(1) + (n − 1)·∆t. At the receiving end, all replicas are timed to receive the nth event from s at trecv,s(n) = tstart,s + n·∆t. To collect received events in a timely manner, event collection and event delivery can be run by different threads with different buffers (see Appendix A for details).
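The timing rules can be expressed directly; the sketch below uses the formulas above with illustrative values (the 200ms cycle length follows the experiment setup in Section 8, and the 50ms latency bound is an assumption).

DT = 200.0     # cycle length ∆t in ms
DT_NL = 50.0   # assumed lower latency bound ∆TnL in ms

def t_send(t_start: float, n: int) -> float:
    # tsend,s(n) = tsend,s(1) + (n - 1)·∆t, with tsend,s(1) = tstart,s - ∆TnL
    return (t_start - DT_NL) + (n - 1) * DT

def t_recv(t_start: float, n: int) -> float:
    # trecv,s(n) = tstart,s + n·∆t
    return t_start + n * DT

print(t_send(1000.0, 1), t_recv(1000.0, 1))  # 950.0 1200.0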

A time server, such as an NTP server [31], can be deployed in the system to synchronize the clocks of event senders and recipients, which improves the performance of the system.

4.2. Leader Election and Group Reconfiguration

Once the leader fails, a new leader is elected through leader election. Since both group reconfiguration and event handling rely on the group leader, leader election has the highest priority among the three routines. It interrupts any ongoing group reconfiguration or event delivery process. The leader election criterion is replica age, which increases by one after each group reconfiguration. Based on the assumption that node failure rate increases with time, the new leader is the youngest replica in the group.

Group reconfiguration adds new members for fault-tolerance. A group reconfiguration is triggered once the group size drops below n, recovering the group size to n + e. Group reconfiguration has higher priority than event delivery, so that new members can be quickly added to a group. After group reconfiguration, the leader also notifies all senders of the new configuration.

In both leader election and group reconfiguration, the leader will decide the current states, namely, Qd, E, and G, and synchronize them to all replicas so that all replicas will load the same state after a leader election or a group reconfiguration, which is called state synchrony. For new replicas, the application state, the sequence of the last applied event λc, the time of the first cycle t0, the start time tstart,s for each sender s, and the sender set S are also synchronized from the leader for initialization. The detailed algorithms of leader election and group reconfiguration are in Appendix B.

5. Virtual World Interaction

Virtual world interaction describes how users manipulate the state of virtual objects and perceive the state change in a shared simulated environment. Following the definition, a virtual world interaction includes two steps. First, a user modifies the state of an object through operations. Second, the new object state is synchronized to other interested users. This section extends the proposed event delivery approach for supporting interactions in Virtual Net.

5.1. Flow of Events and Updates

An object is replicated on multiple hosts (i.e., clients and mesh computers) if multiple users operate the same object. To facilitate object state consistency management in interaction, the copies of objects are classified into authoritative copies and nonauthoritative copies. Each object can only have one authoritative copy but multiple nonauthoritative copies. An authoritative copy is maintained by one mesh computer, e.g., the object owner’s. The nonauthoritative copies are maintained by the clients and the mesh computer of other interested users for fault-tolerance. Interest management has been intensively studied in [8] and thus is not discussed in this paper. It is only assumed that a user’s interest scope is determined by his/her perception range in a virtual world, as illustrated in Figure 10.

By distinguishing authoritative copies from nonauthoritative copies, object state management is simplified to managing the state of an authoritative copy and synchronizing the updated state from the authoritative copy to nonauthoritative copies. For managing the authoritative copy, since the data is replicated to multiple nodes in a mesh computer, the fast event delivery approach is applied for maintaining the same state among these replicas.

From the perspective of an authoritative copy, an interaction includes receiving the event from one client, handling the event after it is delivered to the application, and multicasting the updated state to all interested hosts. Figure 9 illustrates the flow of events and updated states. The events of an object are only sent from the clients to the mesh computer which maintains the authoritative copy of the object, while updated states are broadcast by the mesh computers to all the nonauthoritative copies. To support interaction, each mesh computer maintains two sets: the event sender set S and the update recipient set U.

5.2. Neighbors

To reduce overhead, each user only communicates with a limited number of peer users, called neighbors. Due to user mobility, a user's neighbors may change frequently. A neighbor join happens when another user enters the perception range of a user. Likewise, a neighbor leave happens when a neighbor moves out of a user's perception range. For neighbor change, the key problem is to determine the same cycle of a neighbor change on all replicas so that they agree on the cycle events. The join/leave cycle can simply be synchronized through consensus. However, high neighbor dynamics would increase the number of consensus instances, resulting in high communication overhead and high interaction latency.

To apply the fast event delivery approach to neighbor change, the connectivity maintenance approach of mutual notification [6] is employed. Specifically, two types of neighbors are introduced:
(1) Perception neighbor set (Np): the set of users and their virtual objects appearing in the perception range of a user.
(2) Connectivity neighbor set (Nc): the set of users logically connected to the user.

Assume each user maintains a set of connectivity neighbors Nc; how to achieve this in a P2P virtual world can be found in [6]. A user (called User i) periodically exchanges its perception neighbor set Np with its connectivity neighbors. Once a connectivity neighbor finds that another user should or should not be in Np, it notifies User i. To facilitate the description, some abstract functions are introduced:
(i) Multicast(e, y, g): event e with sequence number y is sent to all replicas of group g.
(ii) Handle(e): event e, which has been delivered to Qd, is handled by the application.
(iii) Time(t): set the timer to t, which will trigger a timeout event at t.
(iv) EVENT ← c: assign content c to event EVENT.

5.3. Neighbor Join

Suppose User j is one of the connectivity neighbors of User i. When User j discovers that another user, User k, is in the perception range of User i but not in User i's Np, it notifies User i to add the new neighbor with the following procedure. To distinguish clients from mesh computers, let pi be the client of User i and ri be a replica of group gi (i.e., the mesh computer of User i); pj and pk are defined likewise.

Step 1. pj Multicast(ADD_NEIGHBOR (pk, Gk), y, gi).

Step 2. ri Handle(ADD_NEIGHBOR) for cycle c, ∀ri ∈ Gi.
(a) ri modifies S ← S ∪ {pk}, U ← U ∪ {pk} ∪ Gk.
(b) ri calculates tstart,k = trecv,j(y) + n·∆t and trecv,k(1) = tstart,k + ∆t.
(c) ri calculates cycle ck = ⌈(trecv,k(1) − tnow) / ∆t⌉ + c.
(d) ri Time(ck) for receiving the first event from pk.
(e) ri sends (HANDSHAKE (tstart,k, Gi)) to pk.

Step 3. pk adds Gi to the recipient list.

Step 4. pk Calculate tsend,k(1) ← tstart,k  − ∆TnL.

Step 5. pk Time(tsend,k(1)).

Step 6. pk Multicast(e, 0, gi) at tsend,k(1).

Step 7. pk Time(tstart,k + ∆t) for the next event.

Step 8. ri receives EVENT at ck.

Step 9. ri delivers EVENT for ck.

The neighbor join process is illustrated in Figure 10(a). First, the connectivity neighbor pj sends the ADD_NEIGHBOR event to all replicas in gi for adding the new neighbor pk. Then, each replica ri modifies the event sender set and the update recipient set, calculates the first event start time for event sending and receiving, and notifies the new neighbor pk. The client pk is timed to send the first event at trecv,j(y) + n·∆t − ∆TnL (where n, ∆t, and ∆TnL are preconfigured). Meanwhile, the replicas of gi wait for the event of the first cycle ck at trecv,k(1). Since the start time tstart,k and the first event sequence number j0 = 0 are known to both communication ends, they can individually calculate the times of subsequent event sending and receiving. At last, client pi learns the new neighbor through the update from gi and renders it to User i.
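The timing computation of Steps 2(b), 2(c), and 4 can be sketched as follows; the parameter names are illustrative.

import math

def join_schedule(t_recv_j_y, t_now, c, n, dt, dT_nL):
    # Step 2(b): first cycle start and first receive time for the new sender pk.
    t_start_k = t_recv_j_y + n * dt
    t_recv_k1 = t_start_k + dt
    # Step 2(c): the cycle ck at which the first event from pk is expected.
    c_k = math.ceil((t_recv_k1 - t_now) / dt) + c
    # Step 4: pk's first send time, compensating for the latency lower bound.
    t_send_k1 = t_start_k - dT_nL
    return t_start_k, t_recv_k1, c_k, t_send_k1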

5.4. Neighbor Leave

The procedure of neighbor leave is similar to but simpler than the neighbor join procedure. Suppose User j is one of the connectivity neighbors of User i. When User j discovers that another user, User k, is out of the perception range of User i but still in User i's Np, it notifies User i with the following procedure.

Step 1. pj Multicast(RM_NEIGHBOR (k), y, gi).

Step 2. ri Handle(RM_NEIGHBOR) for cycle c, ∀ri ∈ Gi.
(a) ri modifies S ← S ∖ {pk}, U ← U ∖ ({pk} ∪ Gk) for cycle c + 1.

The neighbor leave process is illustrated in Figure 10(b). The connectivity neighbor pj sends the RM_NEIGHBOR event to all replicas in gi for removing the neighbor pk, which then remove pk from the event sender set and pk and Gk from the update recipient set for cycle c + 1. Through the update from gi, client pi then learns the leave of neighbor pk and removes pk from display.

6. Theoretical Verification

The correctness of the state update design is determined by the state of all replicas in a group, as well as the clients. First, the correctness of leader election and group reconfiguration is verified, since these support the other propositions. The proofs of all lemmas and theorems can be found in Appendix C.

Lemma 1 (leader election synchrony). All the live replicas in G maintain the same Qd, E, and G after leader election.

Lemma 2 (group reconfiguration synchrony). All the live replicas in G maintain the same Qd, E, and G after a group reconfiguration.

Next, without loss of generality, the correctness of the consensus protocol is verified for an arbitrary cycle c. The validity and integrity properties [29] are not verified here, since they are not related to the main result and are easy to verify; interested readers can prove them. Here, only the agreement property is verified.

Lemma 3 (consensus agreement). If a live replica ri∈ G delivers an event e to E(c) from a consensus instance for cycle c, then e is eventually delivered to E(c) by all the live replicas.

With the above lemmas, the main result can be obtained. Before presenting it, an important property of the late event handling approach needs to be verified.

Lemma 4 (Ω(s, c) agreement). All the live replicas in G expect to deliver the same set of events Ω(s, c) for sender s ∈ S and cycle c.

Now, the main result of theoretical verification can be presented with the following theorem and corollary.

Theorem 5 (total-order event delivery). If a live replica ri ∈ G delivers two different events e1 and e2 into Qd with delivery sequences λ1 and λ2, then e1 and e2 will eventually be delivered into Qd on all the live replicas, with λ1 and λ2 being two non-negative integers and λ1 ≠ λ2.

Corollary 6 (replica synchronization). All the live replicas in G maintain the same state of their virtual objects.

Another important result is the correctness of garbage collection, which is verified in Theorem 7.

Theorem 7 (garbage collection safety). If event e is removed from Qd on ri ∈ G, then e has been handled by the application on all the live replicas in G.

Based on Theorem 5, the correctness of the neighbor change procedures is shown with the following corollaries.

Corollary 8 (total-order event delivery with sender join). All the live replicas in G deliver the same first event e0 from a neighbor s with the same delivery sequence λ0.

Corollary 9 (total-order event delivery with sender leave). All the live replicas in G deliver the same last event from a neighbor s with the same delivery sequence.

7. Performance Analysis and Comparison

The performance of the proposed fast event delivery approach is studied in terms of synchronization delay and update loss rate. Three alternative approaches are introduced and compared with the proposed approach: the primary-backup approach, the reliable primary-backup approach, and the consensus-based total-order approach.

In the primary-backup approach [11], one replica is the primary and the rest are backups. The primary receives and handles all events and then broadcasts updates to the recipients. Meanwhile, the primary sends the received events to the backups for fault-tolerance. In the reliable primary-backup approach, the primary broadcasts the update only after the events have been reliably synchronized to all backups. Note that the unreliable primary-backup approach does not ensure state consistency in case of primary failure. The consensus-based total-order approach [29] is similar to the proposed design, except that all events are delivered through consensus. Specifically, in each cycle, all replicas propose the events received within the cycle, and the leader decides the event delivery order for the cycle.

Synchronization delay describes the time consumed in synchronizing events over all live replicas. The primary-backup approach has no synchronization delay. In the reliable primary-backup approach, only two communication steps are involved in event synchronization: the primary broadcasts the events to all backups and collects the responses from the backups. The consensus-based total-order approach needs one more communication step, as shown in Figure 8. In the proposed approach, the synchronization delay is weighted by the probability psync of triggering the consensus protocol.

Update loss rate describes the probability that a client does not receive the corresponding update after it sends an event to a mesh computer, due to event loss or update loss. In the primary-backup approach, an update loss occurs whenever the channel between an event sender/update recipient and the primary replica fails. In the consensus-based approach and the proposed approach, an update loss occurs only when no replica receives the event or all replicas fail to send the update to a recipient. Moreover, it is assumed that late and out-of-order events are discarded in all the approaches.

The performance comparison of different approaches is shown in Table 2, in which dc denotes the delay in collecting a message from all replicas, dm denotes the reliable multicast delay, and ploss denotes the probability of message loss on a link.

The comparison result shows that if psync is small, i.e., transmission latency and clock offset are low, the synchronization delay of the proposed approach is small and may even be close to that of the unreliable primary-backup approach. Thus, the proposed approach can opportunistically provide higher responsiveness than the consensus-based total-order approach and the reliable primary-backup approach.

For the update loss rate, ploss^n < ploss for n ≥ 2 and 0 < ploss < 1, which can be proved as follows.

First, let n = 2. Then ploss^2 = ploss · ploss < ploss, since 0 < ploss < 1. Thus, ploss^n < ploss for n = 2.

Second, let f(n) = x^n with x ∈ (0, 1). Then f(n + 1)/f(n) = x < 1. Thus, x^n monotonically decreases with n.

Therefore, ploss^n < ploss for n ≥ 2. This shows that the consensus-based total-order approach and the proposed approach have a lower update loss rate than the primary-backup approaches.
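The inequality can also be checked numerically, as in the following small sketch:

# ploss^n < ploss holds for any 0 < ploss < 1 and n >= 2.
for p in (0.3, 0.5, 0.7):
    for n in (2, 3, 5):
        assert p ** n < p, (p, n)
print("ploss^n < ploss verified for the sampled cases")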

8. Experiments and Results

8.1. Simulation Setup

The proposed model is evaluated by simulating distributed computing. Experiments are run in OMNeT++ to simulate message transmission in a network and event-based programming (simulation code at https://github.com/sunniel/VirtualNetEventHandling). The simulation is run by sending events from 10 clients to a replica group, representing 10 neighbors. The replica group size is configured to 5. The cycle length is set to 200ms. In the experiments, each client sends more than 9,000 events to the replica group, which simulates a half-hour game session with a 200ms user operation interarrival time, applicable to most game genres [32]. After events are sorted and handled, updates are transmitted to clients for collecting the statistical results. If the group size drops below the availability threshold, new replicas are generated by the Rendezvous.

The network traffic model includes packet latency and packet loss. The packet loss rate is varied to simulate different message loss rates ploss. The packet delay is calculated from the one-trip communication delay and network jitter. To facilitate the simulation, network traffic is generated by a generating function fitted to the analytical results of real data. Reference [33] suggests that the one-way delay between two hosts H1 and H2 can be modelled by delay(H1, H2) = Dmin + jitter, where Dmin is the minimum single-trip delay and jitter is the network jitter caused by network congestion. In the experiments, Dmin is configured to 50ms, and jitter is modelled by an exponential distribution and varied to simulate the variation of network latency.
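A sketch of this delay model is given below; it mirrors delay(H1, H2) = Dmin + jitter with exponentially distributed jitter, using the values of the setup above.

import random

D_MIN = 50.0  # minimum single-trip delay Dmin in ms, as configured above

def one_way_delay(mean_jitter_ms: float) -> float:
    # delay(H1, H2) = Dmin + jitter, jitter ~ Exponential(mean = mean_jitter_ms)
    return D_MIN + random.expovariate(1.0 / mean_jitter_ms)

samples = [one_way_delay(50.0) for _ in range(10000)]
print(sum(samples) / len(samples))  # ≈ 100 ms on average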

To simulate replica failure, replica dynamics is characterized by session length, which measures the length of time that a peer is continuously connected to a given P2P network, from its arrival to its departure [34]. The session lengths of P2P applications can be described by different stochastic models. Reference [34] shows that a Weibull distribution or a log-normal distribution fits the observations best. In this study, replica session length is modelled by a Weibull distribution with a mean value of half an hour.
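Session-length sampling under this model can be sketched as follows; the Weibull shape parameter is an assumption, with the scale solved to match the half-hour mean.

import math, random

SHAPE = 1.5                                  # assumed Weibull shape parameter k
MEAN_S = 30 * 60                             # target mean session length: 30 min
SCALE = MEAN_S / math.gamma(1 + 1 / SHAPE)   # solve E[X] = scale * Γ(1 + 1/k)

def session_length_s() -> float:
    return random.weibullvariate(SCALE, SHAPE)

mean = sum(session_length_s() for _ in range(10000)) / 10000
print(mean)  # ≈ 1800 seconds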

8.2. Experiment Results for Overall Performance

To evaluate the overall performance of event handling, three event delivery approaches are implemented and compared: primary-backup, consensus-based total-order, and fast event delivery. Reliable primary-backup is not included in the comparison, since its performance lies between that of the unreliable primary-backup approach and the consensus-based approach.

First, responsiveness is evaluated by comparing the interaction latencies of the different approaches. Interaction latency includes both the round-trip end-to-end delay between an event sender and a group of replicas and the synchronization delay. The mean value of network jitter is fixed to 50ms, and the standard deviation is varied from 50ms to 250ms to simulate the scenario in which events occasionally arrive late and out-of-order. The experiment result in Figure 11 shows that the fast event delivery approach provides much lower latency than the consensus-based total-order approach. Especially when the network latency is small, the responsiveness of the proposed approach is close to that of the primary-backup model. This is because the rate of triggering the consensus protocol decreases when most events arrive before the end of their cycles.

Second, the end-to-end update delivery rate is evaluated by varying the message drop rate to simulate the change of ploss from 0.3 to 0.7. In an asynchronous network, message loss cannot be distinguished from long message delay. Thus, an update delivery timeout, configured to 5 seconds, is used to cover both situations. The mean value of network jitter is fixed to 50ms to eliminate the interference of late events.

Figure 12 shows the update delivery rates of the three approaches. The update delivery rate of the primary-backup approach is much lower than that of the other two approaches. Moreover, it drops quickly from around 0.7 to below 0.3 as ploss increases, showing a rapid increase of the update loss rate. In contrast, the update delivery rates of the consensus-based total-order approach and the proposed approach (overlapped in the figure) remain high. While ploss is below 0.5, the update delivery rate of these two approaches is close to 1. Once ploss exceeds 0.5, the drop in their update delivery rates becomes evident. This is because, as the message drop rate increases, more messages are received by none of the replicas.

In the same experiment, interaction latency is also studied as the message drop rate changes. Figure 13 shows that, with the increase of message drops, a replica has a higher chance of missing cycle events, such that Π(s, c) ≠ Ω(s, c) and more consensus instances are triggered by replicas for event synchronization. This means that increasing the message drop rate has an effect on the interaction latency of the fast event delivery approach similar to that of increasing network jitter.

8.3. Experiment Results for Individual Improvements
8.3.1. Performance of Late Event Handling

The experiment in this section verifies the late event handling approach. The proposed approach is compared with the simple discard approach, which simply discards any late events. In the experiment, the timing of event sending is perturbed with a clock error modelled by a normal distribution (μ = 0). The standard deviation of the clock error is varied from 0 to 400ms to increase the rate of late events. Moreover, the network jitter is fixed to 50ms and the message drop rate ploss is fixed to 0 to eliminate their interference with the result.

Figure 14 shows that the proposed approach has a much higher update delivery rate than the simple discard approach. Especially when the clock error is higher than 300ms, simply discarding late events results in almost no messages being delivered, since most events arrive late. In contrast, with the proposed approach, the update delivery rate can be maintained close to 1. The rate remains slightly below 1 because a few late events do not meet the deliverability condition.

8.3.2. Performance of Garbage Collection

Garbage collection is tested to verify its effectiveness. The main purpose of the experiment is to show that the proposed mechanism can effectively keep the length of Qd from overgrowing. Thus, the experiment is conducted with two settings: one with garbage collection and the other without. The growth of the delivery queue length (Qd) is observed and compared for the two settings. The network jitter is fixed to 50ms and the message drop rate ploss is configured to 0 to remove the interference of event loss in both settings, so that the length of Qd is determined only by the number of events and garbage collection. In the setting with garbage collection, the length of the garbage collection cycle is fixed to 5 seconds. The experiment result is shown in Figure 15. When garbage collection is not applied, the length of Qd quickly increases from several thousand events to tens of thousands of events within 500 seconds. When garbage collection is applied, the length of Qd stays within around 300 events. The comparison shows that the proposed garbage collection protocol can effectively prevent Qd from unlimited growth or even overflow.

Moreover, the cycle length of the gossip protocol is varied to show the protocol's control over the length of Qd. The experiment result is shown in Figure 16. When the cycle length of the gossip protocol increases from 1 second to 10 seconds, the length of Qd grows from around 50 events to around 500 events. This result shows that the length of Qd is approximately linear in the cycle length of the gossip protocol, implying that the length of Qd can be effectively controlled by changing the cycle length. This control is useful because different virtual world applications may have different event sizes. If an application has to cache large events, the length of Qd can be reduced to fit the same event cache space.

8.3.3. Performance of Time Synchronization

The performance improvement through time synchronization is shown in Figures 17 and 18. The main purpose of the experiment is to show that the time synchronization mechanism has a positive effect on interaction latency reduction and garbage collection. Thus, the experiment compares the performance of event handling in two settings: one with time synchronization and the other without. A clock error is added to the timing of event sending to simulate the scale of clock synchronization loss between the event senders and recipients. The network jitter is fixed to 50ms and the message drop rate ploss is fixed to 0 to eliminate their interference with the result.

Figure 17 shows that if the clocks of event senders and event recipients are out-of-sync, interaction latency increases with the clock offset, because clock synchronization loss increases the chance of triggering consensus for event delivery. Note that the interaction latency increase with clock offset is almost linear, clearly showing the impact of synchronization loss. In contrast, if time synchronization is applied before event transmission, the interaction latency stays below 1 second. This is because the number of consensus instances is reduced, and thus the communication steps for event delivery are minimized accordingly.

Figure 18 shows that the length of the delivery queue (Qd) also increases with clock offset. Due to clock synchronization loss, more cycles are delivered with empty events, and thus more empty events accumulate at the tail of Qd. According to the garbage collection rule, trailing rounds with empty events cannot be removed from Qd. Thus, clock synchronization loss weakens the effectiveness of the garbage collection mechanism. Note that when the clock offset exceeds 300ms, the growth of Qd becomes large. In contrast, if time synchronization is applied, the length of Qd does not increase with clock offset.

8.4. Discussion

The experiment results for the overall performance are consistent with the theoretical analysis. Specifically, the proposed fast event delivery approach is reliable and provides opportunistically high responsiveness compared with the consensus-based total-order approach and the primary-backup approach: it achieves the highest update delivery rate and almost the same interaction latency as the primary-backup approach when network latency is low. Thus, the overall performance of the proposed approach is better than that of the two alternatives. The results also imply that, in practice, cache nodes should be selected close to clients, so that most cycle events arrive on time at all replicas and can be delivered without consensus; interaction latency is then minimized.

The experiment results for the individual improvements show their effectiveness. Specifically, the late event handling experiment shows that the proposed dynamic cycle event delivery approach can largely increase the update delivery rate. A high update delivery rate reduces the chance of event resending, which would lower interaction responsiveness and harm user experience. The garbage collection experiment shows its effectiveness in limiting the length of the event delivery queue. This is important because it not only avoids buffer overflow but also bounds the time spent traversing Qd when a specific event must be searched for; traversing a large buffer is slow and reduces system responsiveness. Lastly, the time synchronization evaluation shows its importance: without time synchronization between event senders and recipients, both the advantage of the fast event handling approach and the effect of the garbage collection mechanism diminish.

9. Conclusions

With the popularity of mobile virtual worlds, scalability has become an outstanding challenge in infrastructure development. This paper discusses the possibility of using P2P technology to address the scalability problem. Different from existing P2P virtual worlds, client unreliability raises a new problem in mobile settings. This paper addresses the problem with a new hierarchical P2P computing model. Rather than introducing every detail of the computing model, we focus on object state update to avoid reinventing the wheel. The core problem of state update is maintaining replica state consistency without compromising system responsiveness. To address this problem, a fast event delivery approach is proposed, and, based on this approach, a new virtual world interaction model is introduced to enable interaction between multiple users.

Our work is important in providing a scalable infrastructure for mobile P2P virtual worlds. Based on the proposed Virtual Net architecture, several new research problems arise in building virtual world applications. First, our current approach is still limited in responsiveness, since it belongs to the opportunistic category. To further improve responsiveness without compromising state consistency, we plan to employ conflict-free replicated data types (CRDTs) [35] to replace the consensus approach in event handling. With CRDTs, events can be delivered in any sequence. However, delivering events in different sequences may confuse users with respect to continuous events, such as avatar movement. Thus, the problem can be expected to combine human-computer interaction (HCI) and distributed computing. Moreover, future work includes the application and adaptation of cloud-fog computing techniques for contributed resource management, including cache node allocation, and of P2P virtual world techniques, to provide a complete and practical mobile P2P virtual world solution.

Appendix

A. Fast Event Delivery Protocols

The full set of fast event delivery protocols is described in this appendix, including event collection, event delivery, and consensus. The payload of an event can be an operation of the event sender, Empty, or ⊥ (called an empty event). Note that if a reliable communication channel is required in a function, the keyword Reliably is added before the send or broadcast operation; the implementation of a reliable channel can be found in [29]. Message names are capitalized, and messages may carry parameters. The notations used in the pseudocode are listed in Table 3. In particular, G ∩ R denotes the set of live replicas.
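For readability, the Python sketches in this appendix use the following minimal event representation. The field names are ours; only the three payload kinds (an operation, Empty, or ⊥) follow the protocol description above.

    # Illustrative event representation. BOTTOM stands for the undefined
    # payload (written as ⊥ in the pseudocode) and EMPTY for the explicit
    # Empty payload decided by consensus.
    from dataclasses import dataclass
    from typing import Any

    BOTTOM = object()   # placeholder for a missing (not yet received) event
    EMPTY = object()    # payload decided when no operation was sent for a slot

    @dataclass(frozen=True)
    class Event:
        sender: str     # event sender s
        seq: int        # per-sender sequence number Seq(e)
        payload: Any    # an operation, EMPTY, or BOTTOM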

In each cycle ∆t, the event collection protocol (Algorithm 4) periodically collects the events received from all senders in S, moving them from the receiving buffer Qr to a temporary buffer Qp. As described in the Late Event Handling section, not only the cycle events (i.e., Seq(e) = Seq(s, c), Lines 8-11) but also late and still deliverable events are collected (Lines 12-15). If an expected event has not been received, an empty event (⊥) is assigned to the corresponding event sequence number (Line 10). Likewise, a late event replaces the empty event with the same sequence number in Qp (Lines 14-15). All undeliverable events are discarded on event receipt (Line 3).

1. On replica ri:
2. Upon EVENT e
3.  If Seq(e) > MaxSeq(Sender(e), c - 1), then
4.   Qr ← Qr ∪ {e}
5.
6. Upon cycle TIMEOUT
7.  c ← c + 1
8.  For each s ∈ S,
9.   If e(s, Seq(s, c)) ∉ Qr, then
10.    e(s, Seq(s, c)) ← (s, Seq(s, c), ⊥)
11.   Qp ← Qp ∪ {e(s, Seq(s, c))}
12.  For each c' < c ∧ s ∈ S: e(s, c') ∉ Qd,        // Collect late and deliverable events
13.   If e(s, Seq(s, c')) ∈ Qr, then
14.    Qp ← Qp \ {(s, Seq(s, c'), ⊥)}
15.    Qp ← Qp ∪ {e(s, Seq(s, c'))}
16.    Qr ← Qr \ {e(s, Seq(s, c'))}
17.  Reset Timer cycle ∆t

The event delivery protocol (Algorithm 5) is executed upon detecting that a cycle satisfies the delivery condition. If the events of cycle c - 1 have been delivered, and the events of cycle c have either (1) been collected or (2) been decided by a consensus instance (Line 2), then cycle c satisfies the condition for triggering event delivery. Thus, event delivery executes asynchronously to event collection. For distributed agreement, the protocol first checks the second condition to ensure that a consensus result is applied on all replicas: if the cycle has been decided by a consensus instance, the decided events are delivered regardless of whether any nonempty event has newly arrived for the cycle (Lines 3-4). Otherwise, events are delivered from Qp. If all expected events have been collected (Lines 6-8, 13-14), they are delivered to Qd in the sequence of γ for cycle c. The range [MinSeq(s, c), Seq(s, c)] specifies the deliverable sequence numbers for each sender s and cycle c. The calculation of γ (Line 16) ensures that all replicas deliver concurrent events from different senders in the same sequence. If any collected event is still empty (⊥), a query message is sent to the group leader. The sets {MinSeq(s, c)} and {Seq(s, c)} carried in the query specify, for all senders, the lowest deliverable sequence number and the sequence number of the cycle, respectively.

1. On replica ri:
2. Upon Delivered(c - 1) ∧ (Collected(c) ∨ E(c) ≠ ∅)
   ∧ ¬Consensus(c)
3.  If E(c) ≠ ∅, then
4.   D ← E(c)
5.  Else
6.   For each s ∈ S, j ∈ [MinSeq(s, c), Seq(s, c)],
7.    c' ← c - (Seq(s, c) - j)
8.    T ← T ∪ {(c', e(s, j)) ∣ (c', e(s, j)) ∈ Qp}
9.   If ∃(c', e) ∈ T: Payload(e) = ⊥, then
10.    QUERY ← (c, {MinSeq(s, c) ∣ s ∈ S}, {Seq(s, c) ∣ s ∈ S})
11.    Reliably send QUERY to rL
12.    End the procedure
13.   Else
14.    D ← D ∪ T
15.  For each (c', e) in D,
16.   γ ← γ(Sender(e), Seq(e))          // deterministic order from sender and sequence number
17.   Qd ← Qd ∪ {(c', γ, e)}
18.   c ← c + 1
19.
20. On leader rL:
21. Upon QUERY(c, {MinSeq(s, c)}, {Seq(s, c)}) from ri
22.  For each s ∈ S, j ∈ [MinSeq(s, c), Seq(s, c)],
23.   c' ← c - (Seq(s, c) - j)
24.   R ← R ∪ {(c', γ, e(s, j)) ∣ (c', e(s, j)) ∈ Qp ∪ Qd}
25.  If ∄(c', γ, e) ∈ R: Payload(e) = ⊥, then
26.   QUERY_REPLY ← (c, R)
27.   Reliably send QUERY_REPLY to ri
28.  Else
29.   P ← P ∪ {c}

The leader checks locally, in Qp and Qd, the receipt of the events for the requested cycle. If all expected events (for each sender s and each sequence number in [MinSeq(s, c), Seq(s, c)]) of the requested cycle have been received, the leader replies with them to the requesting replica (Lines 22-27). Otherwise, the leader initializes a new consensus instance for the cycle (Lines 28-29).
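A possible Python reading of the delivery step: events of the same cycle are delivered in a sequence γ derived deterministically from the sender identity and sequence number alone, so every replica orders concurrent events identically. The concrete rule below (sort by (sender, seq)) is our assumption; the protocol only requires that γ depend solely on s and j.

    # Sketch of deterministic intra-cycle ordering. Any rule that depends only
    # on (sender, seq) works, because all replicas then agree on the sequence.
    BOTTOM = object()

    def deliver_cycle(qd, cycle_events):
        """cycle_events: list of (sender, seq, payload) collected for cycle c.
           Returns True if delivered, False if a consensus query is needed."""
        if any(p is BOTTOM for (_, _, p) in cycle_events):
            return False                    # some event missing: query leader
        for gamma, (s, j, p) in enumerate(sorted(cycle_events,
                                                 key=lambda e: (e[0], e[1]))):
            qd.append((gamma, s, j, p))     # same gamma on every replica
        return True

    qd = []
    print(deliver_cycle(qd, [("s2", 7, "opB"), ("s1", 3, "opA")]))   # True
    print(qd)   # [(0, 's1', 3, 'opA'), (1, 's2', 7, 'opB')] on every replica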

The consensus protocol (Algorithm 6) is instantiated and run for each requested cycle c. Note that the consensus protocol is executed only when there is no ongoing leader election (LE) or group reconfiguration (GR). Also, a message from a previous leader or a previous group configuration is not processed, for consistency. These preconditions are therefore added to all message handling procedures in the consensus protocol (Lines 8, 11, 19, and 26). First, the flags LE and GR are checked to ensure that any previous leader election or group reconfiguration has finished. Then, the sender's epoch and configuration ID, attached to each message, are compared with the local epoch and configuration ID.
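As a minimal illustration (our own helper, not part of the paper's protocol), the precondition can be factored into a single guard evaluated at the top of each message handler before the pseudocode below runs:

    # Sketch of the message-handling precondition: drop messages from a
    # previous leader (stale epoch) or a previous group configuration (stale
    # cid), and defer handling while a leader election (LE) or group
    # reconfiguration (GR) is in progress.
    def may_handle(msg_epoch, msg_cid, epoch, cid, le_running, gr_running):
        return (msg_epoch == epoch and msg_cid == cid
                and not le_running and not gr_running)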

1. On leader rL:
2. Upon P ≠ ∅ ∧ GR = false ∧ LE = false
3.  For each c ∈ P: c ∉ Z,                 // Z: cycles with an ongoing consensus instance
4.   Z ← Z ∪ {c}
5.   QUERY ← (epoch, cid, c)
6.   Reliably send QUERY to G ∩ R
7.
8. Upon QUERY_RESULT(epoch', cid', c, Wi) from ri ∧ epoch' = epoch
   ∧ cid' = cid ∧ GR = false ∧ LE = false
9.  Q ← Q ∪ {Wi}
10.
11. Upon G ∩ R ⊆ {ri ∣ Wi ∈ Q} ∧ GR = false ∧ LE = false
12.  W' ← Decide(c, Q)
13.  E(c) ← W'
14.  P ← P \ {c}
15.  DECISION ← (epoch, cid, c, W')
16.  Reliably broadcast DECISION to G ∩ R
17.
18. On any replica ri:
19. Upon QUERY(epoch', cid', c) from rL ∧ epoch' = epoch
    ∧ cid' = cid ∧ GR = false ∧ LE = false
20.  For each s ∈ S, j ∈ [MinSeq(s, c), Seq(s, c)],
21.   c' ← c - (Seq(s, c) - j)
22.   W ← W ∪ {(c', e(s, j)) ∣ (c', e(s, j)) ∈ Qp ∪ Qd}
23.  QUERY_RESULT ← (epoch, cid, c, W)
24.  Reliably send QUERY_RESULT to rL
25.
26. Upon DECISION(epoch', cid', c, W') from rL ∧ epoch' = epoch
    ∧ cid' = cid ∧ GR = false ∧ LE = false
27.  E(c) ← E(c) ∪ W'
28.
29. Decide(c, Q)
30.  For each s ∈ S and j ∈ [MinSeq(s, c), Seq(s, c)],
31.   If ∃e(s, j) ∈ Q: e(s, j) ≠ ⊥, then
32.    W' ← W' ∪ {e(s, j)}
33.   Else
34.    e ← (s, j, Empty)
35.    W' ← W' ∪ {e}
36.  Return W'

The consensus protocol is described in detail in the Total-Order Event Delivery section. A replica replies to the leader's query for the events of cycle c only when it has passed the event collection of cycle c, which requires that (1) the decided events for cycle c - 1, if any, have been delivered, and (2) the event collection for cycle c has been completed. The leader decides the events for c only after proposals are received from all live replicas (Line 11). The Decide function determines the events of a given cycle for each event sender (Lines 29-36) from the sets of events received for c from all replicas. For each sender s and event sequence number j, if all replicas propose ⊥, then the payload of the event e(s, j) is decided as Empty (Lines 33-35). Otherwise, the event payload is decided with the value of a nonempty proposal from any replica (Lines 31-32).
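The Decide rule can be phrased compactly in Python. This sketch assumes each replica's proposal is a mapping from (sender, seq) to a payload, with BOTTOM standing for an event the replica never received; the data shapes are ours.

    # Sketch of Decide(c, Q): for each expected slot, keep any nonempty
    # proposal; if every replica proposed BOTTOM, decide the slot as EMPTY.
    BOTTOM, EMPTY = object(), object()

    def decide(slots, proposals):
        """slots: iterable of (sender, seq) expected for the cycle
           proposals: list of dicts, one per replica, (sender, seq) -> payload"""
        decided = {}
        for slot in slots:
            values = [p.get(slot, BOTTOM) for p in proposals]
            nonempty = [v for v in values if v is not BOTTOM]
            decided[slot] = nonempty[0] if nonempty else EMPTY
        return decided

    slots = [("s1", 3), ("s2", 7)]
    props = [{("s1", 3): BOTTOM, ("s2", 7): "opB"},
             {("s1", 3): BOTTOM, ("s2", 7): BOTTOM}]
    d = decide(slots, props)
    print(d[("s1", 3)] is EMPTY, d[("s2", 7)])   # True opB

Taking any nonempty proposal is safe here because a nonempty event carries the sender's operation, which is identical in every replica's copy.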

B. Leader Election and Group Reconfiguration Protocols

The notations used in the leader election protocol and the group reconfiguration protocol follow the same conventions listed in Table 3.

The leader election protocol (Algorithm 7) is triggered once the leader is no longer in the set of live replicas (Line 2). Each triggered replica checks whether it satisfies the condition to be the candidate by calling the SelectLeader function (Lines 8-13). As described in the Leader Election and Group Reconfiguration section, the candidate is the replica with the smallest age; if multiple candidates have the same age, the one with the smallest ID is selected.

1. On any replica:
2. Upon rL ∉ R ∧ rc ≠ self
3.  LE ← true
4.  rc ← SelectLeader()
5.  If rc = self, then
6.   Reliably broadcast LE_QUERY to G ∩ R
7.
8. SelectLeader()
9.  Rc := {ri ∣ ri ∈ G ∩ R ∧ ri.age = min{rj.age ∣ rj ∈ G ∩ R}}
10.  If |Rc| = 1, then
11.   Return ri: ri ∈ Rc
12.  Else
13.   Return ri: ri ∈ Rc ∧ ri.ID = min{rj.ID ∣ rj ∈ Rc}
14.
15.
16. Upon LE_QUERY from rc
17.  If rc ∈ R ∧ rc = SelectLeader(), then
18.   LE ← true
19.   LE_STATE ← (Qd, E, cid, G, epoch)
20.   Reliably send LE_STATE to rc
21.  Else
22.   Reliably send NACK to rc
23.
24. Upon LOAD_LEADER(Qd', E', epoch', cid', G', Init)
    from rc ∧ rc ∈ R ∧ rc = SelectLeader()
25.  (Qd, E, cid, G, epoch) ← (Qd', E', cid', G', epoch' + 1)
26.  (rL, rc) ← (rc, ⊥)
27.  LE ← false
28.  If newReplica = true, then         // New replica initialization is needed,
29.   Initialize(Init)              // in case a group reconfiguration is
30.   newReplica ← false            // interrupted by a leader election
31.
32.
33. Upon NACK from ri
34.  If self = SelectLeader(), then
35.   Reliably send LE_QUERY to ri
36.  Else
37.   rc ← ⊥
38.
39. Upon LE_STATE(Qd,i, Ei, cidi, Gi, epochi) from ri
40.  Events ← Events ∪ {Qd,i}
41.  Decisions ← Decisions ∪ {Ei}
42.  Configs ← Configs ∪ {(cidi, Gi)}
43.  Epochs ← Epochs ∪ {epochi}
44.  Senders ← Senders ∪ {ri}
45.
46. Upon G ∩ R ⊆ Senders
47.  Qd ← Longest(Events)
48.  E ← Merge(Decisions)
49.  (cid, G) ← Latest(Configs)
50.  epoch ← Latest(Epochs)
51.  Init ← (t0, {trecv,s(1) ∣ s ∈ S}, (λc, state),   // state: current application
    {r.age ∣ r ∈ G}, S)                  // state
52.  LOAD_LEADER ← (Qd, E, epoch, cid, G, Init)
53.  Reliably broadcast LOAD_LEADER to G ∩ R
54.  Broadcast G to S

To achieve state synchrony, the candidate rc sends the state query message (LE_QUERY) to all live replicas. If a replica has not learned of the candidate, or has learned of a newer candidate, it rejects the request from rc by replying with a NACK message (Lines 17, 21-22). Otherwise, the replica replies to the state query with its state Qd, E, epoch, and configuration (cid and G). On receiving the states from all live replicas, rc decides the latest consistent state (Lines 47-53) with the following functions:
(i) Longest(): selects the longest Qd among all replicas.
(ii) Merge(): returns the union of the decision sets from all replicas for all cycles.
(iii) Latest(Configs): returns the largest configuration ID cid and the corresponding replica set G, which represent the latest configuration seen by the group.
(iv) Latest(Epochs): returns the largest epoch, which represents the latest leader election seen by the group.

Moreover, in case of any unfinished group reconfiguration, additional state (including the time of the first cycle t0, the start times of the first events from all senders, the current application state with the corresponding delivered event sequence λc, the ages of all replicas, and the sender set S) is synchronized from rc to new replicas for state initialization. After receiving the LOAD_LEADER message from rc, the replicas update their state to the decided values. Finally, all replicas adopt rc as the new leader and increase the epoch by one.
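The candidate's reconciliation step admits a direct translation. The following sketch implements Longest, Merge, Latest, and SelectLeader as described above; the data shapes (lists, dicts, (cid, G) tuples) are our own.

    # Sketch of the candidate's state reconciliation after collecting LE_STATE
    # replies from all live replicas. Data shapes are illustrative.

    def longest(qds):                 # Longest(): pick the longest Qd
        return max(qds, key=len)

    def merge(decisions):             # Merge(): union of decision sets per cycle
        merged = {}
        for d in decisions:
            for cycle, events in d.items():
                merged.setdefault(cycle, set()).update(events)
        return merged

    def latest(configs):              # Latest(): config with the largest cid
        return max(configs, key=lambda cfg: cfg[0])   # cfg = (cid, G)

    def select_leader(replicas):      # smallest age, ties broken by smallest ID
        return min(replicas, key=lambda r: (r["age"], r["id"]))

    print(select_leader([{"id": 3, "age": 2}, {"id": 1, "age": 2},
                         {"id": 2, "age": 5}]))   # -> replica with id 1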

The group reconfiguration protocol (Algorithm 8) is similar to the leader election protocol, except that it has lower priority, which is reflected by the precondition checking the flag LE in all message handling procedures (Lines 11, 19, and 30). Group reconfiguration is triggered when new replicas are added to the survival set (i.e., R \ G ≠ ∅). GT caches the latest triggered reconfiguration to preclude unnecessary retriggering (Lines 2-3). At the end of the reconfiguration, each replica increases the age of all live replicas by one (Lines 26-27).

1. On any replica:
2. Upon R \ G ≠ ∅ ∧ R ≠ GT ∧ LE = false
3.  GT ← R
4.  GR ← true
5.  If rL = self, then
6.   cid ← cid + 1
7.   GR_QUERY ← (epoch, cid)
8.   Reliably broadcast GR_QUERY to GT ∩ R
9.
10.
11. Upon GR_QUERY(epoch', cid') from rL ∧ epoch' ≥ epoch
    ∧ cid' > cid ∧ LE = false
                        // Use cid to discard messages from a previous unfinished GR;
                        // epoch' ≥ epoch: for new members;
                        // cid' > cid: because the new cid has not been received
12.  GR ← true
13.  cid ← cid'
14.  If epoch = 0, then
15.   epoch ← epoch'
16.  GR_STATE ← (Qd, E, cid, epoch)
17.  Reliably send GR_STATE to rL
18.
19. Upon LOAD_CONFIG(Qd', E', epoch', cid', GT, Init)
    from rL ∧ epoch' = epoch ∧ cid' = cid ∧ LE = false
20.  (Qd, E) ← (Qd', E')
21.  G ← GT
22.  GR ← false
23.  If newReplica = true, then
24.   Initialize(Init)
25.   newReplica ← false
26.  For each r ∈ G ∩ R,
27.   r.age ← r.age + 1
28.
29.
30. Upon GR_STATE(Qd,i, Ei, cidi, epochi) from ri ∧ epochi
    = epoch ∧ cidi = cid ∧ LE = false
31.  Events ← Events ∪ {Qd,i}
32.  Decisions ← Decisions ∪ {Ei}
33.  Senders ← Senders ∪ {ri}
34.
35. Upon GT ∩ R ⊆ Senders
36.  Qd ← Longest(Events)
37.  E ← Merge(Decisions)
38.  Init ← (t0, {trecv,s(1) ∣ s ∈ S}, (λc, state),   // state: current application state
    {r.age ∣ r ∈ G}, S)
39.  LOAD_CONFIG ← (Qd, E, epoch, cid, GT, Init)
40.  Reliably broadcast LOAD_CONFIG to GT ∩ R
41.  Broadcast GT to S
42.  GT ← ∅

C. Proposition Proofs

See Lemma 1.

Proof. First, only one leader will eventually be elected by all the live replicas. This can be shown in two cases. In the first case, the group is not partitioned: all the live replicas know each other, and the SelectLeader function ensures that only one leader is elected by all live replicas. In the second case, the group is partitioned. Without loss of generality, suppose there are two different leaders, denoted rL,1 and rL,2, where rL,1 is elected by replica set P and rL,2 by replica set Q, with rL,1 ∉ Q, rL,2 ∉ P, and P = G \ Q. Following the partial synchrony assumption, if the replicas in P never learn of Q and vice versa, then either P or Q is removed by the Rendezvous of the group. Since, by assumption, there is only one Rendezvous for each replica group, only one partition, either P or Q, will eventually survive. Thus, eventually there is only one leader, either rL,1 or rL,2.
When a new leader is elected by all replicas, it determines Qd, E, and G and broadcasts them to all the live replicas. Through the reliable underlying channel, all replicas will eventually load the same Qd, E, and G after leader election. Moreover, a monotonically increasing epoch number prevents a replica from loading state from an old leader. Thus, all live replicas will eventually load the same Qd, E, and G after the leader election with the largest epoch.

See Lemma 2.

Proof. The proof of group reconfiguration synchrony is the same as that of leader election synchrony and is therefore not repeated here.

See Lemma 3.

Proof. If there is a leader election or a group reconfiguration before the consensus instance terminates, then Lemmas 1 and 2 ensure that all replicas will have e in E(c). If there is no leader election or group reconfiguration before the consensus instance terminates, the reliable underlying communication channel ensures that all the live replicas will eventually receive the same decision from the leader. Since ri has delivered e into E(c), e is in the decision for cycle c. Therefore, e will be eventually received and delivered by all live replicas.

With the above lemmas, the main result can be obtained. Before that, an important property of the late event handling approach needs to be verified.

See Lemma 4.

Proof. The lemma can be proved by induction.
Basis Step. When c = c0, i.e., the cycle in which the first event from s is received, as determined by trecv,s(1), Ω(s, c0) is computed only from trecv,s(1) and is therefore the same on all the live replicas.
Induction Step. Assume all the live replicas in G expect to deliver the same set of events Ω(s, ck) for sender s ∈ S and cycle ck (ck ≥ c0). For cycle ck + 1, there are two cases. (1) If there is no consensus instance for cycle ck, then MaxSeq(s, ck) = Seq(s, ck), and Ω(s, ck + 1) is the same on all replicas. (2) If there is a consensus instance for cycle ck, then, following Lemma 3, all live replicas will eventually deliver the same events to E(ck). Let Seq(s, j) be the maximal sequence number of the nonempty events in E(ck). Then MaxSeq(s, ck) = Seq(s, j), and Ω(s, ck + 1) is the same on all the live replicas. By the principle of mathematical induction, the lemma holds for all cycles after c0.

See Theorem 5.

Proof. Since all replicas share the same sender set S, Lemmas 3 and 4 ensure that all the live replicas will eventually deliver the same set of events for any cycle, either directly from received events (Lines 5-14 of Algorithm 1) or from the consensus result (Lines 2-3 of Algorithm 1).
Let e1(s1, j1) and e2(s2, j2) be delivered on r for cycles c1 and c2, respectively. If c1 = c2 = c, then (c, γ1, e1) and (c, γ2, e2) will eventually be delivered into Qd of all replicas. If c1 ≠ c2, then (c1, γ1, e1) and (c2, γ2, e2) will eventually be delivered into Qd of all replicas. Moreover, since γ1 and γ2 are determined only by s1, s2, j1, and j2, γ1 ≠ γ2 for different e1 and e2. Since Qd is linearly ordered first by c and then by γ, there exists a mapping from each unique (c, γ) to a unique nonnegative integer λ; let φ(c, γ) = λ be such a mapping. Let φ(c1, γ1) = λ1 and φ(c2, γ2) = λ2. Then, all replicas will eventually deliver (λ1, e1) and (λ2, e2) with λ1 ≠ λ2.
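Concretely, φ is just the rank of (c, γ) in the lexicographic order of Qd, which the following small Python fragment (ours, with assumed data shapes) illustrates:

    # Sketch: lambda is the position of (c, gamma) in the lexicographically
    # ordered delivery queue, so distinct (c, gamma) pairs get distinct lambdas.
    entries = [(1, 0), (1, 1), (2, 0), (3, 0)]        # (cycle, gamma) pairs
    phi = {cg: lam for lam, cg in enumerate(sorted(entries))}
    print(phi[(2, 0)])   # -> 2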

See Corollary 6.

Proof. Corollary 6 follows directly from Theorem 5.

See Theorem 7.

Proof. Theorem 5 ensures that if e is in Qd of ri, then e is or was in Qd of all the live replicas in G with the same λ. In Algorithm 3, if e can be removed from ri, then ri must have received, from all the live replicas, λc values at least equal to λ. Since events are delivered to the application in sequence, e must have been delivered to the application on all replicas.

See Corollary 8.

Proof. Theorem 5 ensures that the ADD_NEIGHBOR event is delivered to Qd of all replicas with the same λ. Since (λ, e) and (c, γ, e) have a one-to-one mapping for the same event, all replicas deliver ADD_NEIGHBOR in the same cycle. Moreover, since n and ∆t are fixed, all replicas are timed to deliver the first event e0 from s in the same future cycle ck. Theorem 5 ensures that e0 is delivered with the same delivery sequence λ0 on all live replicas.

See Corollary 9.

Proof. Theorem 5 ensures that the RM_NEIGHBOR event is delivered to Qd of all replicas with the same λ. Since (λ, e) and (c, γ, e) have a one-to-one mapping for the same event, all replicas handle RM_NEIGHBOR in the same cycle c. From cycle c + 1, s is removed from S. Thus, all replicas deliver the last event of s at c, and Theorem 5 ensures that this last event is delivered with the same delivery sequence on all live replicas.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was partially supported by University of Macau Research Grants Nos. MYRG2017-00091-FST and MYRG2015-00043-FST.