INTRODUCTION
Nowadays, organizations critically need to supply recent data to
users who may be geographically remote and to handle a volume of
requests of data distributed around multiple sites. One way to provide
access to such data is through replication. It is broadly installed in
disaster tolerance systems to replicate data from the primary system to
the remote backup system dynamically and online (Ren et al., 2003).
Replication provides user with fast, local access to shared data and
protects availability of applications because alternate data access
options exist (Dastgheib, 2010). Distributed database replication
involves the process of copying and maintaining database objects in
multiple databases that make up a Distributed Database Systems (DDS)
(Ren et al., 2003). Handling fragmented database replication becomes
challenging issue to administrator since the distributed database is
scattered into split replica partitions or fragments. Each partition or
fragment of a distributed database may be replicated into several
different sites in distributed environment. Changes applied at one site
are captured and stored locally before being forwarded and applied at
each of the remote locations. Fragmentation in distributed database is
very useful in terms of usage, efficiency, parallelism and also for
security. This strategy will partition the database into disjoint
fragments. If data items are located at the site where they used most
frequently, locality of reference is high. In fragmentations, similarly,
reliability and availability are low Distributed Database, 2011. But by
combining fragmentation with replication, performance should be good
Distributed Database, 2011. Even if one site becomes unavailable, users
can continue to query or even update the remaining fragments.
Data replication can be divided into three categories of fragmented
replication scheme which are all-data-to-all-sites,
some-data-to-all-sites and some-data-to-some-sites. The examples of
all-data-to-all-sites protocols are Read-One-Write-All (ROWA) (Ahmad et
al., 2010a; Deris et al., 2009) and Hierarchical Replication Scheme
(HRS) (Perez et al., 2010). ROWA has been proposed preserving replicated
data file in network environment (Ahmed et al., 2010a; 2010b).
Meanwhile, replication in HRS starts when a transaction initiates at
site 1. All the data will be replicate into other site. All sites will
have all the same data. For some-data-to-all-sites category, The
Majority Quorum protocol and Weighted Voting protocol employ voting to
decide the quorums techniques (Choi and Youn, 2010). A tree structure
has been assigned to the set of replicas in this technique. The replicas
are positioned only in the leaves, whereas the non-leaf nodes of the
tree are regarded as "logical replicas ", which in a way
summarize the state of their descendants (Storm and Theel, 2009).
Besides Voting Protocol, Tree Quorum (TQ) (Choi and Youn, 2010) can also
be categorized in some-data-to-all-sites. These replication protocols
make use of a logical tree structure. The cost and availability vary
according to the failure condition, whereas they are constant for other
replication protocols (11). One more protocol in this category is Branch
replication scheme (Perez et al., 2010). Its goals are to increase the
scalability, performance and fault tolerance. Replicas are created as
close as possible to the clients that request the data files. Using this
technique, the growing of the replica tree is driven by client needs.
Binary Vote Assignment on Data Grid (BVADG) (Ahmad et al., 2010b) is one
of the protocols in some-data-to-some-sites protocol. A data will
replicate to the neighboring sites from its primary site. Four sites on
the corners of the grid have only two adjacent sites and other sites on
the boundaries have only three neighbors. Thus, the number of neighbors
of each sites is less than or equal to four.
Research background: Data replication: Replication is the process
of sharing information to ensure consistency between redundant resources
such as software or hardware components. This process helps to improve
reliability, fault-tolerance, or accessibility of data (Gudiu et al.,
2010; Connolly and Begg, 1998). Data replication may occur if the same
data is stored in multiple storage devices. Meanwhile, computation
replication occurs when the same computing task is executed many times.
A computational task is typically replicated in space, i.e., executed on
separate devices, or it could be replicated in time, if it is executed
repeatedly on a single device. Whether one replicates data or
computation, the objective is to have some group of processes that
handle incoming events. If we replicate data, these processes are
passive and operate only to maintain the stored data, reply to read
requests and apply updates. When we replicate computation, the usual
goal is to provide fault-tolerance. For example, a replicated service
might be used to control a telephone switch, with the objective of
ensuring that even if the primary controller fails, the backup can take
over its functions (Storm and Theel, 2009).
Distributed database fragmentation: Fragmentation in distributed
database is very useful in terms of usage because usually, applications
study with only some of relations rather than entire of it (Connolly and
Begg, 1998). In data distribution, it is better to study with subsets of
relations as the unit of distribution. The other benefit from
fragmentation is the efficiency. Data is stored close to where it is
most frequently used and for data that is not needed, it is not stored.
By using fragmentation, a transaction can be divided into several
subqueries that operate on fragments. So, it will increase the degrees
of parallelism. Besides, it also good for security as data not required
for local applications is not stored. So, it will not available to
unauthorized users. There are two main types of fragmentation which are
horizontal and vertical. Horizontal fragments are subsets of tuples,
whereas vertical fragments are subsets of attributes. Figure 1a and b
show the horizontal and vertical fragmentations.
Horizontal fragmentation: Horizontal fragmentation groups together
the tuples in a relation that are used by the important transactions
(Atlas at the University of Chicago, 2011). A horizontal fragment is
produced by specifying a predicate that performs a restriction on the
tuples in the relation. It is defined using the Selection operation of
the relational algebra. Given a relation R, a horizontal fragment is
defined as:
[sigma]p (R)
where, p is a predicate based on one or more attributes of the
relation.
Vertical fragmentation: Vertical fragmentation groups together the
attributes in a relation that are used jointly by the important
transactions (Atlas at the University of Chicago, 2011). A vertical
fragment is defined using the Projection operation of the relational
algebra. Given a relation R, a vertical fragmentation is defined as:
IIa1, ..., an (R) where, a1, ...,an are attributes of the relation
R.
[FIGURE 1 OMITTED]
MATERIALS AND METHODS
Binary Vote Assignment Grid Quorum (BVAGQ) technique will be used
to approach the research. In BVAGQ, all sites are logically organized in
form of two-dimensional grid structure. Each site has a premier data
file. A site is either operational or failed and the state (operational
or failed) of each site is statistically independent to the others. A
data will replicate to the neighboring sites from its primary site.
Consider a case of 9 sites logically organized in 3 x 3 two-dimensional
grid structures. Four sites on the corners of the grid have only two
adjacent sites and other sites on the boundaries have only three
neighbors. Thus, the number of neighbors of each sites is less than or
equal to 4. In Fig. 2, data from site 1 will replicate to site 2 and 4
which are its neighbors. Site 5 has four neighbors, which are sites 2,
4, 6 and 8. So, site 5 has five replicas. Meanwhile, site 6 replicates
to site 3, 5 and 9.
Definition:
* V is a transaction
* S is relation in database
* [S.sub.i] is vertical fragmented relation derived from S, where i
= 1,2, ...,n
* PK is a primary key
* x is an instant in T which will be modified by element of V
* T is a tuple in fragmented S
* [[S.sub.[PK.sub.xx]].sup.i] is a horizontal fragmentation
relation derived from [S.sub.i]
* [P.sub.i] is an attribute in S where i = 1, 2,..., n
* [M.sub.i, j] is an instant in relation S where i and j =
1,2,...,n
* i represent a row in S
* j represent a column in S
* [eta] and [psi] are groups for the transaction V
* [gamma] = a or b where it represents different group for the
transaction V (before and until get quorum)
* V[eta] is a set of transactions that comes before V[psi]
* While V[psi] is a set of transactions that comes after V[eta]
* D is the union of all data objects managed by all transactions V
of BVAG
* Target set = {-1, 0, 1} is the result of transaction V; where -1
represents unknown status, 0 represents no failure and 1 represents
accessing failure
* BVAG transaction elements V[eta]h= {[V.sub.[eta]x,qr] |
r=1,2,...,k} where [V.sub.[eta]x,qr] is a queued element of Vh
transaction
* BVAG transaction elements V[psi] = {[V.sub. [psi]x,qr]|
r=1,2,...,k} where [V.sub. [psi]x,qr] is a queued element of V[psi]y
transaction
[FIGURE 2 OMITTED]
[FIGURE 3 OMITTED]
* BVAG transaction elements V[lambda] = {[V.sub.[lambda]x,qr]|
r=1,2,...,k} where [V.sub.[lambda]x,qr] is a queued element either in
different set of transactions [V.sub.[eta]] or [V.sub.[psi]]
* [V.sub.[lambda]x,q1] is a transaction that is transformed from
[V.sub.[lambda]x,qr]. [V.sub.ux,q1] represents the transaction feedback
from A neighbor site. [V.sub.ux,q1] exists if either
[V.sub.[lambda]x,qr] or [V.sub.[lambda]x,q1] exists
* Successful transaction at primary site [V.sub.[lambda]x,qr] = 0
* Where [V.sub.[lambda]x,qr] [member of] D (i.e., the transaction
locked an instant x at primary). Meanwhile, successful transaction at
neighbor site [V.sub.([u.sub.x,q1])] = 0, where ux, q1[epsilon] D
(i.e.,, the transaction locked a data x at neighbor)
RESULTS
To make it clearer on how we manage To make it clearer on how we
manage the transaction using BVAGQ, here we present the example case.
Each node is connected to one another through an Ethernet switch hub. A
cluster with 3 replication servers connected to each as shown in Fig. 3.
Using BVAG rules, each primary replica will copy database x to its
neighbor replicas. Client can access database x at any server that has
its replica. We assume that primary database a located in Server 1,
primary database b will be at Server 2 and so on. Based on BVAGQ model,
a for [V.sub.[lambda]a,q1] will be any instant a, b, c, d, e, f, g, h
and i.
DISCUSSION
For the first experiment, consider V[[lambda].sub.a,q1], lambda] =
[eta] request to update data a at server 1. The first request that get
lock which is V[[lambda].sub.a,q1] will proceed with the transaction and
V[[lambda].sub.a], qr+1,... V[[lambda].sub.a], qk aborted as shown in
Table 1. V[[lambda].sub.a,q1] is the write counter for
V[[lambda].sub.a,q1] that increases when it gets a lock. Next, the
V[[lambda].sub.a,q1] fragmented into [S.sub.2] and [S.sub.2] is
fragmented into [[S.sub.[PK.ssub.xx]].sup.2] Based on the primary key of
the fragmented tuple; instant a will be updated. After finish update,
the transaction will commit.
For second transaction, if two sets of transactions,
V[[lambda].sub.a,q1], [lambda] = [eta] and V[[lambda].sub.a,q1],
l[lambda] = [psi] initiates to update database a at replica 1,
transaction V[[lambda].sub.a,q1], [lambda] = [psi] will abort.
Transaction V[[lambda].sub.a,q1], [lambda] = [psi] is aborted because we
already fix the system will choose the first transaction that make
request based on timestamps. After identify which transaction will be
executed, we will fragmented the database using horizontal and vertical
fragmentation to get the instant that we want to update. From Table 2,
we can see that [V.sub.[lambda]a,q1] precede the transaction execution.
[V.sub.[lambda]a,q1] fragmented into [S.sub.2] and again [S.sub.2] is
fragmented into [[S.sub.[PK.sub.xx]].sup.2]. Instant a will be update.
After that, all replica will commit and unlock.
CONCLUSION
Handling fragmented database replication is very important in order
to preserve the data availability, consistency and reliability of the
systems. Therefore, a new Binary Vote Assignment on Grid Quorum
technique has been proposed to maintain and manage the fragmented
database replication. From the experiment result, it shows that the
system preserves the data consistency through the synchronization
approach for all replicated sites. Furthermore, it guarantees the
consistency since the transaction execution is obeyed the
one-copy-serializability.
ACKNOWLEDGEMENT
Appreciation conveyed to Ministry of Higher Education Malaysia for
supporting this project under Fundamental Research Grant Scheme,
RDU100109.
REFERENCES
Ahmad, N. A.A.C. Fauzi, N.M. Zin and A.H. Beg, 2010a. Lowest data
replication storage of binary vote assignment data grid. Proceeding of
the 2nd International Conference on 'Networked Digital
Technologies' (NDT 2010). Springer-Verlag Berlin Heidelberg, pp:
466-473.
Ahmad, N., R.M. Sidek, M.F.J. Klaib and T.L. Jayan, 2010b. A novel
algorithm of managing replication and transaction through
Read-One-Write-All Monitoring Synchronization Transaction System
(ROWA-MSTS). Proceedings of the 2nd International Conference on Network
Applications Protocols and Services, Sept. 22-23, IEEE Xplore Press,
Kedah, pp: 20-25. DOI:10.1109/NETAPPS.2010.11
Ahmed, A., A.N. Abdalla and R.M. Sidek, 2010a. Data replication
using read-one-write-all monitoring synchronization transaction system
in distributed environment. J. Comput. Sci., 6: 1095-1098. DOI:
10.3844/jcssp.2010.1095.1098
Ahmed, N., R.M. Sidek and M.F.J. Klaib, 2010b. Development of
ROWA-MSTS in distributed systems environment. Proceedings of the 2nd
International Conference on Computer Research and Development, May 7-10,
IEEE Xplore Press, Kuala Lumpur, pp: 868-871. DOI: 10.1109/ICCRD.2010.70
Choi, S.C. and H.Y. Youn, 2010. Dynamic hybrid replication
effectively combining tree and grid topology. J. Supercomput., 1-23.
DOI: 10.1007/s11227-010-0536-6
Connolly, T. and C. Begg, 1998. Database System A Practical
Approach to Design, Implementation and Management (International
Computer Science Series). 2nd Edn, Addison-Wesley Publishing Company,
ISBN: 10: 0201342871, pp: 848.
Dastgheib, A.M., 2010. Introducing a suggestive dynamic data
replication algorithm. Proceedings of the International Conference
Electronic Computer Technology, May 7-10, IEEE Xplore Press, Kuala
Lumpur, pp: 97-101. DOI: 10.1109/ICECTECH.2010.5479980
Deris, M.M., J.H. Abawajy, D. Taniar and A. Mamat, 2009. Managing
data using neighbour replication on a triangular-grid structure. Int. J.
High Perform. Comput. Network., 6: 56-65. DOI:
10.1504/IJHPCN.2009.026292
Gudiu, A., E. Voi an and F. Dragan, 2010. Database replication
driven communication model for distributed dedicated web hosting
systems. Proceeding of the International Joint Conference on
Computational Cybernetics and Technical Informatics, May 27-29, IEEE
Xplore Press, Timisoara, pp: 311-314. DOI: 10.1109/ICCCYB.2010.5491260
Perez, J.M., F. Garcia-Carballeira, J. Carretero, A. Calderon and
J. Fernandez, 2010. Branch replication scheme: A new model for data
replication in large scale data grids. Future Generation Comput. Syst.,
26: 12-20. DOI: 10.1016/J.FUTURE.2009.05.015
Ren, K., Z. Li and C. Wang, 2010. LBDRP: A low-bandwidth data
replication protocol on journal-based application. Proceedings of the
2nd International Conference on Computer Engineering and Technology,
Apr. 16-18, IEEE Xplore Press, Chengdu, pp: 89-92. DOI:
10.1109/ICCET.2010.5485707
Storm, C. and O. Theel, 2009. A general approach to analyzing
quorum-based heterogeneous dynamic data replication schemes. Proceedings
of the 10th International Conference on Distributed Computing and
Networking (ICDCN'09), Springer-Verlag Berlin, Heidelberg, pp:
349-361. DOI: 10.1007/978-3-540-92295-7_42
Ainul Azila Che Fauzi, A. Noraziah, Noriyani Mohd Zain, A.H. Beg,
Nawsher Khan and Elrasheed Ismail Sultan Faculty of Computer Systems and
Software Engineering, University Malaysia Pahang, 26300, Kuantan,
Pahang, Malaysia
Corresponding Author: Ainul Azila Che Fauzi, Faculty of Computer
Systems and Software Engineering, University Malaysia Pahang, 26300,
Kuantan, Pahang, Malaysia
Table 1: Experiment result
Time Replica 1 Replica 2 Replica 3
0.1 unlock (a) unlock (a) unlock (a)
0.2 Begin transaction Begin Begin
Transaction transaction
0.3 [V.sub.[eta]a,q1] get lock
0.4 [V.sub.[eta]a,qr +1... ],
[V.sub.[eta]a,qk] aborted
0.5 [V.sub.[eta]a,q1] fragmented
into [S.sub.2]
0.6 [S.sub.2] is fragmented into
[S.sub.PKxx].sup.2]
0.7 [S.sub.PKxx].sup.2] divided
into [T.sub.1] and
[T.sub.2]
Where. [T.sub.1] = [P.sub.1]
= Primary key
[T.sub.2] = [P.sub.6] =
instant a
0.8 update a
0.9 Commit Commit Commit
[[lambda]a,[q.sub.1]] [[lambda]a,[q.sub.1]]
[[lambda]a,[q.sub.1]]
0.10 unlock (a) unlock (a) unlock (a)
Table2: Experiment result
Time Replica 1 Replica Replica
2 3
0.1 unlock (a) unlock unlock
(a) (a)
0.2 Begin transaction
0.3 write lock (a) counter (a)=1
0.4 wait
0.5 [V.sub.[eat]a,q1] propagate lock: 2
[V.sub.[psi]a,q1] aborted
0.6 [V.sub.[eat]a,qr+1]...., [V.sub.[eat]a,qk]
aborted
0.7 [V.sub.[eat]a,q1] fragmented into [S.sub.2]
0.8 [S.sub.2] is fragmented into
[S.sub.PKxx.sup.2]
0.9 [S.sub.PKxx.sup.2] divided into T1 andT2
Where, T1 = P1 = Primary key T2 = P6 = instant
a
0.10 update a
0.11 Commit [[lambda].sub.a],[q.sub.1]
0.12 unlock (a) unlock unlock
(a) (a)