An Active/Passive cluster consists of two nodes: an active node and a passive node. It ensures continuous service of an application by keeping redundant nodes available to back up the system in the event of a failure. When the active node goes down, one of the backup nodes takes over and runs the services. During failover, the services are restarted on another node immediately, without administrator intervention.
Implementation
This section explains how to set up an Active/Passive cluster using Pacemaker and Corosync, and how to build a highly available Oracle cluster with DRBD and XFS (a high-performance file system) to store the data. We used the following IP addresses.
Node 1:
Hostname: CMBTRNPROJ4
IP: 10.17.73.239
Node 2:
Hostname: CMBTRNPROJ5
IP: 10.17.73.241
Cluster IP: 10.17.73.202
We used this virtual IP to reach the currently active node. It must be an unused IP address in the network.
Basic configuration
Add the following host entries to /etc/hosts on both nodes:
127.0.0.1 localhost
10.17.73.239 CMBTRNPROJ4 node1
10.17.73.241 CMBTRNPROJ5 node2
The figure shows the basic architecture of an Active/Passive cluster.
Architecture of Active passive cluster
Basic installation process
1) Install the Pacemaker and Corosync packages on the Debian machines.
2) Set up authentication between the nodes. Create an authkey for communication between the two nodes. Run the key-generation command only on the first node, then copy the key from node 1 to the second node (see the command sketch after this list).
3) This places the files in the /etc/corosync directory with the right permissions. Now change the /etc/default/corosync file on both nodes so that Corosync starts at boot.
4) Change only the interface section of the /etc/corosync/corosync.conf file. The other sections are fine to start with.
interface {
        # The following values need to be set based on your environment
        ringnumber: 0
        bindnetaddr: 10.17.73.0
        mcastaddr: 226.0.0.1
        mcastport: 5405
}
These values should be adjusted to match your environment. The bindnetaddr is set to the local subnet: if the first node is 10.17.73.239 and the second one is 10.17.73.241, the bindnetaddr is 10.17.73.0.
5) Start Corosync on both nodes.
/etc/init.d/corosync start
6) Now check the cluster status with
crm_mon or crm_mon --one-shot -V
7) Take a look at the log file: less /var/log/corosync/corosync.log
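As a concrete reference, the steps above roughly correspond to the following commands. This is a minimal sketch assuming Debian package names, the default /etc/corosync paths, and root access; adapt it to your distribution.
# On both nodes: install the packages (assumed Debian package names)
apt-get install pacemaker corosync
# On node 1 only: generate the cluster authentication key
corosync-keygen
# Copy the key to node 2, preserving its permissions
scp -p /etc/corosync/authkey root@node2:/etc/corosync/authkey
# On both nodes: let Corosync start at boot (START=yes in /etc/default/corosync)
sed -i 's/^START=no/START=yes/' /etc/default/corosync
# On both nodes: start Corosync and check cluster membership
/etc/init.d/corosync start
crm_mon --one-shot -V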
Configuring cluster resources
This shows the basic steps for adding a cluster resource called ClusterIP.
1) Configure an IP resource.
2) Create a copy of the current configuration to edit, and commit it after editing.
3) Go to configuration mode to see the current configuration.
4) Add the failover IP to the configuration.
5) Finally, commit the changes to the cluster and quit (see the combined command sketch after this list).
6) Configure resource stickiness (weight points) to prevent cluster resources from flapping between the nodes. When a node goes down and later comes back, the resource that moved to the other server is kept there.
7) Run the command crm_mon and check cluster status
============
Stack: openais
Current DC: node1 - partition with quorum
Version: 2.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1 node2 ]
ClusterIP   (ocf::heartbeat:IPaddr):   Started node1
8) Adding another node
Add the IP address of the new node to /etc/hosts on every node, including the new node itself, as mentioned earlier. Install Pacemaker/Corosync and copy the authkey and corosync.conf files to add the node to the cluster.
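Steps 2-6 above can be carried out in a single crm shell session. The sketch below is one possible way to do it, assuming the crm shell shipped with Pacemaker, a shadow CIB named active_cfg, and the cluster IP from the Basic configuration section; the stickiness value of 100 is an assumed example.
crm
cib new active_cfg                               # work on a copy of the live configuration
configure show                                   # inspect the current configuration
configure primitive ClusterIP ocf:heartbeat:IPaddr params ip=10.17.73.202 op monitor interval=10s
configure rsc_defaults resource-stickiness=100   # weight points: keep resources where they run
cib commit active_cfg                            # commit the changes to the cluster
quit
crm_mon --one-shot                               # verify that ClusterIP is started on a node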
Creating active passive cluster
An Active/Passive cluster ensures that no important service goes offline due to the failure of one node; the resources simply migrate to one of the backup nodes when failover happens. The configuration described above is sufficient when dealing with static data, but when the requirement involves dynamic data, the data must be synchronized between all cluster nodes. We used DRBD to achieve that.
DRBD
DRBD is an acronym for Distributed Replicated Block Device. It replicates data between cluster nodes and keeps the data synchronized. DRBD can be configured in single-primary mode or in dual-primary mode. In single-primary mode a resource is, at any given time, available on only one cluster node, so only one node manipulates the data at a time. In dual-primary mode a resource is, at any given time, in the primary role on all cluster nodes, so concurrent access to the same data is possible. DRBD requires a separate partition or logical volume for its configuration; we had a separate partition for that purpose. The figure shows the DRBD process.
Process of DRBD (disk /dev/sda3 replicated as device /dev/drbd0)
Installation process
1) All features of DRBD are controlled and managed by its configuration file, /etc/drbd.conf. Normally it contains include statements for the global_common.conf file and the resource (.res) files. The configuration files must be identical on both nodes.
drbd.d/global_common.conf
drbd.d/*.res file
/etc/drbd.d/OraData.res
resource OraData {
        meta-disk internal;
        device /dev/drbd0;
        startup {
                become-primary-on both;
        }
        syncer {
                verify-alg sha1;
        }
        net {
                allow-two-primaries;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
        }
        on CMBTRNPROJ4 {
                disk /dev/sda3;
                address 10.17.73.239:7789;
        }
        on CMBTRNPROJ5 {
                disk /dev/sda3;
                address 10.17.73.241:7789;
        }
}
/etc/drbd.d/global_common.conf
global {
        usage-count yes;
}
common {
        protocol C;
        syncer {
                rate 30M;
        }
}
2) Now initialize and load DRBD. First do this on the master node (see the command sketch after this list).
3) Repeat on the second node
4) Now start initial full synchronization. This step must be performed on only one node.
5) Populate DRBD with data. Here we used the XFS file system instead of ext3 or ext4; XFS is known as a high-performance Linux file system and suits the purpose.
6) Make the directory /u01 and mount the newly created file system.
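A possible command sequence for steps 2-6, assuming DRBD 8.3-style tools, the OraData resource defined above, and /dev/drbd0 as the replicated device:
# On both nodes: create the metadata, load the module and bring the resource up
drbdadm create-md OraData
modprobe drbd
drbdadm up OraData
# On ONE node only: start the initial full synchronization
drbdadm -- --overwrite-data-of-peer primary OraData
# On the primary node: create the XFS file system and mount it
mkfs.xfs /dev/drbd0
mkdir -p /u01
mount /dev/drbd0 /u01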
Final configuration of the Oracle Active-Passive cluster
The following table lists the resources and the scripts used for the final configuration of the Active-Passive cluster.
Cluster IP:
configure primitive ClusterIP ocf:heartbeat:IPaddr params ip=10.17.73.194 op monitor interval=10s

Data replication (DRBD):
configure primitive DrbdData ocf:linbit:drbd params drbd_resource=OraData op monitor interval=60s
configure ms DrbdDataClone DrbdData meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true

Oracle file system (OracleFS):
configure primitive OracleFS ocf:heartbeat:Filesystem params device="/dev/drbd/by-res/OraData" directory="/u01/" fstype="xfs"
configure colocation fs_on_drbd inf: OracleFS DrbdDataClone:Master
configure order OracleFS-after-DrbdData inf: DrbdDataClone:promote OracleFS:start
cib commit fs

Oracle listener (OraLSN):
crm configure primitive OraLSN ocf:heartbeat:oralsnr params sid=OraCL home=/u01/app/oracle/product/11.2.0/dbhome_1 op monitor depth="0" timeout="60" interval="90"

Oracle database server (OraServer):
crm configure primitive OraServer ocf:heartbeat:oracle params sid=CMBDB home=/u01/app/oracle/product/11.2.0/dbhome_1 user=oracle op monitor interval=90s

Cluster monitor (Cluster_mon):
crm configure primitive Cluster_mon ocf:pacemaker:ClusterMon params pidfile="/var/run/crm_mon.pid" htmlfile="/var/www/index.html" op monitor interval="10s" timeout="20s"

Monitoring website (WebSite):
crm configure primitive WebSite ocf:heartbeat:apache params configfile="/etc/apache2/apache2.conf" op monitor interval="60s"

Colocation and resource order:
crm configure ms DrbdDataClone DrbdData meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
crm configure colocation Cluster_mon-with-ClusterIP inf: ClusterIP Cluster_mon
crm configure colocation OracleFS-with-ClusterIP inf: ClusterIP OracleFS
crm configure colocation OracleLSN-with-FS inf: OracleFS OraLSN
crm configure colocation OracleLSN-with-OracleServer inf: OraLSN OraServer
crm configure colocation WebSite-with-ClusterIP inf: WebSite ClusterIP
crm configure colocation fs_on_drbd inf: OracleFS DrbdDataClone:Master
crm configure order Cluster_mon-after-ClusterIP inf: ClusterIP Cluster_mon
crm configure order OraServer-after-OracleLSN inf: OraLSN OraServer
crm configure order OracleFS-after-ClusterIP inf: ClusterIP OracleFS
crm configure order OracleFS-after-DrbdData inf: DrbdDataClone:promote OracleFS:start
crm configure order OracleLSN-after-FS inf: OracleFS OraLSN
crm configure order WebSite-after-ClusterIP inf: ClusterIP WebSite
Cluster monitoring
Monitoring the cluster is the final and important step in the process. Generally the cluster status can be monitored from any node with crm_mon, but it is not efficient to have to log in to a cluster node just to get the cluster status. Pacemaker provides a cluster monitor resource that does the job easily. The data and information coming from the monitor resource are displayed on a website, which should also be added as a separate resource. Email and mobile notification resources are also available for monitoring. There are open-source products that monitor clusters, and some of them support Pacemaker too. The monitoring website allows the cluster to be monitored from any computer, even one that is not a cluster node, simply by using the cluster IP as the URL. Now we can monitor the cluster status by browsing http://10.17.73.194 (10.17.73.194 is our cluster IP address). The figure shows cluster monitoring from a cluster node and through the website by URL.
Cluster monitoring from cluster node and through website by URL
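For example, the status page written to /var/www/index.html by the Cluster_mon resource and served by the WebSite resource can be fetched from any machine on the network; the command below assumes the cluster IP used in the final configuration table.
# Fetch the ClusterMon status page served by Apache on the active node
curl http://10.17.73.194/
# or simply open http://10.17.73.194/ in a browser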
Conclusion
This ensures that no service goes offline due to failover, and that services remain constantly available without the user noticing the failover. Only one node is active at a given time while the others remain idle and do no useful work, which can be seen as a waste of processing power and computer resources. Therefore, highly capable nodes are assigned as masters while less capable machines act as slave nodes.