Why CTDB?

  • Traditionally, clustering involves a SAN connected to n nodes. The storage can be accessed only by the nodes participating in the cluster, and as the need for more storage and more users grows, both the available space and the cluster itself quickly become too small.
  • So we need a file system that can be accessed by an arbitrary number of clients, not only by the systems participating in the cluster. One answer to this problem is a distributed file system.
  • We need to distribute the existing shared storage using network protocols like NFS and CIFS. With Samba and CTDB we can achieve this goal of distributing the shared file system using the CIFS protocol.
  • CTDB was originally developed specifically as cluster enhancement software; it provides the high-availability and load-balancing features that make file services like Samba, NFS and FTP clusterable.

Basic Infrastructure of CTDB

  • Storage is attached to the nodes participating in the cluster through FC or iSCSI
  • A shared file system which supports POSIX fcntl locks, for example:
      • IBM General Parallel File System (GPFS)
      • Global File System (GFS)
      • GNU Cluster File System (Gluster)
      • Sun’s Lustre
      • OCFS2

Basics of the CIFS File System

  • CIFS (Common Internet File System) is a standard remote file system access protocol for use over a network, enabling groups of users to connect and share documents
  • CIFS is an open, cross-platform protocol based on SMB (Server Message Block), the native file-sharing protocol of the Windows operating system. On RHEL it is implemented using Samba
  • CIFS runs over TCP/IP

Basics of Samba

  • Samba provides file and print services for all clients using the SMB/CIFS protocols
  • Apart from file and print services, it also handles authentication and authorization, name resolution and service announcement
  • File and print services are provided by the smbd daemon
  • Name resolution and browsing are provided by the nmbd daemon
  • The configuration file is /etc/samba/smb.conf

TDB (Trivial Database)

  • Samba keeps track of all the information needed to serve clients in a series of *.tdb files located in /var/lib/samba or /var/cache/samba
  • Some of the TDB files are persistent
  • TDB files are very small, simple database files similar to Berkeley DB files
  • Unlike Berkeley DB, TDB allows multiple simultaneous writers

Example TDB Files:

  • account_policy.tdb: NT account policy settings such as password expiration
  • brlock.tdb: byte-range locks
  • connections.tdb: share connections (used to enforce max connections, etc.)
  • messages.tdb: Samba messaging system
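
The tdb-tools package ships small utilities for inspecting these files, for example (a sketch; the exact file paths vary with the distribution and Samba version):

$ tdbdump /var/lib/samba/connections.tdb    # dump all records of a TDB file
$ tdbbackup -v /var/lib/samba/secrets.tdb   # verify (and back up) a TDB file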

What Does CTDB Do?

  • CTDB (Clustered Trivial Database) is a very thin and fast database layer developed for Samba to make Samba clusterable.
  • CTDB makes it possible for Samba to run on several different hosts in the network and serve the same data at the same time.
  • This means Samba becomes a clustered service: all nodes are active, export the Samba shares and serve read-write operations at the same time, making the service highly available.
  • To achieve this, the Samba daemons running on the different nodes need a method of inter-process communication (IPC) and have to share some persistent data (TDB files). Some of the information that must be shared:
      • User information
      • For Samba acting as a member server of a domain, the domain SID
      • The user mapping tables, i.e. the mapping of Unix UIDs and GIDs to Windows users and groups
      • The active SMB sessions and connections
      • Locking information, such as byte-range locks granted exclusively to users accessing a particular file. These are Windows-level locks: when multiple Windows/Samba clients access files, the locks are granted by the smbd daemon, so it makes sense to share them between the smbd daemons on the different nodes

Sample diagram of how CTDB messages are shared between two CTDB cluster nodes:

Below is the list of TDB files that are shared between the nodes of a CTDB cluster:

  • SMB sessions (sessionid.tdb)
  • Share connections (connections.tdb)
  • Share modes (locking.tdb)
  • Byte-range locks (brlock.tdb)
  • User database (passdb.tdb)
  • Domain join information (secrets.tdb)
  • ID mapping tables (winbind_idmap.tdb)
  • Registry (registry.tdb)

Requirements to configure CTDB cluster on RHEL6

    • GFS Packages
    • HA Packages
    • ctdb, samba
    • tdb-tools

Configuring Samba to use CTDB

  • We require two separate networks: one internal (private) network through which the CTDB daemons communicate, and one public network through which the cluster offers services like Samba, NFS, etc.
  • Install samba and CTDB Packages

$ yum install samba ctdb tdb-tools

  • Configure /etc/samba/smb.conf to make Samba cluster-aware by adding the lines below to the [global] section of smb.conf:

clustering = yes
idmap backend = tdb2
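
A minimal [global] section could then look like the sketch below (the netbios name, workgroup and share are placeholders; only the two lines above are strictly required for clustering):

[global]
    clustering = yes
    idmap backend = tdb2
    netbios name = ctdb-cifs    # placeholder; clients resolve this name via round-robin DNS
    workgroup = EXAMPLE         # placeholder workgroup/domain

[share]
    path = /ctdb/cifs/share     # placeholder path on the shared (GFS) file system
    read only = no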

  • CTDB Cluster configuration

/etc/sysconfig/ctdb is the primary configuration file; it contains the startup parameters for ctdb. The important parameters are:

CTDB_NODES

CTDB_NODES=/etc/ctdb/nodes
This parameter specifies a file that needs to be created and should contain the list of
private IP addresses that the CTDB daemons will use in the cluster. These should be on
a private, non-routable subnet which is used only for cluster traffic. This file must be
identical on all nodes of the cluster.
Contents of /etc/ctdb/nodes:
192.168.122.7
192.168.122.8
192.168.122.9
192.168.122.10

CTDB_RECOVERY_LOCK

This parameter specifies the lock file that the CTDB daemons use to arbitrate which
node is acting as recovery master. This file must be held on shared storage so that all
CTDB daemons in the cluster access and lock the same file.

CTDB_RECOVERY_LOCK="/ctdb/cifs/lockfile"

CTDB_PUBLIC_ADDRESSES

This parameter specifies the name of the file which contains the list of public addresses that this particular node can host. While running, the CTDB cluster will assign each public address that exists in the entire cluster to one node, which will then host it. These are the addresses that the smbd daemons and other services bind to and that clients use to connect to the cluster.

Example for a 3-node cluster:
CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses

Content of /etc/ctdb/public_addresses:

10.65.208.142/22 eth0
10.65.208.143/22 eth0
10.65.208.144/22 eth0

Configure one DNS A record (i.e. one name) with multiple IP addresses and let round-robin DNS distribute the clients across the nodes of the cluster.
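
In a BIND zone file this could look like the following sketch (the record name ctdb-cifs is only a placeholder; the addresses are the public addresses configured above):

ctdb-cifs    IN    A    10.65.208.142
ctdb-cifs    IN    A    10.65.208.143
ctdb-cifs    IN    A    10.65.208.144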

The CTDB cluster utilizes IP takeover techniques to ensure that as long as at least one node in the cluster is available, all the public IP addresses will always be available to clients.

/etc/ctdb/events.d is a collection of scripts that are called by CTDB when certain events occur, allowing site-specific tasks to be performed.

  • Start the CTDB daemon and let ctdb start the smbd daemon; the Samba daemons should not be started by the init process (see the sysconfig note after the chkconfig commands below)

# chkconfig ctdb on
# chkconfig smb off
# chkconfig nmb off
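
For ctdb to actually start and monitor smbd/nmbd itself, the Samba event script is normally enabled in /etc/sysconfig/ctdb (a minimal sketch; the exact variable names can vary between ctdb versions):

CTDB_MANAGES_SAMBA=yes
CTDB_MANAGES_WINBIND=yes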

  • Start the ctdb daemon

# service ctdb start

Example Diagram of a 3 Node CTDB Cluster:

How Does CTDB Work?

  • On each node the CTDB daemon “ctdbd” is running; instead of writing directly to the TDB databases, Samba talks to its local “ctdbd”
  • “ctdbd” negotiates the metadata for the TDBs over the network
  • For the actual read and write operations, local copies are maintained on fast local storage
  • There are 2 kinds of TDB files: persistent and normal (the databases attached to a node can be listed as shown after this list)
  • Persistent TDB files must always be up to date, and each node always has an up-to-date copy. These TDB files are kept locally (LTDB) on the local storage, not on the shared storage, so read and write operations are fast
  • When a node wants to write to a persistent TDB, it locks the whole database, performs its read and write operations, and the transaction commit is finally distributed to all nodes and also written locally
  • Normal TDB files are maintained only temporarily. The idea is that each node does not have to know all the records of a database; it is sufficient to know the records which affect its own client connections, so when a node goes down it is acceptable to lose those records
  • Each node carries certain roles:
    • DMASTER (data master)
      • Holds the current, authoritative copy of a record
      • Moves around as nodes write to a record
    • LMASTER (location master)
      • Knows the location of the DMASTER
      • Knows where the record is stored
  • Only one node has the current authoritative copy of a record, i.e. the data master. Access to a record works roughly as follows:
      • Step 1: Get a lock on the record in the TDB
      • Step 2: Check whether we are the data master
        • If we are the DMASTER for this record, operate on the record and unlock it when finished
      • Step 3: If we are not the DMASTER for this record, unlock the record
      • Step 4: Send a request to the local CTDB daemon asking for the record to be migrated to this node
      • Step 5: Once the local “ctdbd” replies that the record is now locally available, go back to step 1
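
To see which TDB databases a node has attached, and which of them are persistent, the ctdb tool can list them (the exact output format varies between ctdb versions):

$ ctdb getdbmap    # lists the attached databases; persistent ones are flagged as PERSISTENT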

Failover

  • CTDB assigns the IP addresses from the pool (CTDB_PUBLIC_ADDRESSES) to the healthy nodes
  • When a node goes down, its IP addresses are moved to another node
  • Clients reconnect to the new node with the help of tickle ACKs; the sequence is:
    • The node goes down
    • The client does not yet know that the IP has moved
    • The new node sends a TCP ACK with sequence number 0 to the client
    • The client replies with an ACK carrying the correct sequence number
    • The new node resets the connection using RST
    • The client re-establishes the connection to the new node
  • Recovery master: performs recovery; it collects the most recent copy of every record from all nodes and then becomes the data master
  • The recovery master is determined by an election process; the RECOVERY_LOCK file acts as arbitrator, and the nodes compete to get a lock (POSIX fcntl byte-range lock) on that file
  • If the recovery master node goes away, the role has to be assigned to a new node by a new election (the current holder can be queried as shown below)
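
To check which node currently holds the recovery master role, the ctdb tool can be queried (a sketch; on the classic ctdb versions shipped with RHEL 6 this prints the node number of the recovery master):

$ ctdb recmaster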

Commands to manage CTDB

$ ctdb status: provides basic information about the cluster and the status of its nodes.

$ ctdb ping: tries to ping each of the CTDB daemons in the cluster.

$ ctdb ip: prints the current status of the public IP addresses and which physical node is currently serving each IP.

$ onnode: used to run commands on ctdb nodes.

Examples:
$ onnode all pidof ctdbd
$ onnode all netstat -tn | grep 4379

CTDB Status Messages:

“ctdb status” shows the node status. There are 5 possible states:

  • OK: this node is fully functional
  • DISCONNECTED: this node could not be reached over the network and is currently not participating in the cluster
  • UNHEALTHY: the ctdbd daemon is running, but one of the services managed by ctdb has failed
  • BANNED: this node failed too many recovery attempts and is banned from participating in the cluster for "RecoveryBanPeriod" seconds
  • STOPPED: a stopped node does not host any public IP addresses and is not part of the cluster

Troubleshooting:

  • The ctdb log file is /var/log/log.ctdb
  • The output of the “ctdb status” and “onnode” commands is helpful
  • /var/log/samba contains the logs of the smbd daemon
  • If needed, a tcpdump on port 4379 can be taken (see the example below); wireshark is capable of decoding the CTDB protocol and can display various CTDB state information
  • Check the testparm output to verify that clustering is enabled
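
For example, CTDB traffic on the private interconnect could be captured like this (the interface name eth1 is an assumption):

# tcpdump -i eth1 -s 0 -w /tmp/ctdb.pcap port 4379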

Documentation

Man Pages:

man ctdb
man ctdbd
man onnode

FAQ

Q) When CTDB is itself a cluster, why do we require HA packages like cman to be installed?

CTDB will not work here without the Red Hat Cluster Suite: CTDB requires GFS, and GFS in turn
requires cman to start dlm_controld and gfs_controld. So cman is a prerequisite for CTDB.

Q) How does CTDB solve the split-brain problem?

This problem does not arise in the first place, because CTDB is an all-active setup: there are no passive nodes that suddenly become active, as in an active/passive setup.

Q) How to identify which node is actually serving a client, i.e. which node is the data master?

The IP address to which the client is connected determines it: from the pool of public addresses, whichever IP the client connects to, the node to which that IP is currently assigned becomes the data master (DMASTER).

Q) How to identify which node is the recovery master (RMASTER)?

The node which holds the lock on the lock file, i.e. the file saved on the shared file system (CTDB_RECOVERY_LOCK).

