Veritas Cluster

Cluster Information

Veritas Cluster Server (VCS) 4.0 supports up to 32 nodes in a cluster.

LLT (Low-Latency Transport)

VCS uses LLT, a high-performance, low-latency protocol, for cluster communications. LLT runs directly on top of the Data Link Provider Interface (DLPI) layer over Ethernet and has several major functions:

  • sending and receiving heartbeats
  • monitoring and transporting network traffic over multiple network links to every active system within the cluster
  • load-balancing traffic over multiple links
  • maintaining the state of communication
  • providing a nonroutable transport mechanism for cluster communications.
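The load-balancing and link-monitoring roles listed above can be pictured with a small model. This is only a sketch: the LinkSet class is hypothetical, and round-robin selection is an assumption made for illustration, not a description of how LLT actually schedules traffic.

```python
class LinkSet:
    """Toy model of a node's private links (hypothetical, not LLT code)."""

    def __init__(self, links):
        # True means the link is currently considered up.
        self.state = {name: True for name in links}
        self._next = 0

    def mark_down(self, name):
        """Record that heartbeats are no longer seen on this link."""
        self.state[name] = False

    def pick_link(self):
        """Return the next usable link, rotating over all links in turn."""
        names = list(self.state)
        for _ in range(len(names)):
            name = names[self._next % len(names)]
            self._next += 1
            if self.state[name]:
                return name
        raise RuntimeError("no private links available")


links = LinkSet(["hme0", "qfe0"])
first, second = links.pick_link(), links.pick_link()  # traffic alternates
links.mark_down("hme0")
after_failure = links.pick_link()  # only qfe0 remains usable
```

With both links up, traffic alternates between them; once one link is marked down, all traffic moves to the surviving link.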

Group Membership Services/Atomic Broadcast (GAB)

GAB provides the following:

  • Group Membership Services – GAB maintains the overall cluster membership through its Group Membership Services function. Heartbeats are used to determine whether a system is an active member of the cluster, joining it, or leaving it. GAB also determines the position of each system within the cluster.
  • Atomic Broadcast – Cluster configuration and status information is distributed dynamically to all systems within the cluster using GAB’s Atomic Broadcast feature. Atomic Broadcast ensures that all active systems receive all messages for every resource and service group in the cluster. Atomic means that either all systems receive the update or, if one fails, the change is rolled back on all systems.
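The all-or-nothing behaviour described for Atomic Broadcast can be sketched as follows. This is a simplified model, not GAB's protocol: the atomic_broadcast function, the dictionary-based node configurations, and the failing parameter (used to simulate a node that does not accept the update) are all hypothetical.

```python
def atomic_broadcast(nodes, key, value, failing=()):
    """Apply key=value on every node, or leave every node unchanged.

    nodes   -- dict mapping node name to that node's config dict
    failing -- node names that simulate a failed delivery (assumption)
    Returns True if the update was applied everywhere, else False.
    """
    applied = []  # (node name, whether key existed, previous value)
    for name, config in nodes.items():
        if name in failing:
            # One node failed: undo the change on nodes already updated.
            for done, had_key, old in reversed(applied):
                if had_key:
                    nodes[done][key] = old
                else:
                    del nodes[done][key]
            return False
        applied.append((name, key in config, config.get(key)))
        config[key] = value
    return True


cluster = {"sun1": {}, "sun2": {}}
atomic_broadcast(cluster, "groupA", "ONLINE")                     # applied on both
atomic_broadcast(cluster, "groupA", "OFFLINE", failing={"sun2"})  # rolled back
```

After the second call, both nodes still hold the value from the first call, because the failed delivery to sun2 caused the change on sun1 to be undone.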

High Availability Daemon (HAD)

HAD tracks all changes to the cluster configuration and resource status by communicating with GAB. Think of HAD as the manager of the resource agents. A companion daemon called hashadow monitors HAD and attempts to restart it if it fails. Likewise, if the hashadow daemon dies, HAD restarts it. HAD maintains the cluster state information: it uses the main.cf file to build the cluster configuration in memory and is also responsible for updating that in-memory configuration.
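The mutual monitoring between HAD and hashadow can be illustrated with a tiny model. This is a sketch only: watchdog_pass and the dictionary of running flags are hypothetical, and the real daemons are of course separate processes rather than entries in a dictionary.

```python
def watchdog_pass(running):
    """One monitoring pass over the two daemons.

    running -- dict with boolean entries "had" and "hashadow"
    Each daemon restarts the other if it is alive and its peer is not.
    Returns the list of daemons restarted in this pass.
    """
    restarted = []
    if running["hashadow"] and not running["had"]:
        running["had"] = True          # hashadow restarts HAD
        restarted.append("had")
    elif running["had"] and not running["hashadow"]:
        running["hashadow"] = True     # HAD restarts hashadow
        restarted.append("hashadow")
    # If both are down, neither is left to restart the other.
    return restarted
```

The model makes one limitation visible: the pairing protects against the loss of either daemon, but not against both being down at the same time.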

VCS architecture

Putting the above together, we get:

  • Agents monitor resources on each system and provide status to HAD on the local system
  • HAD on each system sends status information to GAB
  • GAB broadcasts configuration information to all cluster members
  • LLT transports all cluster communications to all cluster nodes
  • HAD on each node takes corrective action, such as failover, when necessary

Service Groups

There are three types of service groups:

  • Failover – The service group runs on one system at any one time.
  • Parallel – The service group can run simultaneously on more than one system at any time.
  • Hybrid – A hybrid service group combines the behaviour of a failover service group and a parallel service group. It is used in VCS 4.0 replicated data clusters, which are based on Veritas Volume Replicator.
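The difference between the failover and parallel types comes down to where a group may be brought online. A minimal sketch, assuming a hypothetical can_bring_online helper; hybrid groups are deliberately left out because their behaviour depends on details not covered here.

```python
def can_bring_online(group_type, online_systems, target):
    """Decide whether a group may be brought online on `target`.

    group_type     -- "failover" or "parallel"
    online_systems -- set of systems where the group is already online
    """
    if target in online_systems:
        return False  # already online on that system
    if group_type == "failover":
        # A failover group runs on one system at a time.
        return len(online_systems) == 0
    if group_type == "parallel":
        # A parallel group may run on several systems at once.
        return True
    raise ValueError(f"unsupported group type: {group_type}")
```

For example, a failover group already online on sun1 cannot also be brought online on sun2, whereas a parallel group can.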

When a service group appears to be stuck while being brought online, you can flush the service group to enable corrective action. Flushing a service group stops VCS from attempting to bring its resources online or take them offline and clears any internal wait states.

Resources

Resources are objects that relate to hardware and software. VCS controls these resources through the following actions:

  • Bringing a resource online (starting)
  • Taking a resource offline (stopping)
  • Monitoring a resource (probing)

When you link a parent resource to a child resource, the dependency becomes a component of the service group configuration. You can view the dependencies at the bottom of the main.cf file.
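Dependencies determine the order in which resources are started: a child must be online before the parent that requires it. The sketch below assumes dependency lines of the form "parent requires child" and uses hypothetical resource names (app, ip, nic, mount); it relies on graphlib from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def online_order(dependency_lines):
    """Return resources in an order where every child precedes its parent.

    dependency_lines -- strings such as "app requires ip"
    Lines that do not match that three-word form are ignored.
    """
    graph = {}
    for line in dependency_lines:
        parts = line.split()
        if len(parts) == 3 and parts[1] == "requires":
            parent, child = parts[0], parts[2]
            graph.setdefault(parent, set()).add(child)  # child comes first
            graph.setdefault(child, set())
    return list(TopologicalSorter(graph).static_order())


order = online_order([
    "app requires ip",
    "ip requires nic",
    "app requires mount",
])
```

In the resulting order, nic precedes ip, and both ip and mount precede app. Taking resources offline would walk the same list in reverse.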

Proxy Resource

A proxy resource allows multiple service groups to monitor the same network interface. This reduces the network traffic that would result from having multiple NIC resources in different service groups monitoring the same interface.

Phantom Resource

The phantom resource is used to report the actual status of a service group that consists of only persistent resources. A service group shows an online status only when all of its nonpersistent resources are online. Therefore, if a service group has only persistent resources (such as a network interface), VCS considers the group offline, even if the persistent resources are running properly. By adding a phantom resource, the status of the service group is shown as online.
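The status rule described above can be written down directly. This is a simplified sketch: group_state is a hypothetical helper, resources are modelled as (name, persistent, online) tuples, and intermediate states such as partially online groups are ignored.

```python
def group_state(resources):
    """Report a group's state from its nonpersistent resources only.

    resources -- list of (name, persistent, online) tuples
    """
    nonpersistent = [r for r in resources if not r[1]]
    if nonpersistent and all(online for _, _, online in nonpersistent):
        return "ONLINE"
    return "OFFLINE"


nic_only = [("nic_hme0", True, True)]                   # persistent only
with_phantom = nic_only + [("phantom", False, True)]    # add a phantom

before = group_state(nic_only)       # group is reported offline
after = group_state(with_phantom)    # group is reported online
```

The NIC is healthy in both cases; only the presence of a nonpersistent resource changes what the group reports.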

scsi-initiator-id

Every node within the cluster must have a unique scsi-initiator-id. To set the scsi-initiator-id, follow the steps below:

  1. At the OBP set the scsi-initiator-id to 6

OK> setenv scsi-initiator-id 6
OK> printenv scsi-initiator-id

  2. When the server has booted, create /kernel/drv/glm.conf and enter the following in it:

name="glm" parent="/pci@1f,4000"
unit-address="5"
scsi-initiator-id=6;

  3. To check that the scsi-initiator-id has been set, use the following command:
# prtconf -v          # search the output for scsi-initiator-id
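Once the value has been read from each node, the uniqueness requirement is easy to check. The helper below is a hypothetical sketch: it assumes the IDs have already been collected by hand into a dictionary, and it does not query any node itself.

```python
def duplicate_initiator_ids(ids_by_node):
    """Return {initiator id: [nodes]} for every ID used by more than one node.

    ids_by_node -- dict mapping node name to its scsi-initiator-id
    """
    nodes_by_id = {}
    for node, initiator_id in ids_by_node.items():
        nodes_by_id.setdefault(initiator_id, []).append(node)
    return {i: n for i, n in nodes_by_id.items() if len(n) > 1}


clash = duplicate_initiator_ids({"sun1": 7, "sun2": 7})  # both on the same ID
clean = duplicate_initiator_ids({"sun1": 6, "sun2": 7})  # one node moved to 6
```

An empty result means every node uses a different ID.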

Installation

Before you install VCS, make sure you have the following prepared:

  • Cluster Name
  • Unique ID Number
  • Hostnames of the servers
  • Device names of the network interfaces for the private networks
  • Root access
  • Remote shell (rsh) access between all systems (the .rhosts file requires updating)
  • VCS software

To install VCS, follow the steps below. Remember that both hosts must be able to rsh into each other as root without being prompted for a password:

  1. Start the VCS installation by entering

    # ./installVCS

  2. Enter the cluster name and the unique ID number

    Cluster name:             cluster1
    Unique ID:                1

  3. Enter the names of the systems to be clustered

    System names: sun1 sun2

  4. The software will now check each server's remote access and then install the software on each server.
  5. A list will appear detailing all the NICs available. Select the first and then the second private network link

    First Link:           hme0
    Second Link:          qfe0

  6. Answer Yes to the next questions (the servers are identical)
  7. The LLT and GAB files will be copied and a success message will appear