1 VERITAS CLUSTER SOLARIS
1.1 Overview
* Conf files:
LLT conf: /etc/llttab [should NOT need to access this]
Network conf: /etc/gabtab
If it contains /sbin/gabconfig -c -n2, you will need to run
/sbin/gabconfig -c -x by hand when both systems were down
and only one system comes back up.
Cluster conf: /etc/VRTSvcs/conf/config/main.cf
Has exact details on what the cluster contains.
* Most executables are in: /opt/VRTSvcs/bin or /sbin
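The gabtab note above can be checked with a small helper. This is only a sketch: the function names gab_seed_count and gab_seed_advice are made up for illustration, and the file is passed as an argument so it can be tried on a copy instead of the live /etc/gabtab.

```shell
# Print the number given to "gabconfig -c -nN" in a gabtab-style file.
# Prints nothing if no -n option is found.
gab_seed_count() {
    sed -n 's/.*gabconfig.* -n *\([0-9][0-9]*\).*/\1/p' "$1" | head -1
}

# Say whether a node coming up alone will need "gabconfig -c -x".
gab_seed_advice() {
    seed=$(gab_seed_count "$1")
    if [ -n "$seed" ] && [ "$seed" -gt 1 ]; then
        echo "seed=$seed: a lone node will wait; run /sbin/gabconfig -c -x by hand"
    else
        echo "seed=${seed:-unset}: no manual seeding expected"
    fi
}
```

Example: gab_seed_advice /etc/gabtab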
1.2 License status:
* /opt/VRTS/bin/vxlicrep
* To add licenses:
cd /opt/VRTSvcs/install
./licensevcs
1.3 Web administration (port 8181)
* http://10.74.24.122:8181/vcs/index
1.4 Administration with the graphical interface:
* hagui
1.5 Useful cluster information:
* hastatus -summary
* haclus -display
* hares -list
* hasys -display
* hatype -list
* hagrp -list
* hagrp -display
* For the hardware configuration it can also be useful to
run prtdiag. On Fujitsu systems there are:
* /opt/FJSVmadm/sbin/hrdconf -l
* /usr/platform/FJSV,GPUSK/sbin/prtdiag
1.6 Checking LLT - Low Latency Transport
* /etc/llthosts
* /etc/llttab
* To check the active LLT links:
lltstat -n (run it on every system)
lltstat -nvv can also be used. It shows the systems in the
cluster and the network heartbeats (the links; a well
configured system usually has 2)
* For port status: lltstat -p
* Check that the module is loaded in the kernel:
modinfo | grep llt
* To unload the module from the kernel:
modunload -i llt_id
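The llt_id passed to modunload is the Id column of modinfo. A minimal sketch for extracting it, assuming the usual Solaris modinfo layout (Id in column 1, module name in column 6); module_id is a hypothetical helper name, and on older Solaris releases nawk may be needed instead of awk for the -v option.

```shell
# Read "modinfo" output on stdin and print the Id of the named module.
# Assumption: Id is field 1 and the module name is field 6.
module_id() {
    awk -v name="$1" '$6 == name { print $1; exit }'
}
```

Example: id=$(modinfo | module_id llt) && modunload -i "$id"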
1.7 Checking GAB - Group Membership and Atomic Broadcast
* /etc/gabtab (also holds the disk heartbeat configuration)
* Example gabtab:
/sbin/gabdiskhb -a /dev/dsk/c2t1d2s3 -s 16 -p a
/sbin/gabdiskhb -a /dev/dsk/c2t1d2s3 -s 144 -p h
/sbin/gabconfig -c -n2 (the number after -n is the
quorum, i.e. how many systems must be up before the VCS
cluster is formed and starts)
* Run the command /sbin/gabconfig -a
Special situations:
* Empty output: GAB is not running
* If "jeopardy" (i.e. "danger") appears instead of only
"membership", a link is broken
* Checking GAB on the disks: shows the disk heartbeats.
gabdiskhb -l
* Check that the module is loaded in the kernel:
modinfo | grep gab
* To unload the module from the kernel:
modunload -i gab_id
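The special situations listed for gabconfig -a can be told apart mechanically. A sketch only: gab_status is a made-up name, and it assumes that each port is reported on a line starting with "Port" and that a degraded port carries the word "jeopardy" on some line.

```shell
# Classify "gabconfig -a" output read from stdin.
# Prints one of: "not running", "jeopardy", "ok".
gab_status() {
    awk '
        /^Port/    { ports++ }
        /jeopardy/ { jeopardy = 1 }
        END {
            if (!ports)        print "not running"
            else if (jeopardy) print "jeopardy"
            else               print "ok"
        }'
}
```

Example: /sbin/gabconfig -a | gab_status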
1.8 Checking main.cf
* To check the syntax of the file
/etc/VRTSvcs/conf/config/main.cf use hacf:
# cd /etc/VRTSvcs/conf/config
# hacf -verify .
1.9 Global configuration of the groups managed by the cluster
* Just look at the file
/etc/VRTSvcs/conf/config/main.cf. Many more details, for
example the default parameters, are shown by the
following command:
* /opt/VRTS/bin/hagrp -display
Use it to check the configuration. The package
dependencies are included. For example:
#Group Attribute System Value
ClusterService Administrators global
ClusterService AutoFailOver global 1
ClusterService AutoRestart global 1
ClusterService AutoStart global 1
ClusterService AutoStartIfPartial global 1
ClusterService AutoStartList global prodsshr1 prodsshr0
ClusterService AutoStartPolicy global Order
ClusterService Evacuate global 1
ClusterService ExtMonApp global
ClusterService ExtMonArgs global
ClusterService FailOverPolicy global Priority
ClusterService FaultPropagation global 1
ClusterService Frozen global 0
ClusterService GroupOwner global
ClusterService IntentOnline global 1
ClusterService Load global 0
ClusterService ManageFaults global ALL
ClusterService ManualOps global 1
ClusterService NumRetries global 0
ClusterService OnlineRetryInterval global 0
ClusterService OnlineRetryLimit global 0
ClusterService Operators global
ClusterService Parallel global 0
ClusterService PreOffline global 0
ClusterService PreOnline global 0
ClusterService PreonlineTimeout global 300
ClusterService Prerequisites global
ClusterService PrintTree global 1
ClusterService Priority global 0
ClusterService Restart global 0
ClusterService SourceFile global ./main.cf
ClusterService SystemList global prodsshr1 1 prodsshr0 2
ClusterService SystemZones global
ClusterService TFrozen global 0
ClusterService Tag global
ClusterService TriggerEvent global 1
ClusterService TriggerResStateChange global 0
ClusterService TypeDependencies global
ClusterService UserIntGlobal global 0
ClusterService UserStrGlobal global
ClusterService AutoDisabled prodsshr0 0
ClusterService AutoDisabled prodsshr1 0
ClusterService Enabled prodsshr0 1
ClusterService Enabled prodsshr1 1
ClusterService PreOfflining prodsshr0 0
ClusterService PreOfflining prodsshr1 0
ClusterService PreOnlining prodsshr0 0
ClusterService PreOnlining prodsshr1 0
ClusterService Probed prodsshr0 1
ClusterService Probed prodsshr1 1
ClusterService ProbesPending prodsshr0 0
ClusterService ProbesPending prodsshr1 0
ClusterService State prodsshr0 |OFFLINE|
ClusterService State prodsshr1 |ONLINE|
ClusterService UserIntLocal prodsshr0 0
ClusterService UserIntLocal prodsshr1 0
ClusterService UserStrLocal prodsshr0
ClusterService UserStrLocal prodsshr1
#
beadm_sg Administrators global
beadm_sg AutoFailOver global 1
beadm_sg AutoRestart global 1
beadm_sg AutoStart global 1
beadm_sg AutoStartIfPartial global 1
beadm_sg AutoStartList global prodsshr1 prodsshr0
beadm_sg AutoStartPolicy global Order
beadm_sg Evacuate global 1
beadm_sg ExtMonApp global
beadm_sg ExtMonArgs global
beadm_sg FailOverPolicy global Priority
beadm_sg FaultPropagation global 1
beadm_sg Frozen global 0
beadm_sg GroupOwner global
beadm_sg IntentOnline global 1
beadm_sg Load global 0
beadm_sg ManageFaults global ALL
beadm_sg ManualOps global 1
beadm_sg NumRetries global 0
beadm_sg OnlineRetryInterval global 0
beadm_sg OnlineRetryLimit global 0
beadm_sg Operators global
beadm_sg Parallel global 0
beadm_sg PreOffline global 0
beadm_sg PreOnline global 0
beadm_sg PreonlineTimeout global 300
beadm_sg Prerequisites global
beadm_sg PrintTree global 1
beadm_sg Priority global 0
beadm_sg Restart global 0
beadm_sg SourceFile global ./main.cf
beadm_sg SystemList global prodsshr1 1 prodsshr0 2
beadm_sg SystemZones global
beadm_sg TFrozen global 0
beadm_sg Tag global
beadm_sg TriggerEvent global 1
beadm_sg TriggerResStateChange global 0
beadm_sg TypeDependencies global
beadm_sg UserIntGlobal global 0
beadm_sg UserStrGlobal global
beadm_sg AutoDisabled prodsshr0 0
beadm_sg AutoDisabled prodsshr1 0
beadm_sg Enabled prodsshr0 1
beadm_sg Enabled prodsshr1 1
beadm_sg PreOfflining prodsshr0 0
beadm_sg PreOfflining prodsshr1 0
beadm_sg PreOnlining prodsshr0 0
beadm_sg PreOnlining prodsshr1 0
beadm_sg Probed prodsshr0 1
beadm_sg Probed prodsshr1 1
beadm_sg ProbesPending prodsshr0 0
beadm_sg ProbesPending prodsshr1 0
beadm_sg State prodsshr0 |ONLINE|
beadm_sg State prodsshr1 |OFFLINE|
beadm_sg UserIntLocal prodsshr0 0
beadm_sg UserIntLocal prodsshr1 0
beadm_sg UserStrLocal prodsshr0
beadm_sg UserStrLocal prodsshr1
#
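The listing above is long. To see only where each group is online, the State lines can be filtered. A minimal sketch, assuming the four-column layout shown above (group, attribute, system, value); online_groups is a hypothetical helper name.

```shell
# Read "hagrp -display" output on stdin and print "group system"
# for every group whose State on that system is exactly |ONLINE|.
online_groups() {
    awk '$2 == "State" && $4 == "|ONLINE|" { print $1, $3 }'
}
```

Example: /opt/VRTS/bin/hagrp -display | online_groups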
1.10 VERITAS Cluster Basic Administrative Operations
1.10.1 Administering Service Groups
* To start a service group and bring its resources
online
# hagrp -online service_group -sys system
* To start a service group on a system (System 1) and
bring online only the resources already online on
another system (System 2)
# hagrp -online service_group -sys system -checkpartial other_system
If the service group does not have resources online
on the other system, the service group is brought
online on the original system and the checkpartial
option is ignored. Note that the checkpartial option
is used by the Preonline trigger during failover.
When a service group configured with PreOnline = 1
fails over to another system (system 2), the only
resources brought online on system 2 are those that
were previously online on system 1 prior to failover.
* To stop a service group and take its resources
offline
# hagrp -offline service_group -sys system
* To stop a service group only if all resources are
probed on the system
# hagrp -offline [-ifprobed] service_group -sys system
* To switch a service group from one system to another
# hagrp -switch service_group -to system
The -switch option is valid for failover groups only.
A service group can be switched only if it is fully
or partially online.
* To freeze a service group (disable onlining,
offlining, and failover)
# hagrp -freeze service_group [-persistent]
The option -persistent enables the freeze to be
remembered when the cluster is rebooted.
* To thaw a service group (reenable onlining,
offlining, and failover)
# hagrp -unfreeze service_group [-persistent]
* To enable a service group
# hagrp -enable service_group [-sys system]
A group can be brought online only if it is enabled.
* To disable a service group
# hagrp -disable service_group [-sys system]
A group cannot be brought online or switched if it is
disabled.
* To enable all resources in a service group
# hagrp -enableresources service_group
* To disable all resources in a service group
# hagrp -disableresources service_group
Agents do not monitor group resources if resources
are disabled.
* To clear faulted, non-persistent resources in a
service group
# hagrp -clear service_group [-sys system]
Clearing a resource automatically initiates the
online process previously blocked while waiting for
the resource to become clear.
- If system is specified, all faulted, non-persistent
resources are cleared from that system only.
- If system is not specified, the service group is
cleared on all systems in the group's SystemList in
which at least one non-persistent resource has faulted.
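The rule that a group can be switched only if it is fully or partially online can be tested before calling hagrp -switch. This is a sketch under assumptions: can_switch is a made-up name, the input is expected in the group/attribute/system/value layout of hagrp -display, and a partially online group is assumed to show |PARTIAL| in its State value.

```shell
# Read State lines on stdin; succeed (exit 0) if the group is
# online or partially online on at least one system.
can_switch() {
    awk '$2 == "State" &&
         (index($4, "|ONLINE|") || index($4, "|PARTIAL|")) { found = 1 }
         END { exit !found }'
}
```

Example: hagrp -display service_group | can_switch && hagrp -switch service_group -to system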
1.10.2 Resources of a service group
* prodsshr0:{root}:/>hagrp -resources beadm_sg
tws_client
beadm_dg
beadm_mip
beadm_mnt
twsdm_mnt
prodsshr_mnic
bea_admin
TWS_vol
beadm_vol
prodsshr0:{root}:/>
1.11 VERITAS: VCS Administration Commands
1.11.1 Cluster Start/Stop:
* stop VCS on all systems:
# hastop -all
* stop VCS on bar_c and move all groups out:
# hastop [ -local ] -sys bar_c -evacuate
* start VCS on local system:
# hastart
1.11.2 Users:
* add gui root user:
# haconf -makerw
# hauser -add root
# haconf -dump -makero
1.11.3 Set/update VCS super user password:
* add root user:
# haconf -makerw
# hauser -add root
password:...
# haconf -dump -makero
* change root password:
# haconf -makerw
# hauser -update root
password:...
# haconf -dump -makero
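Every user change above follows the same pattern: open the configuration with haconf -makerw, run one command, then save and close it with haconf -dump -makero. A sketch of a wrapper for that pattern; with_rw_config and the HACONF variable are names invented here, and HACONF exists only so the wrapper can be tried as a dry run.

```shell
# Command used to open and close the configuration.
# Set HACONF="echo haconf" to print the calls instead of running them.
HACONF=${HACONF:-haconf}

# Run the given command between "haconf -makerw" and
# "haconf -dump -makero"; return the status of that command.
with_rw_config() {
    $HACONF -makerw || return 1
    "$@"
    status=$?
    $HACONF -dump -makero || return 1
    return $status
}
```

Example: with_rw_config hauser -update root
The configuration is closed again even when the wrapped command fails, so it is not left read-write by mistake.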
1.11.4 Group:
* group start, stop:
# hagrp -offline groupx -sys foo_c
# hagrp -online groupx -sys foo_c
* switch a group to other system:
# hagrp -switch groupx -to bar_c
* freeze a group:
# hagrp -freeze groupx
* unfreeze a group:
# hagrp -unfreeze groupx
* enable a group:
# hagrp -enable groupx
* disable a group:
# hagrp -disable groupx
* enable resources of a group:
# hagrp -enableresources groupx
* disable resources of a group:
# hagrp -disableresources groupx
* flush a group:
# hagrp -flush groupx -sys bar_c
1.11.5 Node:
* freeze node:
# hasys -freeze bar_c
* thaw node:
# hasys -unfreeze bar_c
1.11.6 Resources:
* online a resource:
# hares -online IP_192_168_1_54 -sys bar_c
* offline a resource:
# hares -offline IP_192_168_1_54 -sys bar_c
* offline a resource and propagate to children:
# hares -offprop IP_192_168_1_54 -sys bar_c
* probe a resource:
# hares -probe IP_192_168_1_54 -sys bar_c
* clear faulted resource:
# hares -clear IP_192_168_1_54 -sys bar_c
1.11.7 Agents:
* start agent:
# haagent -start IP -sys bar_c
* stop agent:
# haagent -stop IP -sys bar_c
1.11.8 Reboot a node with evacuation of all service groups:
* (groupy is running on bar_c)
* # hastop -sys bar_c -evacuate
* # init 6
* # hagrp -switch groupy -to bar_c
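The last step above only works once VCS is running again on the rebooted node. A sketch of a wait loop to run from the surviving node, assuming that hasys -state with a system name prints a single state word such as RUNNING; wait_running and the HASYS variable are hypothetical names, and the retry count and delay are arbitrary defaults.

```shell
# Command used to query the node state; can be overridden for testing.
HASYS=${HASYS:-hasys}

# wait_running node [tries] [delay_seconds]
# Poll until the node reports RUNNING; fail after the given number of tries.
wait_running() {
    node=$1
    tries=${2:-30}
    delay=${3:-10}
    while [ "$tries" -gt 0 ]; do
        state=$($HASYS -state "$node" 2>/dev/null)
        [ "$state" = "RUNNING" ] && return 0
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```

Example: wait_running bar_c && hagrp -switch groupy -to bar_c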
1.12 Starting Cluster Manager (Java Console) and
Configuration Editor
1. After establishing a user account and setting the
display, type the following commands to start Cluster
Manager and Configuration Editor:
* # hagui
* # hacfed
2. Run /opt/VRTSvcs/bin/hagui.