How to shut down the Exadata database nodes and storage cells in a rolling fashion so certain hardware tasks can be performed
The goal of this document is to provide a way to shut down the
Exadata database nodes and storage cells in a rolling fashion so that certain
hardware tasks can be performed on each server. For example, replacing the
RAID controller battery requires downtime on each server. This document
describes how to perform such tasks without taking down the database.
This note separates the tasks into two sections:
- Section 1: Instructions for shutting down and restarting each
database node in a rolling fashion.
- Section 2: Instructions for shutting down and restarting each
storage cell in a rolling fashion.
Section 1 - Database nodes

As root, perform the following on each database server, one at a time:
1) Check whether autostart is enabled:
# $GRID_HOME/bin/crsctl config crs
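As a sketch of interpreting that output: the echoed CRS-4622 line below is sample message text (an assumption about the wording on your release), standing in for the live command.

```shell
# Sketch: interpret "crsctl config crs" output. The echoed CRS-4622 line is
# sample text standing in for the live command; on the node you would pipe
# the real output: $GRID_HOME/bin/crsctl config crs | grep -q 'is enabled'
echo 'CRS-4622: Oracle High Availability Services autostart is enabled.' |
  grep -q 'is enabled' && echo 'autostart is ENABLED; step 2 applies'
```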
2) If the previous step shows that autostart is currently enabled, disable
Grid Infrastructure autostart on the database server:
# $GRID_HOME/bin/crsctl disable crs
Note: This step is optional, but it can be required during maintenance
operations such as firmware patches that reboot the compute node several times.
3) Stop the Grid Infrastructure stack locally on the first database server:
# $GRID_HOME/bin/crsctl stop crs
4) Verify that the Grid Infrastructure stack has shut down
successfully on the database server.
The following command should show no output if the GI stack has shut down:
# ps -ef | grep diskmon
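As a refinement (a sketch, not from the original note), bracketing the first letter keeps grep from matching its own command line, so any output really is a leftover daemon:

```shell
# Sketch: look for a surviving diskmon process. The [d] bracket trick stops
# grep from matching itself; a confirmation is printed when nothing is found.
ps -ef | grep '[d]iskmon' || echo 'diskmon is not running'
```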
5) Confirm that the clusterware resources
are still up and running on the other nodes:
# cd /opt/oracle.SupportTools/onecommand
# dcli -g dbs_group -l root $GRID_HOME/bin/crsctl check crs
7) Before shutting down the DB node, check /etc/fstab for any
NFS mounts that should be unmounted.
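As a sketch of that check, NFS entries can be picked out by their filesystem type. The fstab fragment in the here-document is hypothetical sample data; on the DB node you would run the same awk against /etc/fstab itself.

```shell
# Sketch: print the mount point of each NFS entry in an fstab-style file.
# The here-document is hypothetical sample data, not from a real node; on
# the DB node you would run: awk '$3 ~ /^nfs/ { print $2 }' /etc/fstab
awk '$3 ~ /^nfs/ { print $2 }' <<'EOF'
nas01:/export/backup          /mnt/backup  nfs   rw,hard   0 0
/dev/mapper/VGExaDb-LVDbSys1  /            ext3  defaults  1 1
EOF
```

Each mount point printed can then be released with umount before powering the node off.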
8) Shut down the DB node so that maintenance can be performed:
# poweroff
9) Perform the scheduled maintenance on the DB node. You can now
proceed with the hardware replacement or maintenance.
10) After the hardware has been replaced or the maintenance performed and you are ready to bring the server back up, power on the DB node using the power button on the front panel of the server.
11) Start the Grid Infrastructure stack on the database server once it comes back up:
# $GRID_HOME/bin/crsctl start crs
12) Wait until the Grid Infrastructure stack has
successfully started. To check its status, run
the following command and verify that the "ora.asm" instance is
started. Note that the command below will continue to report that it is unable
to communicate with the Grid Infrastructure software for several minutes after
the "crsctl start crs" command above is issued:
# $GRID_HOME/bin/crsctl status resource -t
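The wait can also be scripted instead of re-running the command by hand. wait_for below is a hypothetical helper, not part of the Oracle tooling, and the commented crsctl line assumes GRID_HOME is set and that a CRS-4537 message appears when the stack is healthy.

```shell
# Hypothetical helper (not Oracle tooling): retry a command until it
# succeeds or the attempt limit is reached, sleeping between tries.
wait_for() {
  tries=$1; shift
  i=0
  until "$@"; do
    i=$((i + 1))
    [ "$i" -ge "$tries" ] && return 1
    sleep 1
  done
}

# On the DB node you might poll the stack like this (assumption; adjust
# the message pattern to your release):
# wait_for 60 sh -c '"$GRID_HOME"/bin/crsctl check crs | grep -q CRS-4537'
```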
13) Re-enable Grid Infrastructure autostart, since it was
disabled earlier in Section 1, step 2:
# $GRID_HOME/bin/crsctl enable crs
14) Confirm that the clusterware resources are up and
running on all nodes:
# cd /opt/oracle.SupportTools/onecommand
# dcli -g dbs_group -l root $GRID_HOME/bin/crsctl check crs
After completing the steps above, repeat the same steps from Section 1
on each remaining node (one at a time) until the hardware has been
replaced on every DB node.
Section 2 - Storage cells

1) Refer to the instructions in the following note:
Steps to shut down or reboot an Exadata storage cell without affecting ASM (Doc ID 1188080.1)
Reference: Doc ID 1539451.1