Exadata — Basics to Pro Series
Managing an Exadata system as a DBA involves two distinct workspaces — the database tier you already know, and the storage cell tier that is unique to Exadata. The storage cell tier has its own command-line interface, its own objects, its own alert system, and its own administration tasks. A DBA who only manages the database layer is managing half of Exadata.
This article covers everything you need for day-to-day storage cell administration. It explains the storage object model — the relationship between physical disks, celldisks, and griddisks — walks through the essential cellcli and dcli commands, shows you how to work with cell alerts and predictive failure warnings, and provides a complete routine health check reference you can run every day.
Two OS users for cellcli: celladmin can run all cellcli commands including those that modify configuration. cellmonitor can run read-only LIST commands only. For routine monitoring, always use cellmonitor. Reserve celladmin for configuration changes. Commands in this article that modify the cell are clearly marked.
The Exadata Storage Object Model
Before running any cellcli commands, you need to understand the three-layer object hierarchy that Exadata uses to represent storage. Every disk in an Exadata cell exists at three levels simultaneously, and each level has a different name, a different purpose, and different administration commands.
| Object | What It Represents | Who Manages It | Typical Naming |
| --- | --- | --- | --- |
| PHYSICALDISK | The actual physical hard disk or flash drive inside the storage cell. Identified by its enclosure and slot position (e.g. 35:3) or flash device name (e.g. FLASH_4_0). | Exadata System Software (auto-detected) | 35:3 (enclosure:slot) or FLASH_4_0 |
| CELLDISK | The logical representation of a physical disk within cellsrv. One celldisk per physical disk. This is the layer where Exadata System Software manages the disk. | Exadata System Software (created automatically) | CD_00_cell01 |
| GRIDDISK | Logical partitions carved from a celldisk and presented to Oracle ASM. One celldisk can have multiple griddisks (e.g. one for DATA, one for RECO). ASM sees griddisks as disk devices. | DBA (can be created, dropped, resized) | DATA_CD_00_cell01, RECO_CD_00_cell01 |
The relationship is: one PHYSICALDISK → one CELLDISK → one or more GRIDDISKs. When a physical disk fails, the celldisk becomes unavailable, which takes its griddisks offline, which causes ASM to drop those disks from the disk group and begin rebalancing. Understanding this chain is essential for diagnosing hardware events.
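The naming pattern in the table can be sketched as a small shell helper. This is only an illustration of the example names shown above: `storage_names` is a hypothetical function, not an Exadata command, and the two-digit disk index and the DATA/RECO prefixes are assumptions taken from those examples. Real systems may use different prefixes.

```shell
#!/bin/bash
# Illustrative only: print the example-style object names for one disk in a cell.
# storage_names is a hypothetical helper, not part of Exadata System Software.
storage_names() {
    local cell="$1" index="$2"
    shift 2
    local celldisk prefix
    # Celldisk name: CD_<two-digit index>_<cell name>
    celldisk=$(printf 'CD_%02d_%s' "$index" "$cell")
    echo "CELLDISK $celldisk"
    # One griddisk per disk group prefix carved from that celldisk
    for prefix in "$@"; do
        echo "GRIDDISK ${prefix}_${celldisk}"
    done
}

storage_names cell01 0 DATA RECO
```

Reading the output from top to bottom follows the hierarchy: one celldisk, then the griddisks that sit on it.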
Connecting to cellcli
cellcli runs only on the storage cell — you cannot run it from a database node. You must SSH to each cell individually. The management network (not the RoCE storage network) is used for cellcli SSH access.
Connecting to a storage cell and entering cellcli
# Connect to a storage cell via SSH using the management hostname
ssh celladmin@cell01-adm # Full admin access — can modify configuration
ssh cellmonitor@cell01-adm # Read-only access — monitoring only
# Open the interactive cellcli prompt
cellcli
# You will see the prompt:
# CellCLI: Release 23.x.x.x — Production on [date]
# CellCLI>
# Run a single command without entering interactive mode
cellcli -e "LIST CELL DETAIL"
# Exit the interactive session
CellCLI> EXIT
cellcli syntax rules: Commands, object type keywords (CELL, CELLDISK, GRIDDISK, etc.), and attribute names are case-insensitive. String values in WHERE filters are case-sensitive and must match exactly. LIKE patterns are regular expressions, not SQL wildcards. At the CellCLI> prompt, end a line with a hyphen (-) to continue a long command on the next line; the backslash (\) is the shell's continuation character and applies only to commands typed at the OS prompt.
Cell-Level Commands
The CELL object represents the entire storage server. Start here for any health check — it gives you the overall status of the cell before drilling into disks or metrics.
Essential CELL commands
# Quick cell status — name, status, and software version
CellCLI> LIST CELL
# Full cell detail — hardware model, software version, all network interfaces
CellCLI> LIST CELL DETAIL
# Check specific attributes only
CellCLI> LIST CELL ATTRIBUTES name, status, releaseVersion
# Check network interconnect interfaces (attributes are numbered from 1)
CellCLI> LIST CELL ATTRIBUTES name, interconnect1, interconnect2
# Check cell uptime and last restart time
CellCLI> LIST CELL ATTRIBUTES name, upTime, restartCount
# Check cell memory utilisation (percentage of physical memory in use)
CellCLI> LIST METRICCURRENT CL_MEMUT
# Check CPU utilisation
CellCLI> LIST METRICCURRENT CL_CPUT
# Check cell temperature (thermal status)
CellCLI> LIST METRICCURRENT CL_TEMP
Managing Physical Disks — PHYSICALDISK
Physical disks are the raw hardware inside the storage cell. The most important thing to monitor at this level is the disk status. A status of normal is expected. Any deviation — particularly warning - predictive failure — requires immediate attention.
PHYSICALDISK monitoring commands
# List all physical disks with status — quick overview
CellCLI> LIST PHYSICALDISK
# Full detail for all physical disks
CellCLI> LIST PHYSICALDISK DETAIL
# List only specific attributes for all disks
CellCLI> LIST PHYSICALDISK ATTRIBUTES name, diskType, status, serialNumber
# List only hard disks
CellCLI> LIST PHYSICALDISK WHERE diskType = 'HardDisk'
# List only flash drives
CellCLI> LIST PHYSICALDISK WHERE diskType = 'FlashDisk'
# CRITICAL: Check for any disk not in normal status
CellCLI> LIST PHYSICALDISK WHERE status != 'normal'
# Check for predictive failure specifically (pre-failure warning)
# A trailing hyphen continues the command at the CellCLI> prompt
CellCLI> LIST PHYSICALDISK -
WHERE diskType = 'HardDisk' -
AND status = 'warning - predictive failure' -
DETAIL
Common status values for PHYSICALDISK:
- normal: disk is healthy.
- warning - predictive failure: S.M.A.R.T. diagnostics predict this disk will fail. Replace proactively before it fails completely.
- failed: disk has failed. ASM will drop the associated griddisks and rebalance.
- not present: no disk in this slot. Expected for empty slots.
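The status list above maps naturally onto a lookup. The sketch below is a hypothetical helper (`disk_action` is not an Exadata command) that turns a status string into the response described in this section. Anything it does not recognise is treated as something to investigate.

```shell
#!/bin/bash
# Map a PHYSICALDISK status string to the response described in this article.
# disk_action is a hypothetical helper for illustration only.
disk_action() {
    case "$1" in
        normal)                         echo "none" ;;
        "warning - predictive failure") echo "raise SR and replace proactively" ;;
        failed)                         echo "replace disk" ;;
        "not present")                  echo "none if the slot is meant to be empty" ;;
        *)                              echo "investigate" ;;
    esac
}

disk_action "warning - predictive failure"
```

The catch-all branch matters because a cell can report statuses beyond the four listed here.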
Managing Celldisks — CELLDISK
Celldisks are the logical representation of physical disks within Exadata System Software. They are created automatically when the cell is initialised and correspond one-to-one with physical disks. The DBA rarely needs to create or drop celldisks manually — but checking their status and free space is part of routine administration.
CELLDISK monitoring and management commands
# List all celldisks with status
CellCLI> LIST CELLDISK
# Full detail for all celldisks
CellCLI> LIST CELLDISK DETAIL
# Check status, size, and free space for all celldisks
CellCLI> LIST CELLDISK ATTRIBUTES name, status, size, freeSpace
# Check for any celldisk not in normal status
CellCLI> LIST CELLDISK WHERE status != 'normal'
# Check a specific celldisk in detail
CellCLI> LIST CELLDISK CD_00_cell01 DETAIL
# List all current metrics collected for celldisks (I/O requests, throughput, latency)
CellCLI> LIST METRICCURRENT WHERE objectType = 'CELLDISK'
# Celldisk I/O requests — large (Smart Scan) and small (OLTP)
CellCLI> LIST METRICCURRENT CD_IO_RQ_R_LG
CellCLI> LIST METRICCURRENT CD_IO_RQ_R_SM
Celldisk status values mirror physicaldisk status. If a physicaldisk enters a failed state, the corresponding celldisk will also show as failed or in an error state. When a celldisk fails, all griddisks on that celldisk become unavailable and ASM rebalance begins automatically.
Managing Griddisks — GRIDDISK
Griddisks are what Oracle ASM sees as individual disk devices. Each celldisk is partitioned into one or more griddisks — typically one for the DATA disk group and one for the RECO disk group. The griddisk is the unit that ASM adds to or drops from a disk group.
Griddisks are the most frequently administered storage object. You will work with griddisks when taking a cell offline for maintenance, restoring a replaced disk, and verifying disk group membership after a hardware event.
GRIDDISK monitoring and management commands
# List all griddisks with status
CellCLI> LIST GRIDDISK
# Full detail for all griddisks
CellCLI> LIST GRIDDISK DETAIL
# List griddisks with key attributes
CellCLI> LIST GRIDDISK ATTRIBUTES -
name, asmDiskName, asmDiskGroupName, status, size
# Check for any griddisk not in active status
CellCLI> LIST GRIDDISK WHERE status != 'active'
# List griddisks for a specific ASM disk group
CellCLI> LIST GRIDDISK WHERE asmDiskGroupName = 'DATA'
# Check a specific griddisk in detail
CellCLI> LIST GRIDDISK DATA_CD_00_cell01 DETAIL
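When the griddisk listing is long, it can help to filter it on the OS side. The sketch below assumes two whitespace-separated columns (name, then status), as you would get from a two-attribute LIST. That layout is an assumption, `inactive_griddisks` is a hypothetical helper, and the sample text is made up for the example rather than captured from a cell.

```shell
#!/bin/bash
# Print the names of griddisks whose status is not "active".
# Input: lines of "<name> <status>"; adjust the awk fields if your layout differs.
inactive_griddisks() {
    awk 'NF >= 2 && $2 != "active" { print $1 }'
}

# Hypothetical sample output, not captured from a real cell
sample='DATA_CD_00_cell01 active
DATA_CD_01_cell01 inactive
RECO_CD_00_cell01 active
RECO_CD_01_cell01 inactive'

printf '%s\n' "$sample" | inactive_griddisks
```

On a cell, the same function could be fed from `cellcli -e "LIST GRIDDISK ATTRIBUTES name, status"`.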
Taking a griddisk offline and online — for planned maintenance
When you need to take a storage cell offline for maintenance (patching, hardware work), you must first quiesce the griddisks so ASM can handle the absence gracefully. This is a celladmin-only operation.
Take griddisks inactive for planned cell maintenance
# Step 1: Confirm ASM can tolerate losing this cell's griddisks
# Every griddisk should report asmDeactivationOutcome = Yes
CellCLI> LIST GRIDDISK ATTRIBUTES name, asmModeStatus, asmDeactivationOutcome
# Step 2: Take all griddisks on this cell INACTIVE
# ASM takes the matching disks offline; it does not drop them or rebalance
# unless they stay offline longer than the disk group's disk_repair_time
CellCLI> ALTER GRIDDISK ALL INACTIVE
# Step 3: Verify all griddisks show inactive status
CellCLI> LIST GRIDDISK ATTRIBUTES name, status
# Step 4: From the DB node, confirm no rebalance operations are in progress
-- sqlplus / as sysasm
-- SELECT * FROM v$asm_operation WHERE state = 'RUN';
# Step 5: Perform maintenance on the cell
# Step 6: After maintenance, bring griddisks ACTIVE again
CellCLI> ALTER GRIDDISK ALL ACTIVE
# Step 7: Wait until every griddisk shows status active and asmModeStatus ONLINE
# (SYNCING means ASM is still resynchronising the disks)
CellCLI> LIST GRIDDISK ATTRIBUTES name, status, asmModeStatus
Do not take all cells offline simultaneously. ASM needs enough cells online to satisfy the disk group redundancy requirements. A normal redundancy disk group (2-way mirroring) tolerates one cell offline at a time. A high redundancy disk group (3-way mirroring) can tolerate two cells offline, but for planned maintenance work on one cell at a time and confirm that its griddisks are back online and fully resynchronised in ASM before moving to the next cell.
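A common pre-check before deactivating griddisks is the griddisk attribute asmDeactivationOutcome, which reports whether ASM can tolerate that griddisk going offline. The sketch below assumes three whitespace-separated columns (name, asmModeStatus, asmDeactivationOutcome). `safe_to_deactivate` is a hypothetical helper and the sample text is invented for the example.

```shell
#!/bin/bash
# Succeed (exit 0) only if every griddisk reports asmDeactivationOutcome = Yes.
# Input: lines of "<name> <asmModeStatus> <asmDeactivationOutcome...>".
safe_to_deactivate() {
    awk 'NF >= 3 && $3 != "Yes" { bad = 1; print "NOT SAFE: " $1 }
         END { exit bad }'
}

# Hypothetical sample output, not captured from a real cell
sample='DATA_CD_00_cell01 ONLINE Yes
RECO_CD_00_cell01 ONLINE Yes'

if printf '%s\n' "$sample" | safe_to_deactivate; then
    echo "all griddisks can be taken inactive"
fi
```

On a cell, the input would come from `cellcli -e "LIST GRIDDISK ATTRIBUTES name, asmModeStatus, asmDeactivationOutcome"`. Any value other than Yes is a reason to stop and investigate before proceeding.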
The Exadata Alert System
Exadata System Software monitors hundreds of metrics across every cell component — disks, CPUs, temperature sensors, fans, network interfaces, flash devices. When a metric crosses a threshold, an alert is generated and recorded in the alert history. Understanding how to read and manage alerts is one of the most important Exadata administration skills.
Alert severity levels
| Severity | Meaning | Action Required |
| --- | --- | --- |
| critical | Component has failed or is at immediate risk. Data availability may be impacted. | Immediate: escalate to Oracle Support and hardware team |
| warning | Component is degraded or approaching a failure threshold. Not yet causing data loss. | Investigate within the same business day and plan remediation |
| info (informational) | A notable event occurred but no failure. System state changes, rebalance completions, etc. | Review; no immediate action usually required |
| clear | A previously raised alert has been resolved automatically. | Verify the root cause is genuinely resolved |
Working with ALERTHISTORY
# List all alerts in the history
CellCLI> LIST ALERTHISTORY
# Show full detail for all alerts
CellCLI> LIST ALERTHISTORY DETAIL
# Show only critical and warning alerts
CellCLI> LIST ALERTHISTORY WHERE severity LIKE '[warning|critical]'
# Show only alerts not yet acknowledged (examinedBy is empty)
CellCLI> LIST ALERTHISTORY -
WHERE severity LIKE '[warning|critical]' -
AND examinedBy = ''
# Show alerts raised after a specific time
CellCLI> LIST ALERTHISTORY WHERE beginTime > '2026-05-22T00:00:00'
# Mark an alert as examined (acknowledge it)
# Replace <alert_id> with the numeric ID from LIST ALERTHISTORY
CellCLI> ALTER ALERTHISTORY <alert_id> examinedBy = 'your_name'
# Mark all alerts as examined
CellCLI> ALTER ALERTHISTORY ALL examinedBy = 'your_name'
# Drop old alert history entries
CellCLI> DROP ALERTHISTORY <alert_id>
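For a quick triage view, the alert listing can be summarised by severity on the OS side. The sketch below assumes two whitespace-separated columns (alert name, then severity), as a two-attribute LIST would produce. `severity_counts` is a hypothetical helper and the sample rows are invented.

```shell
#!/bin/bash
# Count alerts per severity.
# Input: lines of "<alert name> <severity>"; the layout is an assumption.
severity_counts() {
    awk 'NF >= 2 { count[$2]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Hypothetical sample output, not captured from a real cell
sample='1_1 critical
2_1 warning
3 info
4_1 warning'

printf '%s\n' "$sample" | severity_counts
```

On a cell, the input could come from `cellcli -e "LIST ALERTHISTORY ATTRIBUTES name, severity"`.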
Handling Predictive Failure Warnings
A predictive failure warning is one of the most important alerts an Exadata DBA receives. It means S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) diagnostics inside the disk have detected early signs of failure: the disk has not yet failed, but its internal health indicators predict it will fail soon. Replace the disk before it fails completely, so that the disk group does not run with reduced redundancy and the swap happens on your schedule rather than as an emergency.
Step-by-step response to a predictive failure warning
Step 1 — Identify the failing disk
# Connect to the affected cell
ssh celladmin@cell01-adm
cellcli
# Find the predictive failure disk — get its slot and serial number
CellCLI> LIST PHYSICALDISK -
WHERE diskType = 'HardDisk' -
AND status = 'warning - predictive failure' -
DETAIL
# Note the output. Key fields to record:
# name: 28:3 <-- disk name (enclosure:slot)
# deviceId: 19 <-- device ID
# serialNumber: ABC123DEF456 <-- serial for Oracle Support SR
# status: warning - predictive failure
# slotNumber: 3 <-- physical slot in the cell
# Also check the associated celldisk and griddisks
# (the celldisk's physicalDisk attribute identifies the underlying disk)
CellCLI> LIST CELLDISK ATTRIBUTES name, status, physicalDisk
CellCLI> LIST GRIDDISK WHERE cellDisk = 'CD_03_cell01' DETAIL
Step 2 — Check the alert history for this disk
# Review all alerts associated with this cell
CellCLI> LIST ALERTHISTORY WHERE severity LIKE '[warning|critical]' DETAIL
# Check from the database side as well
-- sqlplus / as sysasm
-- SELECT path, mode_status, state, total_mb, free_mb
-- FROM v$asm_disk
-- ORDER BY path;
Step 3 — Raise Oracle Support SR and prepare for disk replacement
# On current Exadata System Software, a predictive-failure disk normally has its
# griddisks dropped from ASM automatically, and the celldisk and griddisks are
# re-created automatically once the replacement disk is detected.
# The manual commands below are for cases where that automation has not run.
# Take the griddisks on the affected celldisk INACTIVE (celladmin only)
# This takes the matching ASM disks offline ahead of the disk swap
CellCLI> ALTER GRIDDISK DATA_CD_03_cell01, RECO_CD_03_cell01 INACTIVE
# Verify griddisks are inactive
CellCLI> LIST GRIDDISK ATTRIBUTES name, status
# From the database node, wait for any running ASM rebalance to finish
-- SELECT group_number, operation, state, est_minutes
-- FROM v$asm_operation;
# After the disk is physically replaced by Oracle hardware support and
# Exadata System Software detects the new disk, check whether the celldisk
# and griddisks already exist before creating anything
CellCLI> LIST CELLDISK CD_03_cell01 DETAIL
# If they were not re-created, create the celldisk on the replacement disk
CellCLI> CREATE CELLDISK CD_03_cell01 physicalDisk = '28:3'
# Create griddisks on the new celldisk, matching the original sizes and offsets
# (18T and 4T are placeholders)
CellCLI> CREATE GRIDDISK DATA_CD_03_cell01 -
celldisk = 'CD_03_cell01', -
size = 18T, offset = 0
CellCLI> CREATE GRIDDISK RECO_CD_03_cell01 -
celldisk = 'CD_03_cell01', -
size = 4T, offset = 18T
# ASM will detect the new griddisks and begin rebalancing automatically
# Monitor from the database:
-- SELECT group_number, operation, state, est_minutes FROM v$asm_operation;
Always raise an Oracle Support SR before disk replacement, even for predictive failures. Oracle Support has tools to remotely diagnose the disk and will guide the replacement procedure. For systems under Oracle-managed Platinum support, Oracle may initiate the replacement proactively before you even see the alert.
The Exadata Alert Log
In addition to the cellcli alert history, Exadata System Software maintains an alert log on each storage cell — similar in concept to the Oracle Database alert log. This log records all significant events including daemon starts and stops, disk events, configuration changes, and hardware alerts.
Finding and reading the Exadata alert log on a storage cell
# The alert log is on the storage cell OS -- SSH to the cell first
ssh celladmin@cell01-adm
# Main cellsrv alert log location
cat /opt/oracle/cell/log/diag/asm/cell/cell01/alert/log.xml
# A plain-text version of the alert log (alert.log) is kept in the trace directory
less /opt/oracle/cell/log/diag/asm/cell/cell01/trace/alert.log
# Tail the live alert log to watch for new events
tail -f /opt/oracle/cell/log/diag/asm/cell/cell01/trace/alert.log
# Search for specific error types
grep -i "error\|warning\|critical\|ORA-" \
/opt/oracle/cell/log/diag/asm/cell/cell01/alert/log.xml | tail -50
# Check cellsrv trace files for detailed diagnostics
ls -lt /opt/oracle/cell/log/diag/asm/cell/cell01/trace/ | head -20
Check cell system logs for hardware events
# Check the OS system log for hardware events (disk I/O errors, etc.)
tail -100 /var/log/messages | grep -i "error\|fail\|disk"
# Check IPMI/BMC hardware event log via ipmitool
ipmitool sel list | tail -20
# Check for disk I/O errors in the kernel ring buffer
dmesg | grep -i "error\|i/o error\|disk" | tail -20
dcli — Running Commands Across All Cells
dcli (Distributed CLI) executes the same command on multiple cells simultaneously and collects the combined output. It is the most time-efficient way to check the health of all cells in one command rather than SSH-ing into each one individually.
dcli setup and syntax
# The cell group file lists one cell hostname per line
cat /opt/oracle.SupportTools/onecommand/cell_group
# Basic syntax
# dcli -g <group_file> [-l <username>] "<command>"
# Default runs as celladmin
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"cellcli -e LIST CELL"
# Run as cellmonitor (read-only) -- specify username with -l
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
-l cellmonitor \
"cellcli -e LIST CELL ATTRIBUTES name, status"
# Run on specific cells only -- create a subset group file
echo -e "cell01\ncell02\ncell03" > /tmp/three_cells.txt
dcli -g /tmp/three_cells.txt "cellcli -e LIST CELL"
Essential dcli daily health checks
# 1. Check all cell daemon status
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"service celld status"
# 2. Check Exadata System Software version on all cells
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"cellcli -e LIST CELL ATTRIBUTES name, releaseVersion"
# 3. Check for any alerts across all cells
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"cellcli -e LIST ALERTHISTORY WHERE severity LIKE '[warning|critical]'"
# 4. Check for any disk NOT in normal status across all cells
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"cellcli -e LIST PHYSICALDISK WHERE status != 'normal'"
# 5. Check predictive failure disks specifically
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"cellcli -e \"LIST PHYSICALDISK WHERE status = \
'warning - predictive failure'\""
# 6. Check all griddisk status
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"cellcli -e LIST GRIDDISK WHERE status != 'active'"
# 7. Check cell CPU across all cells
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"cellcli -e LIST METRICCURRENT CL_CPUT"
# 8. Check flash cache status on all cells
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"cellcli -e LIST FLASHCACHE DETAIL"
# 9. Check Storage Index savings across all cells
# (LIKE takes a regular expression, so use .* rather than the SQL % wildcard)
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"cellcli -e LIST METRICCURRENT WHERE name LIKE 'SIO_IO_SI_SV.*'"
# 10. Check for any celldisk not in normal state
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
"cellcli -e LIST CELLDISK WHERE status != 'normal'"
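dcli prefixes every output line with the name of the cell that produced it, followed by a colon. For the "problems only" checks above, that makes it easy to see which cells reported anything at all. The sketch below is a hypothetical helper (`cells_reporting`) run against invented sample text. The serial numbers and statuses are placeholders.

```shell
#!/bin/bash
# Summarise dcli output: one line per cell with the number of lines it returned.
# Relies on dcli's "<cell>: <text>" line prefix.
cells_reporting() {
    awk -F': ' 'NF >= 2 { n[$1]++ }
                END { for (c in n) print c, n[c] }' | sort
}

# Hypothetical sample of dcli output, not captured from a real system
sample='cell02: 20:5 K4XYZ1 warning - predictive failure
cell03: 20:1 K4XYZ2 failed
cell03: 20:7 K4XYZ3 failed'

printf '%s\n' "$sample" | cells_reporting
```

Cells that do not appear in the summary returned no rows for that check.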
Managing Cell Services — Starting and Stopping
The three cell daemons — cellsrv, MS (Management Server), and RS (Restart Server) — run as a single service called celld. RS is the watchdog process that monitors and automatically restarts cellsrv and MS if they fail. In normal operations, you should rarely need to manually start or stop cell services.
Cell service management commands — run as root on the cell OS
# Check status of all cell daemons
service celld status
# Start all cell services (RS starts first, then MS, then cellsrv)
service celld start
# Stop all cell services gracefully
service celld stop
# Restart a specific daemon without stopping others
# Use cellcli ALTER commands instead of OS-level restart when possible
CellCLI> ALTER CELL RESTART SERVICES cellsrv
CellCLI> ALTER CELL RESTART SERVICES ms
CellCLI> ALTER CELL RESTART SERVICES rs
# Restart all services (equivalent to stop then start)
CellCLI> ALTER CELL RESTART SERVICES ALL
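# Stop all cell services from within cellcli (use before hardware maintenance)
# This stops the services only; power the server off from the OS if required
CellCLI> ALTER CELL SHUTDOWN SERVICES ALL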
Never restart cellsrv during active database I/O without first taking the griddisks INACTIVE. An abrupt cellsrv restart while databases are actively writing can cause ASM to take the affected disks offline, and if they stay offline beyond the disk group's disk_repair_time, ASM drops them and starts an unplanned rebalance. Always take griddisks INACTIVE first for planned maintenance.
Flash Cache Administration
Smart Flash Cache is managed by Exadata System Software automatically — it populates, evicts, and manages cache contents without DBA intervention. However, there are specific administration tasks a DBA needs to perform, particularly during troubleshooting or after hardware replacement.
Flash cache administration commands
# Check flash cache status and configuration
CellCLI> LIST FLASHCACHE DETAIL
# Check flash cache utilisation metrics
CellCLI> LIST METRICCURRENT FC_BY_USED
CellCLI> LIST METRICCURRENT FC_IO_BY_R_SEC
# List what is cached in the flash cache (be careful -- large output)
CellCLI> LIST FLASHCACHECONTENT
# Flush the flash cache (writes dirty cached data back to disk)
# Use only when directed by Oracle Support
# This causes temporary performance degradation as cache repopulates
CellCLI> ALTER FLASHCACHE ALL FLUSH
# Drop and recreate the flash cache (after flash hardware replacement)
CellCLI> DROP FLASHCACHE
CellCLI> CREATE FLASHCACHE ALL
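FC_BY_USED reports the space in use in megabytes rather than as a percentage, so a utilisation figure has to be derived from it and the flash cache size. The sketch below shows only the arithmetic. `fc_used_pct` is a hypothetical helper, both inputs are assumed to be in MB already, and the figures are made up.

```shell
#!/bin/bash
# Percentage of flash cache in use, given used MB and total MB.
# Thousands separators are stripped so values like 1,500,000 are accepted.
fc_used_pct() {
    local used size
    used=$(printf '%s' "$1" | tr -d ',')
    size=$(printf '%s' "$2" | tr -d ',')
    awk -v u="$used" -v s="$size" 'BEGIN { if (s > 0) printf "%.1f\n", 100 * u / s }'
}

# Hypothetical figures: 1,500,000 MB used out of 6,000,000 MB
fc_used_pct "1,500,000" 6000000
```

If the flash cache size is reported in gigabytes or terabytes, convert it to MB before calling the function.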
Routine Health Check — Full Command Reference
This is the complete daily administration runbook for Exadata storage cell health. Run all sections at the start of each working day. Total time: under 5 minutes when run via dcli.
Daily Exadata health check — run from the primary DB node
#!/bin/bash
# DAILY EXADATA STORAGE CELL HEALTH CHECK
# Run from primary DB node as oracle or applmgr
# Requires SSH key-based authentication to all cells
CELL_GROUP=/opt/oracle.SupportTools/onecommand/cell_group
echo "=============================================="
echo " EXADATA DAILY HEALTH CHECK - $(date)"
echo "=============================================="
echo ""
echo "--- 1. CELL DAEMON STATUS ---"
dcli -g $CELL_GROUP "service celld status | grep -iE 'rsStatus|msStatus|cellsrvStatus'"
echo ""
echo "--- 2. CELL SOFTWARE VERSION ---"
dcli -g $CELL_GROUP \
"cellcli -e LIST CELL ATTRIBUTES name, releaseVersion"
echo ""
echo "--- 3. CRITICAL AND WARNING ALERTS ---"
dcli -g $CELL_GROUP \
"cellcli -e LIST ALERTHISTORY WHERE severity LIKE '[warning|critical]'"
echo ""
echo "--- 4. PHYSICAL DISKS NOT IN NORMAL STATUS ---"
dcli -g $CELL_GROUP \
"cellcli -e LIST PHYSICALDISK WHERE status != 'normal'"
echo ""
echo "--- 5. GRIDDISKS NOT IN ACTIVE STATUS ---"
dcli -g $CELL_GROUP \
"cellcli -e LIST GRIDDISK WHERE status != 'active'"
echo ""
echo "--- 6. CELLDISKS NOT IN NORMAL STATUS ---"
dcli -g $CELL_GROUP \
"cellcli -e LIST CELLDISK WHERE status != 'normal'"
echo ""
echo "--- 7. CELL CPU UTILISATION ---"
dcli -g $CELL_GROUP \
"cellcli -e LIST METRICCURRENT CL_CPUT"
echo ""
echo "--- 8. FLASH CACHE UTILISATION ---"
dcli -g $CELL_GROUP \
"cellcli -e LIST METRICCURRENT FC_BY_USED"
echo ""
echo "=============================================="
echo " HEALTH CHECK COMPLETE"
echo "=============================================="
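Several of the checks in the script are healthy when they print nothing. One way to make that explicit, sketched here under that assumption, is a wrapper that treats any output as a failure and keeps a running count. `expect_empty` and the `failures` variable are hypothetical names, and the stand-in commands (`true`, `echo`) are used only so the sketch runs without access to a cell.

```shell
#!/bin/bash
# Run a check whose healthy result is "no output"; count anything else as a failure.
failures=0
expect_empty() {
    local label="$1"
    shift
    local out
    out=$("$@" 2>&1)
    if [ -n "$out" ]; then
        echo "FAIL: $label"
        echo "$out"
        failures=$((failures + 1))
    else
        echo "OK:   $label"
    fi
}

# Stand-in commands so the sketch runs anywhere; on Exadata these would be
# the dcli invocations from the health check script.
expect_empty "physical disks" true
expect_empty "griddisks" echo "cell02: DATA_CD_01_cell02 inactive"
echo "failures: $failures"
```

A scheduled job could exit with a non-zero status when `failures` is greater than zero, so that a monitoring tool picks up the result.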
Complete cellcli Command Reference
| Command | What It Does | Access Required |
| --- | --- | --- |
| LIST CELL | Cell status overview | cellmonitor |
| LIST CELL DETAIL | Full cell info including hardware model, software version, network | cellmonitor |
| LIST PHYSICALDISK | All physical disk status | cellmonitor |
| LIST PHYSICALDISK WHERE status != 'normal' | Disks with problems only | cellmonitor |
| LIST CELLDISK | All celldisk status and size | cellmonitor |
| LIST CELLDISK WHERE status != 'normal' | Celldisks with problems | cellmonitor |
| LIST GRIDDISK | All griddisk status and ASM mapping | cellmonitor |
| LIST GRIDDISK WHERE status != 'active' | Griddisks not in active state | cellmonitor |
| LIST ALERTHISTORY | All cell alerts | cellmonitor |
| LIST ALERTHISTORY WHERE severity LIKE '[warning\|critical]' | Warning and critical alerts only | cellmonitor |
| LIST FLASHCACHE DETAIL | Flash cache status and configuration | cellmonitor |
| LIST METRICCURRENT CL_CPUT | Cell CPU utilisation | cellmonitor |
| LIST METRICCURRENT FC_BY_USED | Flash cache space in use (MB) | cellmonitor |
| LIST METRICCURRENT CD_IO_RQ_R_LG | Large I/O (Smart Scan) read requests | cellmonitor |
| LIST IORMPLAN | I/O Resource Management plan | cellmonitor |
| ALTER GRIDDISK ALL INACTIVE | Take all griddisks offline for maintenance | celladmin |
| ALTER GRIDDISK ALL ACTIVE | Bring all griddisks back online | celladmin |
| ALTER ALERTHISTORY <id> examinedBy = 'name' | Acknowledge an alert | celladmin |
| ALTER CELL RESTART SERVICES cellsrv | Restart the cellsrv daemon | celladmin |
| ALTER FLASHCACHE ALL FLUSH | Flush the flash cache | celladmin |
| CREATE GRIDDISK | Create a new griddisk on a celldisk | celladmin |
| DROP GRIDDISK | Remove a griddisk | celladmin |
Summary
- Exadata storage has three object levels: PHYSICALDISK (hardware) → CELLDISK (cell software layer) → GRIDDISK (presented to ASM). Understanding this chain is essential for hardware event diagnosis.
- cellcli runs on storage cells only, so you SSH to each cell individually. Use cellmonitor for read-only monitoring and celladmin for configuration changes.
- The most critical daily check is LIST PHYSICALDISK WHERE status != 'normal'. A warning - predictive failure status means the disk must be replaced proactively before it fails completely.
- dcli runs any cellcli command across all cells simultaneously using a group file. It is the only practical way to check all cells every day.
- The cell alert system records all significant events in ALERTHISTORY. Check for unacknowledged warning and critical alerts daily using the dcli command in the health check script.
- The Exadata alert log on each cell (under /opt/oracle/cell/log/diag/asm/cell/) provides detailed event history for deep troubleshooting.
- Before any planned cell maintenance, always take griddisks INACTIVE first and confirm from ASM that no rebalance is in progress. Never restart cellsrv with griddisks in active state during database I/O.