Exadata administration essentials: cellcli, dcli, and managing storage cells day to day. The day-to-day administration guide covers the cellcli command line interface (how to connect, list objects, check disk status, manage griddisks and celldisks), dcli for running commands across all storage cells simultaneously, and how to check cell alerts, manage the Exadata alert log, handle predictive failure warnings, and perform routine cell health checks.

Storage object model table — PHYSICALDISK → CELLDISK → GRIDDISK — the three-level chain and what breaks when a disk fails

cellcli connection guide — celladmin vs cellmonitor, interactive vs single command, syntax rules

Full cellcli command vocabulary: all major object types and actions available in the tool

PHYSICALDISK commands — list all, list by type, filter by status, full detail

Physical disk status values explained: normal, warning - predictive failure, failed, not present, with real-world output example showing slot numbers and flash device names

CELLDISK commands — list, detail, status filter, free space check, I/O metrics

GRIDDISK commands — list, detail, ASM mapping, status filter, disk group filter

Taking griddisks INACTIVE/ACTIVE for planned maintenance — with the correct ASM rebalance wait step and redundancy rules

Alert severity levels table — critical, warning, informational, clear — action required for each

ALERTHISTORY commands: list, filter by severity, filter unacknowledged, acknowledge, drop

Full predictive failure response workflow: identify disk, record serial number, take griddisks inactive, wait for ASM rebalance, create celldisk and griddisks after replacement

Exadata alert log location and grep commands for deep troubleshooting

dcli reference — syntax, username flag, subset group files, 10 essential daily dcli commands

Cell service management — service celld start/stop/status, ALTER CELL RESTART, shutdown sequence

Flash cache administration — list, flush, drop and recreate

Ready-to-run bash health check script — 8 checks via dcli, runs in under 5 minutes

Complete cellcli command reference table — 22 commands with access level required

Exadata Administration Essentials — cellcli, dcli, and Managing Storage Cells Day to Day | punitoracledba

Oracle Exadata Series — Basics to Pro · Article 7 of 10 · punitoracledba.blogspot.com


Exadata — Basics to Pro Series 1. What Is Exadata · 2. Hardware Components · 3. Architecture Deep Dive · 4. Smart Scan, Storage Indexes, HCC · 5. Monitoring · 6. Performance Tuning · 7. Administration · 8. Patching · 9. EBS on Exadata · 10. OCI Exadata

Managing an Exadata system as a DBA involves two distinct workspaces — the database tier you already know, and the storage cell tier that is unique to Exadata. The storage cell tier has its own command-line interface, its own objects, its own alert system, and its own administration tasks. A DBA who only manages the database layer is managing half of Exadata.

This article covers everything you need for day-to-day storage cell administration. It explains the storage object model — the relationship between physical disks, celldisks, and griddisks — walks through the essential cellcli and dcli commands, shows you how to work with cell alerts and predictive failure warnings, and provides a complete routine health check reference you can run every day.

Two OS users for cellcli: celladmin can run all cellcli commands including those that modify configuration. cellmonitor can run read-only LIST commands only. For routine monitoring, always use cellmonitor. Reserve celladmin for configuration changes. Commands in this article that modify the cell are clearly marked.

The Exadata Storage Object Model

Before running any cellcli commands, you need to understand the three-layer object hierarchy that Exadata uses to represent storage. Every disk in an Exadata cell exists at three levels simultaneously, and each level has a different name, a different purpose, and different administration commands.

Object	What It Represents	Who Manages It	Typical Naming
PHYSICALDISK	The actual physical hard disk or flash drive inside the storage cell. Identified by its enclosure and slot position (e.g. 35:3) or flash device name (e.g. FLASH_4_0).	Exadata System Software (auto-detected)	`35:3` (enclosure:slot) or `FLASH_4_0`
CELLDISK	The logical representation of a physical disk within cellsrv. One celldisk per physical disk. This is the layer where Exadata System Software manages the disk.	Exadata System Software — created automatically	`CD_00_cell01`
GRIDDISK	Logical partitions carved from a celldisk and presented to Oracle ASM. One celldisk can have multiple griddisks (e.g. one for DATA, one for RECO). ASM sees griddisks as disk devices.	DBA — can be created, dropped, resized	`DATA_CD_00_cell01`, `RECO_CD_00_cell01`

The relationship is: one PHYSICALDISK → one CELLDISK → one or more GRIDDISKs. When a physical disk fails, the celldisk becomes unavailable, which takes its griddisks offline, which causes ASM to drop those disks from the disk group and begin rebalancing. Understanding this chain is essential for diagnosing hardware events.
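The chain can also be followed by name. As a minimal sketch, assuming the default naming convention from the table above (a griddisk is named `<prefix>_<celldisk name>`), the hypothetical helper `celldisk_of_griddisk` below derives the parent celldisk from a griddisk name. It is plain string handling and does not query the cell; sites that renamed their griddisks would need the `cellDisk` attribute from `LIST GRIDDISK` instead.

```shell
# Hypothetical helper: derive the celldisk name from a griddisk name.
# Assumes the default naming pattern <PREFIX>_CD_<nn>_<cellname>.
celldisk_of_griddisk() {
  gd_name="$1"
  case "$gd_name" in
    *_CD_*)
      # Strip everything up to and including the first "_CD_", then re-add "CD_"
      printf 'CD_%s\n' "${gd_name#*_CD_}"
      ;;
    *)
      echo "name does not follow the default pattern: $gd_name" >&2
      return 1
      ;;
  esac
}

celldisk_of_griddisk DATA_CD_00_cell01   # prints CD_00_cell01
```

When a celldisk reports a problem, the same convention tells you which griddisk names to look for in ASM.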

Connecting to cellcli

cellcli runs only on the storage cell — you cannot run it from a database node. You must SSH to each cell individually. The management network (not the RoCE storage network) is used for cellcli SSH access.

Connecting to a storage cell and entering cellcli

# Connect to a storage cell via SSH using the management hostname
ssh celladmin@cell01-adm       # Full admin access — can modify configuration
ssh cellmonitor@cell01-adm     # Read-only access — monitoring only

# Open the interactive cellcli prompt
cellcli

# You will see the prompt:
# CellCLI: Release 23.x.x.x — Production on [date]
# CellCLI>

# Run a single command without entering interactive mode
cellcli -e "LIST CELL DETAIL"

# Exit the interactive session
CellCLI> EXIT

cellcli syntax rules: Commands, object type keywords (CELL, CELLDISK, GRIDDISK, etc.), and attribute names are case-insensitive. String values in WHERE filters are case-sensitive and must match exactly. Inside the interactive CellCLI> prompt, end a line with a hyphen (-) to continue a long command on the next line. The backslash (\) is the shell's continuation character and applies only when you run cellcli -e or dcli from the OS command line.

Cell-Level Commands

The CELL object represents the entire storage server. Start here for any health check — it gives you the overall status of the cell before drilling into disks or metrics.

Essential CELL commands

# Quick cell status: name, status, and software version
CellCLI> LIST CELL

# Full cell detail: hardware model, software version, all network interfaces
CellCLI> LIST CELL DETAIL

# Check specific attributes only
CellCLI> LIST CELL ATTRIBUTES name, status, releaseVersion

# Check network interconnect interfaces (numbering starts at interconnect1)
CellCLI> LIST CELL ATTRIBUTES name, interconnect1, interconnect2

# Check cell uptime and restart count
CellCLI> LIST CELL ATTRIBUTES name, upTime, restartCount

# Check cell memory utilisation
CellCLI> LIST METRICCURRENT CL_MEMUT

# Check CPU utilisation
CellCLI> LIST METRICCURRENT CL_CPUT

# Check cell temperature (thermal status)
CellCLI> LIST METRICCURRENT CL_TEMP

Managing Physical Disks — PHYSICALDISK

Physical disks are the raw hardware inside the storage cell. The most important thing to monitor at this level is the disk status. A status of normal is expected. Any deviation — particularly warning - predictive failure — requires immediate attention.

PHYSICALDISK monitoring commands

# List all physical disks with status (quick overview)
CellCLI> LIST PHYSICALDISK

# Full detail for all physical disks
CellCLI> LIST PHYSICALDISK DETAIL

# List only specific attributes for all disks (physicalSerial is the serial number)
CellCLI> LIST PHYSICALDISK ATTRIBUTES name, diskType, status, physicalSerial

# List only hard disks
CellCLI> LIST PHYSICALDISK WHERE diskType = 'HardDisk'

# List only flash drives
CellCLI> LIST PHYSICALDISK WHERE diskType = 'FlashDisk'

# CRITICAL: Check for any disk not in normal status
CellCLI> LIST PHYSICALDISK WHERE status != 'normal'

# Check for predictive failure specifically (pre-failure warning)
CellCLI> LIST PHYSICALDISK WHERE diskType = 'HardDisk' AND status = 'warning - predictive failure' DETAIL

Status values for PHYSICALDISK:
normal: disk is healthy.
warning - predictive failure: S.M.A.R.T. diagnostics predict this disk will fail. Replace proactively before it fails completely.
failed: disk has failed. ASM will drop the associated griddisks and rebalance.
not present: no disk in this slot. Expected for empty slots.
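The status values above map naturally onto a small triage function. The sketch below is illustrative only: `disk_action` is a hypothetical name, the action wording paraphrases this article, and any status not listed here (cells can report others) falls through to a generic "investigate" branch.

```shell
# Hypothetical triage helper: map a PHYSICALDISK status string to the
# response described in this article. Unknown statuses are not guessed at.
disk_action() {
  case "$1" in
    normal)
      echo "healthy: no action" ;;
    "warning - predictive failure")
      echo "replace proactively" ;;
    failed)
      echo "failed: confirm ASM rebalance and arrange replacement" ;;
    "not present")
      echo "empty slot: expected if no disk is installed" ;;
    *)
      echo "unrecognised status: investigate" ;;
  esac
}

disk_action "warning - predictive failure"   # prints: replace proactively
```

A function like this is only a reading aid for monitoring output. It does not replace the alert history or an Oracle Support SR.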

Managing Celldisks — CELLDISK

Celldisks are the logical representation of physical disks within Exadata System Software. They are created automatically when the cell is initialised and correspond one-to-one with physical disks. The DBA rarely needs to create or drop celldisks manually — but checking their status and free space is part of routine administration.

CELLDISK monitoring and management commands

# List all celldisks with status
CellCLI> LIST CELLDISK

# Full detail for all celldisks
CellCLI> LIST CELLDISK DETAIL

# Check status, size, and free space for all celldisks
CellCLI> LIST CELLDISK ATTRIBUTES name, status, size, freeSpace

# Check for any celldisk not in normal status
CellCLI> LIST CELLDISK WHERE status != 'normal'

# Check a specific celldisk in detail
CellCLI> LIST CELLDISK CD_00_cell01 DETAIL

# List all current metrics collected per celldisk (I/O throughput, requests, latency)
CellCLI> LIST METRICCURRENT WHERE objectType = 'CELLDISK'

# Celldisk read requests: large (Smart Scan) and small (OLTP)
CellCLI> LIST METRICCURRENT CD_IO_RQ_R_LG
CellCLI> LIST METRICCURRENT CD_IO_RQ_R_SM

Celldisk status values mirror physicaldisk status. If a physicaldisk enters a failed state, the corresponding celldisk will also show as failed or in an error state. When a celldisk fails, all griddisks on that celldisk become unavailable and ASM rebalance begins automatically.

Managing Griddisks — GRIDDISK

Griddisks are what Oracle ASM sees as individual disk devices. Each celldisk is partitioned into one or more griddisks — typically one for the DATA disk group and one for the RECO disk group. The griddisk is the unit that ASM adds to or drops from a disk group.

Griddisks are the most frequently administered storage object. You will work with griddisks when taking a cell offline for maintenance, restoring a replaced disk, and verifying disk group membership after a hardware event.

GRIDDISK monitoring and management commands

# List all griddisks with status
CellCLI> LIST GRIDDISK

# Full detail for all griddisks
CellCLI> LIST GRIDDISK DETAIL

# List griddisks with key attributes
CellCLI> LIST GRIDDISK ATTRIBUTES name, asmDiskName, asmDiskGroupName, status, size

# Check for any griddisk not in active status
CellCLI> LIST GRIDDISK WHERE status != 'active'

# List griddisks for a specific ASM disk group
CellCLI> LIST GRIDDISK WHERE asmDiskGroupName = 'DATA'

# Check a specific griddisk in detail
CellCLI> LIST GRIDDISK DATA_CD_00_cell01 DETAIL

Taking a griddisk offline and online — for planned maintenance

When you need to take a storage cell offline for maintenance (patching, hardware work), you must first quiesce the griddisks so ASM can handle the absence gracefully. This is a celladmin-only operation.

Take griddisks inactive for planned cell maintenance

# Step 1: Confirm ASM can tolerate losing every griddisk on this cell
# asmDeactivationOutcome must be Yes for all griddisks before you continue
CellCLI> LIST GRIDDISK ATTRIBUTES name, asmModeStatus, asmDeactivationOutcome

# Step 2: Take all griddisks on this cell INACTIVE before maintenance
# ASM takes the corresponding disks offline
CellCLI> ALTER GRIDDISK ALL INACTIVE

# Step 3: Verify all griddisks show inactive status
CellCLI> LIST GRIDDISK ATTRIBUTES name, status

# Step 4: From the DB node, verify no ASM rebalance is in progress
# Wait until no rebalance operations are running
-- sqlplus / as sysasm
-- SELECT * FROM v$asm_operation WHERE state = 'RUN';

# Step 5: Perform maintenance on the cell

# Step 6: After maintenance, bring griddisks ACTIVE again
CellCLI> ALTER GRIDDISK ALL ACTIVE

# Step 7: Verify all griddisks are active and asmModeStatus is back to ONLINE (not SYNCING)
CellCLI> LIST GRIDDISK ATTRIBUTES name, status, asmModeStatus

Do not take all cells offline simultaneously. ASM needs enough cells to satisfy the disk group redundancy requirements. For a normal redundancy disk group (2-way mirroring), take only one cell offline at a time. A high redundancy disk group (3-way mirroring) can tolerate two cells offline at once, but working through one cell at a time is the safer habit. Always confirm that ASM has finished any rebalance or resynchronisation for the previous cell before moving to the next one.
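The wait step in the procedure above is easier to get right as a polling loop than as a manual re-run of the query. The sketch below is a generic loop, not an Oracle-supplied tool: `wait_for_idle` and its arguments are hypothetical, and the check command you pass in is assumed to print the number of ASM operations still running (for example, a wrapper around a count on v$asm_operation run from the database node).

```shell
# Hypothetical polling loop.
# Usage: wait_for_idle CHECK_CMD [INTERVAL_SECONDS] [MAX_TRIES]
# CHECK_CMD is assumed to print a single integer: the number of running operations.
wait_for_idle() {
  check_cmd="$1"
  interval="${2:-60}"
  max_tries="${3:-120}"
  try=1
  while [ "$try" -le "$max_tries" ]; do
    running="$($check_cmd)"
    if [ "$running" -eq 0 ]; then
      echo "no operations running after $try check(s)"
      return 0
    fi
    sleep "$interval"
    try=$((try + 1))
  done
  echo "gave up after $max_tries checks" >&2
  return 1
}

# Example (count_asm_operations is a placeholder you would write yourself):
# wait_for_idle count_asm_operations 60 120 && echo "safe to continue"
```

Bounding the loop with a maximum number of tries keeps a stuck rebalance from silently blocking a maintenance script forever.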

The Exadata Alert System

Exadata System Software monitors hundreds of metrics across every cell component — disks, CPUs, temperature sensors, fans, network interfaces, flash devices. When a metric crosses a threshold, an alert is generated and recorded in the alert history. Understanding how to read and manage alerts is one of the most important Exadata administration skills.

Alert severity levels

Severity	Meaning	Action Required
critical	Component has failed or is at immediate risk. Data availability may be impacted.	Immediate — escalate to Oracle Support and hardware team
warning	Component is degraded or approaching a failure threshold. Not yet causing data loss.	Investigate within the same business day — plan remediation
informational	A notable event occurred but no failure. System state changes, rebalance completions, etc.	Review — no immediate action usually required
clear	A previously raised alert has been resolved automatically.	Verify the root cause is genuinely resolved

Working with ALERTHISTORY

# List all alerts
CellCLI> LIST ALERTHISTORY

# Show full detail for all alerts
CellCLI> LIST ALERTHISTORY DETAIL

# Show only critical and warning alerts
CellCLI> LIST ALERTHISTORY WHERE severity LIKE '[warning|critical]'

# Show only alerts not yet acknowledged (examinedBy is null)
CellCLI> LIST ALERTHISTORY WHERE severity LIKE '[warning|critical]' AND examinedBy IS NULL

# Show alerts from a specific time range
CellCLI> LIST ALERTHISTORY WHERE beginTime > '2026-05-22T00:00:00'

# Mark an alert as examined (acknowledge it)
# Replace <alert_id> with the ID shown by LIST ALERTHISTORY
CellCLI> ALTER ALERTHISTORY <alert_id> examinedBy = 'your_name'

# Mark all alerts as examined
CellCLI> ALTER ALERTHISTORY ALL examinedBy = 'your_name'

# Drop old alert history entries
CellCLI> DROP ALERTHISTORY <alert_id>

Handling Predictive Failure Warnings

A predictive failure warning is one of the most important alerts an Exadata DBA receives. It means S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) diagnostics inside the disk have detected early signs of failure — the disk has not yet failed, but its internal health indicators predict it will fail soon. The disk must be replaced before it fails completely to avoid data loss and an emergency rebalance operation.

Step-by-step response to a predictive failure warning

Step 1: Identify the failing disk

# Connect to the affected cell
ssh celladmin@cell01-adm
cellcli

# Find the predictive failure disk and get its slot and serial number
CellCLI> LIST PHYSICALDISK WHERE diskType = 'HardDisk' AND status = 'warning - predictive failure' DETAIL

# Note the output. Key fields to record:
# name: 28:3                     <-- disk name (enclosure:slot)
# deviceId: 19                   <-- device ID
# physicalSerial: ABC123DEF456   <-- serial for Oracle Support SR
# status: warning - predictive failure
# slotNumber: 3                  <-- physical slot in the cell

# Also check the associated celldisk and griddisks
CellCLI> LIST CELLDISK ATTRIBUTES name, status, physicalDisk
CellCLI> LIST GRIDDISK WHERE celldisk = 'CD_03_cell01' DETAIL

Step 2: Check the alert history for this disk

# Review all alerts associated with this cell
CellCLI> LIST ALERTHISTORY WHERE severity LIKE '[warning|critical]' DETAIL

# Check from the database side as well
-- sqlplus / as sysasm
-- SELECT path, mode_status, state, total_mb, free_mb
-- FROM v$asm_disk
-- ORDER BY path;

Step 3: Raise Oracle Support SR and prepare for disk replacement

# Take all griddisks on the affected celldisk INACTIVE
# so that ASM stops using the disk before it is replaced
CellCLI> ALTER GRIDDISK DATA_CD_03_cell01, RECO_CD_03_cell01 INACTIVE

# Verify griddisks are inactive
CellCLI> LIST GRIDDISK ATTRIBUTES name, status

# Wait for any ASM rebalance to complete, checking from the database node
-- SELECT group_number, operation, state, est_minutes
-- FROM v$asm_operation;

# After the disk is physically replaced by Oracle hardware support and
# Exadata System Software detects the new disk, first check LIST CELLDISK and
# LIST GRIDDISK: the software normally recreates these objects automatically.
# Create the celldisk and griddisks manually only if they are missing.
CellCLI> CREATE CELLDISK CD_03_cell01 physicalDisk = '28:3'

# Create griddisks on the new celldisk (matching the original sizes)
CellCLI> CREATE GRIDDISK DATA_CD_03_cell01 celldisk = 'CD_03_cell01', size = 18T, offset = 0
CellCLI> CREATE GRIDDISK RECO_CD_03_cell01 celldisk = 'CD_03_cell01', size = 4T, offset = 18T

# ASM will detect the new griddisks and begin rebalancing automatically
# Monitor from the database:
-- SELECT group_number, operation, state, est_minutes FROM v$asm_operation;
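The size and offset values in the CREATE GRIDDISK commands above follow a simple rule: each griddisk starts where the previous one on the same celldisk ends. The sketch below only works out that arithmetic. `layout_griddisks` is a hypothetical helper, the sizes are the illustrative 18T and 4T from this article, and it assumes whole-terabyte sizes. Take the real values from the surviving griddisks on the same cell (LIST GRIDDISK ATTRIBUTES name, size, offset).

```shell
# Hypothetical helper: print size and starting offset for griddisks laid out
# back to back on one celldisk. Arguments are PREFIX:SIZE_IN_T pairs.
layout_griddisks() {
  offset=0
  for spec in "$@"; do
    prefix="${spec%%:*}"   # text before the colon, e.g. DATA
    size="${spec##*:}"     # text after the colon, e.g. 18
    printf '%s size=%sT offset=%sT\n' "$prefix" "$size" "$offset"
    offset=$((offset + size))
  done
}

layout_griddisks DATA:18 RECO:4
# prints:
# DATA size=18T offset=0T
# RECO size=4T offset=18T
```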

Always raise an Oracle Support SR before disk replacement, even for predictive failures. Oracle Support has tools to remotely diagnose the disk and will guide the replacement procedure. For systems under Oracle-managed Platinum support, Oracle may initiate the replacement proactively before you even see the alert.

The Exadata Alert Log

In addition to the cellcli alert history, Exadata System Software maintains an alert log on each storage cell — similar in concept to the Oracle Database alert log. This log records all significant events including daemon starts and stops, disk events, configuration changes, and hardware alerts.

Finding and reading the Exadata alert log on a storage cell

# The alert log is on the storage cell OS -- SSH to the cell first
ssh celladmin@cell01-adm

# XML alert log location (cell01 in the path is the cell host name)
cat /opt/oracle/cell/log/diag/asm/cell/cell01/alert/log.xml

# The more readable plain-text version (alert.log) is in the trace directory
ls -la /opt/oracle/cell/log/diag/asm/cell/cell01/trace/

# Tail the live alert log to watch for new events
tail -f /opt/oracle/cell/log/diag/asm/cell/cell01/alert/log.xml

# Search for specific error types
grep -i "error\|warning\|critical\|ORA-" \
  /opt/oracle/cell/log/diag/asm/cell/cell01/alert/log.xml | tail -50

# Check cellsrv trace files for detailed diagnostics
ls -lt /opt/oracle/cell/log/diag/asm/cell/cell01/trace/ | head -20
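If you run the grep above often, it may be worth wrapping it. The function below is a small sketch around the same pattern. `scan_alert_log` is a hypothetical name, and the log path is whatever you pass in, so it works against either the XML or the plain-text log.

```shell
# Hypothetical wrapper around the grep shown above.
# Usage: scan_alert_log LOG_FILE [LAST_N]   (LAST_N defaults to 50)
scan_alert_log() {
  log_file="$1"
  last_n="${2:-50}"
  if [ ! -r "$log_file" ]; then
    echo "cannot read $log_file" >&2
    return 1
  fi
  # Case-insensitive match on the same keywords, keeping only the newest lines
  grep -iE 'error|warning|critical|ORA-' "$log_file" | tail -n "$last_n"
}

# Example:
# scan_alert_log /opt/oracle/cell/log/diag/asm/cell/cell01/alert/log.xml 20
```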

dcli — Running Commands Across All Cells

dcli (Distributed CLI) executes the same command on multiple cells simultaneously and collects the combined output. It is the most time-efficient way to check the health of all cells in one command rather than SSH-ing into each one individually.

dcli setup and syntax

# The cell group file lists one cell hostname per line
cat /opt/oracle.SupportTools/onecommand/cell_group

# Basic syntax
# dcli -g <group_file> [-l <username>] "<command>"

# Default runs as celladmin
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "cellcli -e LIST CELL"

# Run as cellmonitor (read-only) -- specify username with -l
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  -l cellmonitor \
  "cellcli -e LIST CELL ATTRIBUTES name, status"

# Run on specific cells only -- create a subset group file
echo -e "cell01\ncell02\ncell03" > /tmp/three_cells.txt
dcli -g /tmp/three_cells.txt "cellcli -e LIST CELL"
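Subset group files are easy to get wrong by hand (a missing newline or a stray space breaks the host list). A minimal sketch for generating one is shown below. `write_cell_group` is a hypothetical helper, and the host names are the placeholder cells used in this article.

```shell
# Hypothetical helper: write a dcli group file with one host name per line.
# Usage: write_cell_group GROUP_FILE HOST [HOST...]
write_cell_group() {
  group_file="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "usage: write_cell_group GROUP_FILE HOST [HOST...]" >&2
    return 1
  fi
  # printf repeats the format for each remaining argument
  printf '%s\n' "$@" > "$group_file"
}

# Example:
# write_cell_group /tmp/three_cells.txt cell01 cell02 cell03
# dcli -g /tmp/three_cells.txt "cellcli -e LIST CELL"
```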

Essential dcli daily health checks

# 1. Check all cell daemon status
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "service celld status"

# 2. Check Exadata System Software version on all cells
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "cellcli -e LIST CELL ATTRIBUTES name, releaseVersion"

# 3. Check for any alerts across all cells
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "cellcli -e LIST ALERTHISTORY WHERE severity LIKE '[warning|critical]'"

# 4. Check for any disk NOT in normal status across all cells
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "cellcli -e LIST PHYSICALDISK WHERE status != 'normal'"

# 5. Check predictive failure disks specifically
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "cellcli -e \"LIST PHYSICALDISK WHERE status = 'warning - predictive failure'\""

# 6. Check all griddisk status
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "cellcli -e LIST GRIDDISK WHERE status != 'active'"

# 7. Check cell CPU across all cells
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "cellcli -e LIST METRICCURRENT CL_CPUT"

# 8. Check flash cache status on all cells
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "cellcli -e LIST FLASHCACHE DETAIL"

# 9. Check Storage Index metrics across all cells
# LIKE takes a regular expression, so use .* rather than the SQL % wildcard;
# the command returns nothing if your release exposes no SI_ metrics
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "cellcli -e LIST METRICCURRENT WHERE name LIKE 'SI_.*'"

# 10. Check for any celldisk not in normal state
dcli -g /opt/oracle.SupportTools/onecommand/cell_group \
  "cellcli -e LIST CELLDISK WHERE status != 'normal'"

Managing Cell Services — Starting and Stopping

The three cell daemons — cellsrv, MS (Management Server), and RS (Restart Server) — run as a single service called celld. RS is the watchdog process that monitors and automatically restarts cellsrv and MS if they fail. In normal operations, you should rarely need to manually start or stop cell services.

Cell service management commands (run as root on the cell OS)

# Check status of all cell daemons
service celld status

# Start all cell services (RS starts first, then MS, then cellsrv)
service celld start

# Stop all cell services gracefully
service celld stop

# Restart a specific daemon without stopping others
# Use cellcli ALTER commands instead of OS-level restart when possible
CellCLI> ALTER CELL RESTART SERVICES cellsrv
CellCLI> ALTER CELL RESTART SERVICES ms
CellCLI> ALTER CELL RESTART SERVICES rs

# Restart all services (equivalent to stop then start)
CellCLI> ALTER CELL RESTART SERVICES ALL

# Shut down all cell services (use before hardware maintenance)
# This stops the services; it does not power off the server
CellCLI> ALTER CELL SHUTDOWN SERVICES ALL

Never restart cellsrv during active database I/O without first taking the griddisks INACTIVE and allowing ASM to rebalance. An abrupt cellsrv restart while databases are actively writing will cause ASM to drop the affected griddisks from the disk group, triggering an unplanned rebalance. Always take griddisks INACTIVE first for planned maintenance.

Flash Cache Administration

Smart Flash Cache is managed by Exadata System Software automatically — it populates, evicts, and manages cache contents without DBA intervention. However, there are specific administration tasks a DBA needs to perform, particularly during troubleshooting or after hardware replacement.

Flash cache administration commands

# Check flash cache status and configuration
CellCLI> LIST FLASHCACHE DETAIL

# Check flash cache usage and read throughput metrics
CellCLI> LIST METRICCURRENT FC_BY_USED
CellCLI> LIST METRICCURRENT FC_IO_BY_R_SEC

# List what is cached in the flash cache (be careful -- large output)
CellCLI> LIST FLASHCACHECONTENT

# Flush the flash cache (writes dirty write-back data to disk)
# Use only when directed by Oracle Support; expect temporary performance degradation
CellCLI> ALTER FLASHCACHE ALL FLUSH

# Drop and recreate the flash cache (after flash hardware replacement)
CellCLI> DROP FLASHCACHE
CellCLI> CREATE FLASHCACHE ALL

Routine Health Check — Full Command Reference

This is the complete daily administration runbook for Exadata storage cell health. Run all sections at the start of each working day. Total time: under 5 minutes when run via dcli.

Daily Exadata health check (run from the primary DB node)

#!/bin/bash
# DAILY EXADATA STORAGE CELL HEALTH CHECK
# Run from primary DB node as oracle or applmgr
# Requires SSH key-based authentication to all cells

CELL_GROUP=/opt/oracle.SupportTools/onecommand/cell_group

echo "=============================================="
echo " EXADATA DAILY HEALTH CHECK - $(date)"
echo "=============================================="
echo ""

echo "--- 1. CELL DAEMON STATUS ---"
# Case-insensitive match so the rsStatus, msStatus and cellsrvStatus lines are kept
dcli -g "$CELL_GROUP" "service celld status | grep -iE 'cellsrv|ms|rs'"
echo ""

echo "--- 2. CELL SOFTWARE VERSION ---"
dcli -g "$CELL_GROUP" \
  "cellcli -e LIST CELL ATTRIBUTES name, releaseVersion"
echo ""

echo "--- 3. CRITICAL AND WARNING ALERTS ---"
dcli -g "$CELL_GROUP" \
  "cellcli -e LIST ALERTHISTORY WHERE severity LIKE '[warning|critical]'"
echo ""

echo "--- 4. PHYSICAL DISKS NOT IN NORMAL STATUS ---"
dcli -g "$CELL_GROUP" \
  "cellcli -e LIST PHYSICALDISK WHERE status != 'normal'"
echo ""

echo "--- 5. GRIDDISKS NOT IN ACTIVE STATUS ---"
dcli -g "$CELL_GROUP" \
  "cellcli -e LIST GRIDDISK WHERE status != 'active'"
echo ""

echo "--- 6. CELLDISKS NOT IN NORMAL STATUS ---"
dcli -g "$CELL_GROUP" \
  "cellcli -e LIST CELLDISK WHERE status != 'normal'"
echo ""

echo "--- 7. CELL CPU UTILISATION ---"
dcli -g "$CELL_GROUP" \
  "cellcli -e LIST METRICCURRENT CL_CPUT"
echo ""

echo "--- 8. FLASH CACHE UTILISATION ---"
dcli -g "$CELL_GROUP" \
  "cellcli -e LIST METRICCURRENT FC_BY_USED"
echo ""

echo "=============================================="
echo " HEALTH CHECK COMPLETE"
echo "=============================================="
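Several checks in the script (4, 5 and 6) are "exception" queries: a healthy system returns nothing. That makes it easy to miss the difference between "no problems" and "the check did not run". One possible refinement, sketched below with a hypothetical `report_check` helper, prints an explicit OK line when a command produces no output and flags anything else. It assumes that empty output means healthy, which holds only for the exception-style checks.

```shell
# Hypothetical wrapper: run a check and state the result explicitly.
# Usage: report_check LABEL COMMAND [ARGS...]
# Assumes the command prints nothing when everything is healthy.
report_check() {
  label="$1"
  shift
  output="$("$@" 2>&1)"
  if [ -z "$output" ]; then
    printf '%s: OK\n' "$label"
  else
    printf '%s: ATTENTION\n%s\n' "$label" "$output"
    return 1
  fi
}

# Example, reusing check 4 from the script above:
# report_check "physical disks" dcli -g "$CELL_GROUP" \
#   "cellcli -e LIST PHYSICALDISK WHERE status != 'normal'"
```

Because errors are captured along with normal output, an unreachable cell also shows up as ATTENTION rather than as silence.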

Complete cellcli Command Reference

Command	What It Does	Access Required
`LIST CELL`	Cell status overview	cellmonitor
`LIST CELL DETAIL`	Full cell info including hardware model, software version, network	cellmonitor
`LIST PHYSICALDISK`	All physical disk status	cellmonitor
`LIST PHYSICALDISK WHERE status != 'normal'`	Disks with problems only	cellmonitor
`LIST CELLDISK`	All celldisk status and size	cellmonitor
`LIST CELLDISK WHERE status != 'normal'`	Celldisks with problems	cellmonitor
`LIST GRIDDISK`	All griddisk status and ASM mapping	cellmonitor
`LIST GRIDDISK WHERE status != 'active'`	Griddisks not in active state	cellmonitor
`LIST ALERTHISTORY`	All cell alerts	cellmonitor
`LIST ALERTHISTORY WHERE severity LIKE '[warning|critical]'`	Warning and critical alerts only	cellmonitor
`LIST FLASHCACHE DETAIL`	Flash cache status and configuration	cellmonitor
`LIST METRICCURRENT CL_CPUT`	Cell CPU utilisation	cellmonitor
`LIST METRICCURRENT FC_BY_USED`	Flash cache space in use (MB)	cellmonitor
`LIST METRICCURRENT CD_IO_RQ_R_LG`	Large read request count per celldisk (typical of Smart Scan)	cellmonitor
`LIST IORMPLAN`	I/O Resource Management plan	cellmonitor
`ALTER GRIDDISK ALL INACTIVE`	Take all griddisks offline for maintenance	celladmin
`ALTER GRIDDISK ALL ACTIVE`	Bring all griddisks back online	celladmin
`ALTER ALERTHISTORY <id> examinedBy = 'name'`	Acknowledge an alert	celladmin
`ALTER CELL RESTART SERVICES cellsrv`	Restart the cellsrv daemon	celladmin
`ALTER FLASHCACHE ALL FLUSH`	Flush dirty flash cache data to disk	celladmin
`CREATE GRIDDISK`	Create a new griddisk on a celldisk	celladmin
`DROP GRIDDISK`	Remove a griddisk	celladmin

Summary

Exadata storage has three object levels: PHYSICALDISK (hardware) → CELLDISK (cell software layer) → GRIDDISK (presented to ASM). Understanding this chain is essential for hardware event diagnosis.
cellcli runs on storage cells only — SSH to each cell individually. Use cellmonitor for read-only monitoring, celladmin for configuration changes.
The most critical daily check is LIST PHYSICALDISK WHERE status != 'normal' — a warning - predictive failure status means the disk must be replaced proactively before it fails completely.
dcli runs any cellcli command across all cells simultaneously using a group file — it is the only practical way to check all cells every day.
The cell alert system records all significant events in ALERTHISTORY. Check for unacknowledged warning and critical alerts daily using the dcli command in the health check script.
The Exadata alert log on each cell (/opt/oracle/cell/log/diag/asm/cell/) provides detailed event history for deep troubleshooting.
Before any planned cell maintenance, always take griddisks INACTIVE first and wait for ASM rebalance to complete. Never restart cellsrv with griddisks in active state during database I/O.

Database Consultant (Passionate about Database & Cloud Technologies)

Friday, May 22, 2026