How to Monitor Exadata Performance — Key Metrics, Views, and Commands Every DBA Needs
Monitoring an Exadata system is different from monitoring a standard Oracle database. On a standard system, most of what you need is in V$ views and AWR. On Exadata, there is an additional layer — the storage cells — with their own metrics, their own command-line interface, and their own sections in the AWR report. A DBA who only monitors at the database layer will miss the most important Exadata-specific signals.
This article covers the complete Exadata monitoring toolkit: the V$ views specific to Exadata, the V$SYSSTAT statistics that reveal Smart Scan activity, the cellcli commands you run on storage cells, the dcli utility for querying all cells simultaneously, the Exadata-specific sections of the AWR report, and a ready-to-run SQL health snapshot script you can use immediately.
Two monitoring surfaces: Exadata monitoring operates at two levels — the database level (V$ views, AWR, SQL queries from the DB node) and the cell level (cellcli commands run directly on storage cells or via dcli). Both levels are needed for a complete picture. This article covers both.
The Four Monitoring Tools on Exadata
| Tool | Where You Run It | What It Shows |
|---|---|---|
| V$ views | Database node — any SQL client | Cell identification, Smart Scan statistics, I/O statistics, cell configuration |
| V$SYSSTAT / V$MYSTAT | Database node — any SQL client | Smart Scan byte counts, Storage Index savings, cell I/O counters — the key performance indicators |
| cellcli | Storage cell — SSH to individual cell then run cellcli | Cell-level metrics: CPU, temperature, disk I/O, flash cache, alert states, metric history |
| dcli | Database node or admin node — runs cellcli commands across all cells at once | Same as cellcli but aggregated across all storage cells simultaneously |
Exadata-Specific V$ Views
V$CELL — Cell Identification
The simplest starting point. V$CELL lists all storage cells known to the database instance, with their IP addresses and cell type. Run this first on any new Exadata system to confirm the database can see all expected storage cells.
-- List all storage cells connected to this database SELECT cell_hash, cell_path, cell_type FROM v$cell ORDER BY cell_path; -- GV$CELL shows cells from all RAC instances SELECT inst_id, cell_hash, cell_path, cell_type FROM gv$cell ORDER BY inst_id, cell_path; -- Count cells — should match your expected rack configuration -- Full rack = up to 19 cells, Half rack = up to 9, Quarter = 3 SELECT COUNT(*) AS cell_count FROM v$cell;
V$CELL_CONFIG — Cell Software Configuration
V$CELL_CONFIG exposes configuration information about each storage cell in XML format. It includes Exadata System Software version, cell configuration attributes, and IORM plan information. This is the view to query when you need to confirm software versions or configuration settings without logging into each cell individually.
-- Extract Exadata System Software version from each cell SELECT cell_path, EXTRACTVALUE( XMLTYPE(confval), '/cli-output/cell/releaseVersion' ) AS cell_software_version FROM v$cell_config WHERE conftype = 'CELL' ORDER BY cell_path; -- Check smart_scan_capable status for all disk groups SELECT d.name AS disk_group, a.name AS attribute, a.value AS attr_value FROM v$asm_diskgroup d, v$asm_attribute a WHERE d.group_number = a.group_number AND a.name = 'cell.smart_scan_capable' ORDER BY d.name;
V$CELL_DB — Per-Database Cell Statistics
V$CELL_DB shows I/O statistics broken down by database for each cell. Particularly useful in consolidated environments where multiple databases share the same Exadata system — it lets you see which database is consuming the most cell resources.
-- Show I/O statistics per database per cell (XML parsing)
SELECT cell_path,
EXTRACTVALUE(XMLTYPE(statistics), '/stats/stat[@name="dbName"]')
AS db_name,
EXTRACTVALUE(XMLTYPE(statistics), '/stats/stat[@name="iormBytes"]')
AS iorm_bytes
FROM v$cell_db
ORDER BY cell_path, db_name;
V$CELL_DISK — Disk-Level Statistics per Cell
V$CELL_DISK exposes disk-level statistics for each disk on each cell. Use it to identify uneven I/O distribution across disks, detect high-latency disks, or verify disk status during a hardware event.
-- Check disk status on all cells from the database
SELECT cell_path,
EXTRACTVALUE(XMLTYPE(statistics), '/stats/stat[@name="diskName"]')
AS disk_name,
EXTRACTVALUE(XMLTYPE(statistics), '/stats/stat[@name="status"]')
AS disk_status
FROM v$cell_disk
ORDER BY cell_path, disk_name;
Smart Scan Metrics in V$SYSSTAT — The Most Important Statistics
The most immediately useful Exadata monitoring comes from V$SYSSTAT statistics with names beginning with cell. These statistics accumulate at the instance level and tell you exactly how much Smart Scan activity has occurred, how much I/O Storage Indexes have eliminated, and how efficient the overall offload process has been.
The key Exadata statistics in V$SYSSTAT
| Statistic Name | What It Measures | What Good Looks Like |
|---|---|---|
cell scans |
Count of Smart Scan operations performed | Non-zero and growing for analytic workloads |
cell physical IO interconnect bytes |
Total bytes transferred between cells and DB nodes across the RoCE network | Much lower than physical read total bytes — indicates offload is working |
cell physical IO bytes eligible for predicate offload |
Bytes read from disk that were candidates for Smart Scan filtering | Should be high — shows how much data went through Smart Scan |
cell physical IO interconnect bytes returned by smart scan |
Bytes actually returned to the DB node after Smart Scan filtering | Should be much smaller than eligible bytes — ratio shows filtering efficiency |
cell physical IO bytes saved by storage index |
Bytes that Storage Indexes eliminated before disk was read | Higher is better — shows Storage Index I/O avoidance |
cell physical IO bytes saved during optimized RMAN file restore |
Bytes saved during RMAN operations by cell offloading | Non-zero indicates RMAN backup offload is active |
cell IO uncompressed bytes |
Bytes after HCC decompression at the cell — before predicate filtering | Significantly higher than interconnect bytes confirms HCC is compressing effectively |
physical read total bytes |
Total bytes read from storage — baseline for comparison | Compare against interconnect bytes to calculate overall I/O reduction |
-- All Exadata-relevant statistics system-wide SELECT name, value, ROUND(value / 1024 / 1024 / 1024, 2) AS value_gb FROM v$sysstat WHERE name IN ( 'cell scans', 'cell physical IO interconnect bytes', 'cell physical IO bytes eligible for predicate offload', 'cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index', 'cell IO uncompressed bytes', 'physical read total bytes' ) ORDER BY name; -- Calculate Smart Scan efficiency ratio SELECT ROUND( (SELECT value FROM v$sysstat WHERE name = 'cell physical IO interconnect bytes returned by smart scan') / NULLIF( (SELECT value FROM v$sysstat WHERE name = 'cell physical IO bytes eligible for predicate offload'), 0) * 100, 2 ) AS pct_data_returned_after_smartscan FROM dual; -- Interpretation: -- 1% = 99% I/O eliminated by Smart Scan (excellent) -- 10% = 90% I/O eliminated (good) -- 50% = 50% I/O eliminated (moderate) -- 90% = minimal Smart Scan filtering (investigate)
-- Check Smart Scan stats for the current session only SELECT sn.name, ms.value, ROUND(ms.value / 1024 / 1024, 2) AS value_mb FROM v$mystat ms JOIN v$statname sn ON ms.statistic# = sn.statistic# WHERE sn.name LIKE 'cell%' AND ms.value > 0 ORDER BY sn.name; -- Use V$SESSTAT for a specific session (sid=123 example) SELECT sn.name, ss.value, ROUND(ss.value / 1024 / 1024, 2) AS value_mb FROM v$sesstat ss JOIN v$statname sn ON ss.statistic# = sn.statistic# WHERE ss.sid = 123 AND sn.name LIKE 'cell%' AND ss.value > 0 ORDER BY sn.name;
cellcli — Monitoring Inside the Storage Cell
cellcli is the command-line interface for each storage cell. It is installed on the storage cell's Oracle Linux operating system and provides direct access to cell metrics, disk status, alert state, and configuration. The default read-only user is cellmonitor — most DBAs use this for monitoring without needing the celladmin account.
Connecting to cellcli
# SSH to a storage cell (use the management network hostname) ssh celladmin@cell01 # Or as cellmonitor for read-only access ssh cellmonitor@cell01 # Open the cellcli interactive interface cellcli # Or run a single command without entering interactive mode cellcli -e "LIST CELL DETAIL" # Run cellcli as a script from the OS command line cellcli -e "LIST METRICCURRENT WHERE name LIKE 'CL_%'"
Essential cellcli monitoring commands
# Full cell detail including hardware and software info CellCLI> LIST CELL DETAIL # Quick status — name, status, and alert state CellCLI> LIST CELL ATTRIBUTES name, status, alertHistory # Check current CPU utilisation on this cell CellCLI> LIST METRICCURRENT CL_CPUT # Check cell temperature (thermal monitoring) CellCLI> LIST METRICCURRENT CL_TEMP # Check cell memory utilisation CellCLI> LIST METRICCURRENT CL_MEMUT
# List all physical disks with status CellCLI> LIST PHYSICALDISK ATTRIBUTES name, diskType, status, luns # List flash drives status CellCLI> LIST FLASHDISK ATTRIBUTES name, status, diskType # Check all celldisks (how physical disks appear to cellsrv) CellCLI> LIST CELLDISK ATTRIBUTES name, status, size, freespace # Check I/O read requests for large (Smart Scan) operations CellCLI> LIST METRICCURRENT CD_IO_RQ_R_LG # Check I/O read requests for small (OLTP) operations CellCLI> LIST METRICCURRENT CD_IO_RQ_R_SM # Disk read throughput in MB/sec CellCLI> LIST METRICCURRENT CD_IO_BY_R_LG_SEC
# Check Smart Flash Cache size and utilisation CellCLI> LIST FLASHCACHE DETAIL # Flash cache read hits in MB/sec CellCLI> LIST METRICCURRENT FC_IO_BY_R_SEC # Flash cache utilisation percentage CellCLI> LIST METRICCURRENT FC_BY_USED # Columnar flash cache (CELLMEMORY) usage CellCLI> LIST METRICCURRENT WHERE name LIKE 'FC_COL%'
# List all current alerts on this cell CellCLI> LIST ALERTHISTORY # Show only critical or warning alerts CellCLI> LIST METRICCURRENT WHERE alertState != 'normal' # Check metric history for I/O read requests — last hour CellCLI> LIST METRICHISTORY CD_IO_RQ_R_LG WHERE collectionTime > '2026-05-22T08:00:00' # Check metric definitions to understand a metric CellCLI> LIST METRICDEFINITION CD_IO_RQ_R_LG DETAIL
Understanding the cellcli metric naming convention
cellcli metric names follow a consistent pattern that tells you exactly what is being measured. Once you understand the pattern, you can decode any metric name without documentation.
| Prefix | Object Type | Example |
|---|---|---|
CL_ |
Cell-level — the overall cell server | CL_CPUT = Cell CPU Total utilisation |
CD_ |
Celldisk — individual disk on the cell | CD_IO_RQ_R_LG = Celldisk I/O Requests Read Large |
GD_ |
Griddisk — logical disk presented to ASM | GD_IO_RQ_R_SM = Griddisk I/O Requests Read Small |
FC_ |
Flash cache — Smart Flash Cache metrics | FC_IO_BY_R_SEC = Flash Cache I/O Bytes Read per Second |
FD_ |
Flash disk — individual flash drive | FD_IO_RQ_R = Flash Disk I/O Requests Read |
SI_ |
Storage Index statistics | SI_STORAGE_SAVED = Storage Index bytes saved |
The middle segment tells you the type of measurement:
IO_RQ= I/O Requests (count)IO_BY= I/O Bytes (throughput)R= Read,W= WriteLG= Large I/O (Smart Scan type),SM= Small I/O (OLTP type)SEC= per second (rate metric), no suffix = cumulative
dcli — Running Commands Across All Cells at Once
cellcli requires you to SSH into each cell individually. For a full rack with 14 to 19 storage cells, repeating the same command on each cell is impractical. The dcli (Distributed CLI) utility solves this — it runs any cellcli command across all storage cells simultaneously and collects the results.
# dcli uses a group file listing all cell hostnames # The cell group file is usually here: cat /opt/oracle.SupportTools/onecommand/cell_group # Should list one cell hostname per line: # cell01 # cell02 # cell03 # ... # Basic syntax: # dcli -g <group_file> "<command to run on each cell>" # Example: check cell daemon status on all cells dcli -g /opt/oracle.SupportTools/onecommand/cell_group \ "service celld status"
# Check cellsrv/MS/RS daemon status on ALL cells dcli -g /opt/oracle.SupportTools/onecommand/cell_group \ "service celld status" # Check Exadata System Software version on all cells dcli -g /opt/oracle.SupportTools/onecommand/cell_group \ "cellcli -e LIST CELL ATTRIBUTES name,cellVersion" # Check CPU utilisation on all cells simultaneously dcli -g /opt/oracle.SupportTools/onecommand/cell_group \ "cellcli -e LIST METRICCURRENT CL_CPUT" # Check for any alerts across all cells dcli -g /opt/oracle.SupportTools/onecommand/cell_group \ "cellcli -e LIST ALERTHISTORY" # Check disk status on all cells — look for non-normal status dcli -g /opt/oracle.SupportTools/onecommand/cell_group \ "cellcli -e LIST PHYSICALDISK ATTRIBUTES name,status" # Check flash cache utilisation on all cells dcli -g /opt/oracle.SupportTools/onecommand/cell_group \ "cellcli -e LIST FLASHCACHE DETAIL" # Check Storage Index savings on all cells dcli -g /opt/oracle.SupportTools/onecommand/cell_group \ "cellcli -e LIST METRICCURRENT WHERE name LIKE 'SI_%'"
dcli output format: Each line of dcli output is prefixed with the cell hostname that produced it — for example cell01: CellSRV is running. This makes it easy to identify which cell produced which result. If a cell produces no output, it may indicate a connectivity or daemon issue on that specific cell.
Exadata-Specific Sections in the AWR Report
The AWR (Automatic Workload Repository) report on Exadata includes additional sections that do not appear in standard Oracle AWR reports. These sections are automatically populated when the database is running on Exadata storage. You generate the AWR report the same way as always — @$ORACLE_HOME/rdbms/admin/awrrpt.sql — the extra sections appear automatically.
Key Exadata sections in the AWR report
| AWR Section | What It Shows | What to Look For |
|---|---|---|
| Exadata Statistics | Summary of Smart Scan bytes, Storage Index savings, and interconnect bytes for the AWR period | High ratio of interconnect bytes to eligible bytes indicates Smart Scan not filtering effectively |
| Exadata Smart Statistics | Detailed breakdown of Smart Scan operations including offload rates | Smart Scan offload % — should be high for analytic workloads |
| Cell Server Charts | Graphical and tabular view of cell CPU, I/O throughput, and network usage during the AWR period | Cell CPU saturation — if cells are at 100% CPU, Smart Scan is processing-bound |
| Flash Cache Statistics | Smart Flash Cache read hits, population rate, and CELLMEMORY (columnar flash) usage | Low flash cache hit rate may indicate cache is too small or data is too large for flash |
| Storage Index Statistics | Bytes eliminated by Storage Indexes per AWR period | Zero savings may indicate data is not clustered well — check data loading patterns |
| IORM Statistics | I/O Resource Management plan activity — how I/O is distributed between databases or consumers | Throttling events indicate IORM is actively limiting a database's I/O |
-- Query AWR for Exadata statistics across a time range SELECT snap_id, stat_name, value, ROUND(value / 1024 / 1024 / 1024, 2) AS value_gb FROM dba_hist_sysstat WHERE stat_name IN ( 'cell physical IO interconnect bytes', 'cell physical IO bytes eligible for predicate offload', 'cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index' ) AND dbid = (SELECT dbid FROM v$database) AND snap_id BETWEEN &start_snap AND &end_snap ORDER BY snap_id, stat_name; -- Calculate Smart Scan efficiency from AWR snapshots SELECT a.snap_id, ROUND( (a.value - b.value) / 1024 / 1024 / 1024, 2 ) AS returned_gb, ROUND( (c.value - d.value) / 1024 / 1024 / 1024, 2 ) AS eligible_gb, ROUND( (1 - (a.value - b.value) / NULLIF((c.value - d.value), 0)) * 100, 1 ) AS pct_saved FROM dba_hist_sysstat a JOIN dba_hist_sysstat b ON a.snap_id - 1 = b.snap_id AND a.stat_name = b.stat_name AND a.dbid = b.dbid JOIN dba_hist_sysstat c ON a.snap_id = c.snap_id AND a.dbid = c.dbid JOIN dba_hist_sysstat d ON c.snap_id - 1 = d.snap_id AND c.stat_name = d.stat_name AND c.dbid = d.dbid WHERE a.stat_name = 'cell physical IO interconnect bytes returned by smart scan' AND c.stat_name = 'cell physical IO bytes eligible for predicate offload' AND a.dbid = (SELECT dbid FROM v$database) ORDER BY a.snap_id;
Ready-to-Run Exadata Health Snapshot Script
This script gives you a complete Exadata health picture from the database tier in under 30 seconds. Run it any time you need a quick status check — first thing in the morning, before a patching window, or when a user reports a performance problem.
-- ================================================
-- EXADATA HEALTH SNAPSHOT
-- Run as SYSTEM or a DBA user on any RAC node
-- ================================================
PROMPT ============================================
PROMPT 1. STORAGE CELL INVENTORY
PROMPT ============================================
SELECT cell_path,
cell_type,
CASE WHEN cell_hash > 0 THEN 'REACHABLE' ELSE 'UNREACHABLE' END
AS status
FROM v$cell
ORDER BY cell_path;
PROMPT ============================================
PROMPT 2. SMART SCAN CAPABILITY CHECK
PROMPT ============================================
SELECT name AS disk_group,
a.value AS smart_scan_capable
FROM v$asm_diskgroup dg,
v$asm_attribute a
WHERE dg.group_number = a.group_number
AND a.name = 'cell.smart_scan_capable'
ORDER BY name;
PROMPT ============================================
PROMPT 3. SMART SCAN EFFICIENCY (INSTANCE LIFETIME)
PROMPT ============================================
SELECT name,
ROUND(value / 1024 / 1024 / 1024, 2) AS value_gb
FROM v$sysstat
WHERE name IN (
'cell scans',
'cell physical IO interconnect bytes',
'cell physical IO bytes eligible for predicate offload',
'cell physical IO interconnect bytes returned by smart scan',
'cell physical IO bytes saved by storage index',
'cell IO uncompressed bytes',
'physical read total bytes'
)
ORDER BY name;
PROMPT ============================================
PROMPT 4. SMART SCAN OFFLOAD EFFICIENCY %
PROMPT ============================================
SELECT ROUND(
(SELECT value FROM v$sysstat
WHERE name =
'cell physical IO interconnect bytes returned by smart scan')
/ NULLIF(
(SELECT value FROM v$sysstat
WHERE name =
'cell physical IO bytes eligible for predicate offload'), 0)
* 100, 2) AS pct_data_returned,
ROUND(100 -
(SELECT value FROM v$sysstat
WHERE name =
'cell physical IO interconnect bytes returned by smart scan')
/ NULLIF(
(SELECT value FROM v$sysstat
WHERE name =
'cell physical IO bytes eligible for predicate offload'), 0)
* 100, 2) AS pct_io_saved
FROM dual;
PROMPT ============================================
PROMPT 5. TOP SQL BY SMART SCAN I/O SAVED
PROMPT ============================================
SELECT sql_id,
ROUND(io_cell_offload_eligible_bytes / 1073741824, 2) AS eligible_gb,
ROUND(io_cell_offload_returned_bytes / 1073741824, 2) AS returned_gb,
ROUND((1 - io_cell_offload_returned_bytes /
NULLIF(io_cell_offload_eligible_bytes, 0)) * 100, 1)
AS pct_saved,
executions,
ROUND(elapsed_time / 1000000 / NULLIF(executions,0), 2)
AS avg_elapsed_sec
FROM v$sql
WHERE io_cell_offload_eligible_bytes > 104857600
ORDER BY io_cell_offload_eligible_bytes DESC
FETCH FIRST 10 ROWS ONLY;
PROMPT ============================================
PROMPT 6. CELL SOFTWARE VERSION CHECK
PROMPT ============================================
SELECT cell_path,
EXTRACTVALUE(
XMLTYPE(confval),
'/cli-output/cell/releaseVersion'
) AS cell_version
FROM v$cell_config
WHERE conftype = 'CELL'
ORDER BY cell_path;
PROMPT ============================================
PROMPT 7. ASM DISK GROUP STATUS
PROMPT ============================================
SELECT name,
state,
type,
ROUND(total_mb / 1024, 1) AS total_gb,
ROUND(free_mb / 1024, 1) AS free_gb,
ROUND((1 - free_mb / NULLIF(total_mb,0)) * 100, 1)
AS pct_used
FROM v$asm_diskgroup
ORDER BY name;
PROMPT ============================================
PROMPT 8. ACTIVE SESSIONS WAIT EVENTS
PROMPT ============================================
SELECT event,
COUNT(*) AS session_count,
ROUND(AVG(seconds_in_wait), 1) AS avg_wait_sec
FROM v$session
WHERE status = 'ACTIVE'
AND wait_class != 'Idle'
GROUP BY event
ORDER BY session_count DESC
FETCH FIRST 10 ROWS ONLY;
PROMPT ============================================
PROMPT SNAPSHOT COMPLETE
PROMPT ============================================
Daily Monitoring Checklist for Exadata DBAs
| Check | How | What to Look For |
|---|---|---|
| Cell daemon status | dcli -g cell_group "service celld status" |
All three daemons (cellsrv, MS, RS) running on every cell |
| Cell alerts | dcli -g cell_group "cellcli -e LIST ALERTHISTORY" |
Any new critical or warning alerts since yesterday |
| Disk status | dcli -g cell_group "cellcli -e LIST PHYSICALDISK ATTRIBUTES name,status" |
All disks showing normal — any predictive failure warnings |
| Smart Scan efficiency | Health snapshot script — section 4 | Offload efficiency above 80% for analytic workloads |
| Flash cache hit rate | dcli -g cell_group "cellcli -e LIST METRICCURRENT FC_IO_BY_R_SEC" |
Flash reads significantly higher than disk reads |
| Cell CPU | dcli -g cell_group "cellcli -e LIST METRICCURRENT CL_CPUT" |
No cell consistently at 100% — indicates processing bottleneck |
| ASM disk group space | Health snapshot script — section 7 | No disk group above 85% used |
| Top SQL Smart Scan I/O | Health snapshot script — section 5 | High-I/O queries are achieving high offload % — investigate any near 0% |
Summary — Monitoring Tools at a Glance
- V$CELL — confirms all storage cells are visible and reachable from the database
- V$CELL_CONFIG — cell software version and configuration attributes without logging into each cell
- V$SYSSTAT cell statistics — the most important monitoring metrics: Smart Scan bytes eligible, returned, and Storage Index savings. Calculate the efficiency ratio to confirm Smart Scan is working.
- cellcli — run on individual storage cells for disk status, flash cache, CPU, temperature, and alerts. Use
cellcli -efor single commands without interactive mode. - dcli — run any cellcli command across all cells simultaneously. Use the cell_group file to target all cells at once.
- AWR Exadata sections — Smart Statistics, Cell Server Charts, Flash Cache Statistics, and IORM Statistics — automatically populated on Exadata, no extra configuration needed.
- Health snapshot script — eight sections covering cells, Smart Scan efficiency, top SQL, disk groups, and active waits. Run any time from SQL*Plus.
No comments:
Post a Comment