Exadata Smart Scan, Storage Indexes, and HCC — The Three Features That Change Everything
If you ask an experienced Exadata DBA to name the three features that deliver the most real-world performance benefit, the answer is always the same: Smart Scan, Storage Indexes, and Hybrid Columnar Compression. Together, these three features can reduce the I/O a query performs by over 99% compared to traditional storage.
This article explains how each one works internally, what conditions must be met for them to activate, how they interact with each other, and how to verify using SQL that they are actually doing their job. This is the article to bookmark when you need to explain Exadata performance to a developer or architect.
These three features work together as a stack. Storage Indexes eliminate I/O before disk is read. Smart Scan filters what comes off disk before it crosses the network. HCC reduces what is stored on disk in the first place. They compound — the benefit of all three together is greater than the sum of each individually.
Feature 1 — Smart Scan
Smart Scan is Exadata's SQL offload capability. Instead of reading raw database blocks from storage and filtering them at the database server, Smart Scan pushes the filtering work down to the storage cell — so only matching rows and requested columns ever travel across the RoCE network to the database server.
Article 3 covered the mechanics in full. This article focuses on the activation conditions, what disables Smart Scan, and how to verify it is working.
The four conditions required for Smart Scan
Smart Scan does not activate automatically for every query. Four conditions must all be true simultaneously:
- Direct Path Read must be used. Smart Scan is invoked from within Oracle's direct path read code path — not from the buffer cache read path. If Oracle reads from the buffer cache, it never calls the Smart Scan function (kcfis_read). Direct path reads bypass the buffer cache and read data into PGA memory instead.
- The object must be stored on Exadata storage. The ASM disk group containing the object must have
cell.smart_scan_capable = TRUE. Objects on non-Exadata storage cannot use Smart Scan — the storage layer has no cellsrv to process the iDB request. - The operation must be a full scan. Smart Scan activates for full table scans, fast full index scans, and fast full bitmap index scans. Single-block lookups via index range scans do not qualify — they use conventional I/O via the buffer cache and are not candidates for offloading.
- CELL_OFFLOAD_PROCESSING must be TRUE. This database initialisation parameter controls whether Smart Scan processing is enabled. It defaults to TRUE on Exadata. It must not have been explicitly disabled at the instance, session, or statement level.
The most common reason Smart Scan does not activate in practice is that the query is serial and the segment is small enough that Oracle decides to use buffer cache reads instead of direct path reads. Parallel queries automatically use direct path reads and therefore automatically qualify for Smart Scan. For serial queries, Oracle makes a dynamic decision at runtime based on segment size relative to buffer cache size.
When Direct Path Read activates for serial queries
For parallel queries — always. For serial queries — Oracle makes a dynamic decision at runtime for each segment separately, based on:
- The size of the segment being scanned — larger segments are more likely to trigger direct path reads
- The current buffer cache size and how much of the segment is already cached
- The
_small_table_thresholdhidden parameter — segments below approximately 5x this value may use buffer cache reads - Whether the table has been recently accessed and is warm in the cache
This dynamic decision can cause inconsistent Smart Scan behaviour for the same query run at different times. We cover the diagnosis and resolution of this in detail in Article 6.
What Smart Scan does once activated
| Smart Scan Operation | What Is Eliminated | Where It Happens |
|---|---|---|
| Predicate filtering | Rows that do not match the WHERE clause — never sent to the DB node | Storage cell CPUs via CELLOFLSRVn |
| Column projection | Columns not in the SELECT list or JOIN conditions — never transmitted | Storage cell CPUs via CELLOFLSRVn |
| Bloom filter join offload | Rows that cannot survive a join with a dimension table — eliminated at the cell | Storage cell CPUs via CELLOFLSRVn |
| HCC decompression | HCC data is decompressed at the cell before filtering — only matching rows sent | Storage cell CPUs via CELLOFLSRVn |
Smart Scan and the buffer cache
Smart Scan results are not stored in the buffer cache. Because cellsrv returns row sets rather than raw block images, the result does not conform to Oracle's block format and cannot be cached. This is intentional — Smart Scan is designed for large analytic scans where caching the entire dataset would overwhelm the buffer cache. The data flows directly into PGA memory for assembly and is returned to the client without polluting the buffer cache with cold data.
-- Check Smart Scan statistics for the current session SELECT name, value FROM v$mystat ms JOIN v$statname sn ON ms.statistic# = sn.statistic# WHERE sn.name LIKE 'cell%' AND value > 0 ORDER BY sn.name; -- Key statistics to look for: -- cell physical IO interconnect bytes : total bytes sent between cell and DB node -- cell physical IO bytes eligible for predicate offload : bytes that went through Smart Scan -- cell physical IO interconnect bytes returned by smart scan : bytes actually returned after filtering -- cell scans : number of Smart Scan operations performed -- The efficiency ratio: -- bytes returned by smart scan / bytes eligible for predicate offload -- A ratio of 0.01 (1%) means 99% I/O reduction via Smart Scan
-- Check Smart Scan offload for a specific SQL statement SELECT sql_id, io_cell_offload_eligible_bytes / 1024 / 1024 / 1024 AS eligible_gb, io_interconnect_bytes / 1024 / 1024 / 1024 AS interconnect_gb, io_cell_offload_returned_bytes / 1024 / 1024 / 1024 AS returned_gb, ROUND( (1 - io_cell_offload_returned_bytes / NULLIF(io_cell_offload_eligible_bytes,0)) * 100, 2 ) AS pct_saved FROM v$sql WHERE sql_id = '&your_sql_id' ORDER BY last_active_time DESC FETCH FIRST 5 ROWS ONLY; -- pct_saved approaching 99% = excellent Smart Scan efficiency -- pct_saved = 0 = Smart Scan not activating for this query
-- After running a query, check its plan from cursor cache SELECT * FROM TABLE( DBMS_XPLAN.DISPLAY_CURSOR('&sql_id', NULL, 'ALLSTATS LAST') ); -- In the plan output, Smart Scan shows as: -- TABLE ACCESS STORAGE FULL -- "STORAGE" indicates Smart Scan offload -- Without Smart Scan it would show: -- TABLE ACCESS FULL -- no "STORAGE" keyword
-- Disable Smart Scan for this session only (testing/diagnosis) ALTER SESSION SET cell_offload_processing = FALSE; -- Re-enable Smart Scan ALTER SESSION SET cell_offload_processing = TRUE; -- Force direct path read for serial queries (testing only) -- Not recommended for production without Oracle guidance ALTER SESSION SET "_serial_direct_read" = ALWAYS;
Feature 2 — Storage Indexes
Storage Indexes are an in-memory optimisation that happens before Smart Scan. While Smart Scan reduces what crosses the network after reading disk, Storage Indexes eliminate disk reads entirely. They represent the fastest possible I/O — no I/O at all.
What a Storage Index is
A Storage Index is an in-memory data structure maintained automatically on each storage cell. It tracks the minimum and maximum values of up to eight columns for each storage region — a 1 MB contiguous area of disk on the storage cell called a storage region.
When cellsrv receives a Smart Scan request with a predicate, it first consults the Storage Index for each storage region before reading any data from disk. If the Storage Index shows that the min/max range of the predicate column in a given region cannot possibly satisfy the WHERE clause, that entire region is skipped — no disk I/O, no flash read, nothing.
A concrete example
Consider a table with a column ORDER_DATE and a query with WHERE ORDER_DATE = DATE '2026-04-21'.
- Storage region 1 contains ORDER_DATE values from 01-Jan-2023 to 31-Dec-2023. The Storage Index records min=01-Jan-2023, max=31-Dec-2023. Since 21-Apr-2026 is outside this range, region 1 is skipped entirely — zero disk I/O.
- Storage region 2 contains ORDER_DATE values from 01-Jan-2026 to 30-Jun-2026. The Storage Index records min=01-Jan-2026, max=30-Jun-2026. Since 21-Apr-2026 falls within this range, region 2 must be read. Smart Scan then filters it further.
- Storage region 3 contains ORDER_DATE values from 01-Jul-2026 to 31-Dec-2026. Min=01-Jul-2026 is already past 21-Apr-2026, so region 3 is skipped — zero disk I/O.
Storage Indexes are most effective when data has natural clustering — when similar values tend to be stored in adjacent storage regions. Data loaded in ORDER_DATE sequence, or partitioned by date, will have excellent Storage Index effectiveness. Randomly distributed data provides less benefit because min/max ranges overlap across regions and fewer regions can be eliminated.
How Storage Indexes are built and maintained
- Built automatically by Exadata System Software — no DBA action required, no DDL, no maintenance
- Maintained in the storage cell's DRAM — separate from XRMEM, part of cellsrv's own memory
- Updated continuously as data is written to each storage region — always current
- Tracks up to 8 columns per storage region — selected automatically based on observed query predicates
- Rebuilt automatically if the cell restarts — they are regenerated from the data patterns observed after restart
- Cannot be created, dropped, or explicitly managed by the DBA — they are entirely automatic
Verifying Storage Index savings
-- Check storage index bytes saved for current session SELECT name, value / 1024 / 1024 / 1024 AS gb_saved FROM v$mystat ms JOIN v$statname sn ON ms.statistic# = sn.statistic# WHERE sn.name = 'cell physical IO bytes saved by storage index' AND value > 0; -- System-wide storage index savings SELECT name, value / 1024 / 1024 / 1024 AS total_gb_saved FROM v$sysstat WHERE name = 'cell physical IO bytes saved by storage index'; -- From AWR: compare eligible bytes vs storage index savings -- High ratio = Storage Indexes are very effective for this workload
# SSH to any storage cell ssh root@cell01 cellcli # Check storage index statistics CellCLI> LIST METRICCURRENT WHERE name LIKE 'SI_%' # SI_NUM_COLUMN_COLUMNS : number of columns tracked by storage indexes # SI_STORAGE_SAVED : bytes saved by storage index eliminations
Storage Indexes vs traditional database indexes
| Factor | Traditional Index | Exadata Storage Index |
|---|---|---|
| Creation | Manual — DBA runs CREATE INDEX | Automatic — built by Exadata System Software |
| Maintenance | Manual — rebuild, analyse, monitor fragmentation | Fully automatic — no DBA involvement |
| Storage cost | Significant — index segments use database storage | Zero — stored in cell DRAM, no disk storage cost |
| DML overhead | Every INSERT, UPDATE, DELETE updates the index | No DML overhead — maintained asynchronously |
| Granularity | Row level — points to specific rows | Storage region level (1 MB) — eliminates entire regions |
| Best for | OLTP lookups — returns specific rows via index range scan | Analytics — eliminates large I/O regions during full scans |
| Works with Smart Scan | Index range scans bypass Smart Scan entirely | Works before Smart Scan — eliminates regions before Smart Scan reads them |
Feature 3 — Hybrid Columnar Compression (HCC)
Hybrid Columnar Compression is an Exadata-exclusive compression technology that enables dramatic reductions in storage for large datasets. Storage savings typically range from 5x to 20x, with an industry-leading average of 10x. HCC is not available on standard Oracle storage — it requires Exadata storage cells.
How HCC works internally — Compression Units
Standard Oracle compression stores data in row format — each row is stored together, one row after another. HCC uses a fundamentally different approach based on Compression Units (CUs).
A Compression Unit is a logical storage structure larger than a database block — CU size ranges from 32 KB to 256 KB depending on the compression tier. Within a CU, HCC groups data by column rather than by row. All values from column 1 for a set of rows are stored together, then all values from column 2, and so on.
Because similar data types and values are grouped together within each column, compression algorithms achieve dramatically higher ratios than row-based storage. A column of account status codes containing mostly 'ACTIVE' compresses to almost nothing when stored together. In row format, those same values are interspersed with dozens of other column values and compress much less efficiently.
The four HCC compression tiers
HCC provides four compression tiers — two Query tiers and two Archive tiers. Each tier trades compression ratio against query performance and DML support.
| Tier | Syntax | Compression | Query Speed | Best Use |
|---|---|---|---|---|
| Query Low | COMPRESS FOR QUERY LOW |
~6x | Fastest | Active data warehouse, frequent reads, fast load requirement |
| Query High | COMPRESS FOR QUERY HIGH |
~10x | Fast | Default choice for data warehouse tables — best compression/performance balance |
| Archive Low | COMPRESS FOR ARCHIVE LOW |
~15x | Moderate | Historical data — queried occasionally, older partitions |
| Archive High | COMPRESS FOR ARCHIVE HIGH |
~50x | Slow | Cold archives — regulatory compliance, accessed very rarely |
HCC and DML — the important limitation
HCC is best suited for data with no or very limited DML operations. This is the most important constraint to communicate to application teams.
The reason is the Compression Unit locking model. HCC uses one lock per CU — not one lock per row. When an UPDATE or DELETE targets a single row inside a CU, the entire CU is migrated to row format (decompressed) to allow the DML. This can cause significant space and performance problems for tables with frequent DML.
- Bulk loads using direct path INSERT — fully supported, excellent performance
- INSERT ... SELECT — supported with direct path from Oracle Database 12.2 onwards
- SELECT queries including Smart Scan — fully supported, HCC decompression happens at the cell
- Frequent UPDATE or DELETE — not recommended. Use Advanced Row Compression instead.
- Infrequent single-row updates — acceptable but each update decompresses the affected CU
HCC and Smart Scan — how they work together
HCC and Smart Scan are designed to work together on Exadata. When a Smart Scan reads HCC-compressed data, the decompression happens on the storage cell's CPUs — not on the database server. The storage cell decompresses the CU, applies predicate filtering, projects the requested columns, and returns only the matching uncompressed rows.
The result is a compounding effect: HCC reduces the number of blocks on disk (fewer blocks to read), Smart Scan filters what comes off those blocks (less data across the network), and Storage Indexes eliminate entire regions before either feature operates (some blocks never read at all).
For data analytics workloads, Exadata Smart Flash Cache implements an additional optimisation called Exadata Columnar Flash Cache (CELLMEMORY). As HCC-compressed data is loaded into Smart Flash Cache, Exadata automatically transforms it into a pure columnar format. Smart Scans on this columnar flash data run faster because they read only the selected columns — further reducing flash I/Os and storage server CPU consumption.
How to create HCC-compressed tables and partitions
-- Create a new table with HCC Query High (default — recommended) CREATE TABLE sales_history ( sale_id NUMBER, customer_id NUMBER, sale_date DATE, amount NUMBER, status VARCHAR2(20) ) COMPRESS FOR QUERY HIGH; -- Create with Archive High for cold data CREATE TABLE audit_log_archive ( log_id NUMBER, log_date DATE, action VARCHAR2(100) ) COMPRESS FOR ARCHIVE HIGH; -- Compress an existing table (rebuilds the table) ALTER TABLE sales_history MOVE COMPRESS FOR QUERY HIGH; -- Add HCC to a new partition on a partitioned table ALTER TABLE sales ADD PARTITION p_2025 VALUES LESS THAN (DATE '2026-01-01') COMPRESS FOR QUERY HIGH; -- Change compression tier for an existing partition ALTER TABLE sales MOVE PARTITION p_2020 COMPRESS FOR ARCHIVE HIGH;
-- Check table-level compression SELECT table_name, compression, compress_for FROM dba_tables WHERE owner = 'APPS' AND compression = 'ENABLED' ORDER BY table_name; -- Check partition-level compression SELECT table_name, partition_name, compression, compress_for FROM dba_tab_partitions WHERE table_owner = 'APPS' ORDER BY table_name, partition_position; -- Estimate compression savings before applying EXEC DBMS_COMPRESSION.GET_COMPRESSION_RATIO( scratchtbsname => 'TEMP', ownname => 'APPS', tabname => 'SALES_HISTORY', partname => NULL, comptype => DBMS_COMPRESSION.COMP_FOR_QUERY_HIGH, blkcnt_cmp => :blk_cmp, blkcnt_uncmp => :blk_uncmp, row_cmp => :row_cmp, row_uncmp => :row_uncmp, cmp_ratio => :ratio, comptype_str => :type_str );
How All Three Features Work Together
The real power of Exadata is not any single feature — it is the combination. Consider the same 1 TB query from Article 3, now with all three features active.
| Stage | Feature | What Happens | I/O Remaining |
|---|---|---|---|
| Before disk | HCC | Data compressed 10x — 1 TB of logical data stored in 100 GB on disk. Fewer blocks to read. | 100 GB on disk (was 1 TB) |
| Before disk read | Storage Index | 70% of storage regions eliminated because ORDER_DATE min/max cannot satisfy WHERE clause. Zero disk reads for those regions. | 30 GB must be read from disk |
| After disk read | Smart Scan | Cell decompresses the 30 GB, applies predicate filtering (only 2% match), projects only requested columns. Only matching rows cross the network. | ~600 MB crosses network |
| At DB node | — | DB node assembles result from all cells. Final aggregation if needed. Returns result to user. | Result set returned |
Without any Exadata features: 1 TB crosses the storage network. With all three features combined: approximately 600 MB crosses the network — a reduction of over 99.9%. The query that took hours on traditional storage can complete in minutes or seconds.
Complete Smart Scan Verification Script
Use this script before and after a query to measure exactly how much work Exadata saved.
-- Step 1: Capture baseline statistics before your query CREATE TABLE ss_before AS SELECT name, value FROM v$mystat ms JOIN v$statname sn ON ms.statistic# = sn.statistic# WHERE sn.name IN ( 'cell scans', 'cell physical IO interconnect bytes', 'cell physical IO bytes eligible for predicate offload', 'cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index', 'physical read total bytes' ); -- Step 2: Run your query here -- Step 3: Compare after vs before SELECT after.name, after.value - before.value AS delta FROM (SELECT name, value FROM v$mystat ms JOIN v$statname sn ON ms.statistic# = sn.statistic# WHERE sn.name IN ( 'cell scans', 'cell physical IO interconnect bytes', 'cell physical IO bytes eligible for predicate offload', 'cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index', 'physical read total bytes' )) after JOIN ss_before before ON after.name = before.name WHERE after.value - before.value > 0 ORDER BY after.name; -- Key ratios to calculate from the output: -- -- Smart Scan efficiency: -- returned_by_smart_scan / eligible_for_predicate_offload -- Lower = better. 0.01 means 99% reduction. -- -- Storage Index efficiency: -- bytes_saved_by_storage_index / eligible_for_predicate_offload -- Higher = better. Shows what Storage Index eliminated before disk read. -- -- Overall I/O reduction: -- (physical_read_total - interconnect_bytes) / physical_read_total -- Shows total I/O saving across both features.
Summary — The Three Features in One View
| Feature | When It Operates | What It Eliminates | Managed By |
|---|---|---|---|
| Storage Index | Before disk read — during Smart Scan I/O planning | Entire 1 MB storage regions where predicates cannot match | Automatic — no DBA action |
| Smart Scan | During disk read — when Direct Path Read + Full Scan conditions are met | Non-matching rows and unrequested columns before network transmission | Automatic — requires CELL_OFFLOAD_PROCESSING=TRUE |
| HCC | Always — data stored in compressed format, decompressed at cell during Smart Scan | Disk space and I/O — 6x to 50x fewer blocks to read in the first place | DBA configures at table/partition level — automatic after that |
- Smart Scan requires Direct Path Read, a full scan operation, Exadata storage, and CELL_OFFLOAD_PROCESSING=TRUE
- Storage Indexes are fully automatic — track min/max per storage region for up to 8 columns — no creation or maintenance by the DBA
- HCC has four tiers — Query Low (~6x), Query High (~10x, default), Archive Low (~15x), Archive High (~50x)
- HCC is for data with minimal DML — frequent updates should use Advanced Row Compression instead
- All three features compound — HCC reduces disk I/O, Storage Indexes eliminate regions, Smart Scan filters network I/O
- Verify with
cell physical IO bytes eligible for predicate offloadvscell physical IO interconnect bytes returned by smart scanin V$SYSSTAT or V$MYSTAT
No comments:
Post a Comment