Thursday, February 16, 2023

Huge Pages: In the context of Exadata

 

Huge Pages: In the context of Exadata


Huge Pages : In the context of Exadata


 This article is focused on the Linux/x86 platform. Other platforms have different configurations for Huge Pages but the core concepts remain the same.

Memory is an important resource for running any software on a system but is even more important for running Oracle Database. Memory plays an important role in database performance because keeping data in memory leads to higher database performance. Oracle In-Memory Database places even more data into a system's main memory instead of relying on storage I/O. Modern applications need databases to deliver lower latency and higher throughput than ever before. Large scale databases often extend into the petabyte range, while I/O latency needs have pushed into the range of microseconds.

Due to these ever-increasing demands, it has become even more important to fit frequently accessed data in system memory. Advancements in technology with higher number of processor cores and faster networks make it evident that systems are pushing into the range where terabytes of memory are required. Let us see how memory is organized in these large-scale systems.

System Memory - How is it organized?

Memory is organized in blocks known as pages. A page in Linux x86-based systems is 4K in size (4096 bytes), with total memory stretching into terabytes. A system with 1 gigabyte of memory has 256,000 pages using a 4K page size. All of these pages are managed by the memory management unit (MMU) of the CPU. Consider for example, a system that has 1 TB (terabyte) of memory using the standard 4K (4096 bytes) page size. That system will have to manage more than 268 million pages! The overhead for the MMU to manage that many pages  will degrade performance.

Page Entries - How are they managed?

The MMU can manage large amounts of memory only by using large numbers of page table entries in the MMU, which introduces high amounts of performance overhead. One alternative is to increase the page size from 4K bytes to something larger to reduce the number of page table entries. However, this approach would waste memory for applications that work perfectly well with 4K page sizes. The solution is to use separate memory allocations for structures like the Oracle System Global Area (SGA) that require large amounts of memory, leaving conventional 4K pages for other uses. This is where Huge Pages comes into play. Huge Pages is a feature of the Oracle Linux kernel that allows the Operating System to use memory pages greater than the default (4K) page size. We recommend using Huge Pages size of 2MB for allocating memory to the Oracle System Global Area.

Oracle Database, Exadata and Huge Pages…

Exadata is an engineered system that is specifically optimized for running Oracle Database. Exadata is widely used for high performance databases as well as for high density database consolidation. The current generation of Exadata systems have terabytes of system memory and deliver unsurpassed database performance. System memory needs to be configured properly to realize the performance and consolidation benefits of the Exadata platform. Exadata systems need to be configured properly to ensure the SGA for each database resides in Huge Pages.

How to set Huge Pages in Exadata:

A System Administrator (root privilege) is required to configure huge pages in a Linux environment such as Exadata. The configuration file sysctl.conf is updated and is in /etc directory. The parameter for hugepages is vm.nr_hugepages. Normally the system will need a reboot for the number of huge pages to take effect, but sysadmin can dynamically load the file with the command sysctl -p.

oracle@slcqah01adm03 ~]$ cat /proc/meminfo | grep Huge

AnonHugePages:         0 kB

ShmemHugePages:        0 kB

HugePages_Total:   77064

HugePages_Free:      673

HugePages_Rsvd:      673

HugePages_Surp:        8

Hugepagesize:       2048 kB

In the above example there are 77064 Huge Pages configured on the system. With this number of Huge Pages, Oracle Database(s) can be configured with about 150GB (77064 x 2M) of SGA.

How to calculate number of Huge Pages:

Memory configuration for databases is critical for optimal performance. In Oracle database, there is shared memory that contains the SGA (System Global Area) and private memory that contains the PGA (Process Global Area). OLTP (online transaction processing) workloads typically require more data in the Buffer Cache (an area of the SGA) for fastest access. Data Warehouse or Analytic workloads typically offload more work to Exadata Storage Servers and need more private memory in the PGA.

Although SGA sizes can vary depending on the type of workload such as OLTP vs. Analytics, Oracle Databases perform optimally when the entire SGA resides in Huge Pages. The appropriate size of the SGA is system-dependent and is limited by resources available on the system. The proper number of Huge Pages is based on the aggregate size of all SGAs for Databases running on the system. Using a Huge Page size of 2M, an SGA sized at 50G (51200M) will require 51200/2 = 25600 Huge Pages. For the sake of simplicity, Oracle recommends adding extra Huge Pages to account for byte offset and page alignment requirements.

Oracle Database will use the SGA size (specified in the spfile.ora) to calculate the number of Huge Pages required. Oracle will fit the SGA in the Huge Pages and/or standard 4K pages depending on the parameter setting for USE_LARGE_PAGES.

Oracle provides a script to compute values for recommended Huge Pages configuration for the current shared memory segments on Oracle Linux Systems. Please refer to MOS Doc ID: 401749.1

What are the options for USE_LARGE_PAGES?

The parameter (set in init.ora configuration file) USE_LARGE_PAGES has several options outlined in the manual as follows: You can refer to the Oracle Documentation for 19c Oracle Database at this link (Use_Large_Pages) 

FALSE: With this setting database instance will not use Huge Pages. This setting is not recommended for use with large SGA sizes because better performance can be achieved using Large Pages (also known as Huge Pages).

TRUE: This setting allows Oracle databases to use a combination of Huge Pages and 4K pages, by allocating as many Huge Pages as possible before allocating 4K pages. This setting is not recommended for use with large SGA sizes because better performance can be achieved by using Huge Pages exclusively using the ONLY setting.

AUTO: This setting is intended as a convenience to eliminate the need for coordination between DBA and SysAdmin. Oracle recommends using the ONLY setting instead. The AUTO setting uses SGA sizes to determine the number of large pages required in the Operating System and automatically configures as many Huge Pages as possible. The SGA of each database may use a combination of Huge Pages and 4K pages, so this setting is not recommended.

ONLY: This parameter value was introduced to make sure all the SGA will reside in Huge Pages. Oracle recommends this setting for consistent performance. Databases will fail to start if there aren’t sufficient Huge Pages available, but this is preferred over databases starting with a sub-optimal configuration.

AUTO_ONLY: As with AUTO, this setting is intended as a convenience to eliminate the need for coordination between DBA and SysAdmin. This setting automatically determines the number of Huge Pages required to fit each SGA, and the SGA will only reside in Huge Pages. Oracle recommends using the ONLY setting.

Note for Exadata Systems:

To maximize performance and stability, Oracle recommends always using large memory pages for the SGA, setting Huge Pages in advance, and using USE_LARGE_PAGES=ONLY for optimal system performance.  Oracle automation tools use the setting USE_LARGE_PAGES=ONLY. This includes databases created using Oracle Exadata Deployment Assistant (OEDA) and databases created on a cloud-based Exadata deployment (using Exadata Database Service). The Oracle Exadata Configuration check utility (Exachk) also checks for this setting and recommends USE_LARGE_PAGES=ONLY as well.

When USE_LARGE_PAGES=ONLY, the database fails to start if insufficient huge pages (SGA * 2MB = Total number of Huge Pages required) are available to contain the SGA. In this case, the system administrator must configure the operating system to ensure enough huge pages are available to cover all databases, or SGA sizes need to be decreased to stay within the number of Huge Pages available.


Conclusion:

Exadata in on-premises and Cloud deployments is a consolidation platform that enables massive scalability and terabytes of system memory. The standard Linux memory page size is 4K, which is not appropriate for massive SGA sizes of modern high performance Oracle Databases on Exadata. Oracle recommends using Huge Pages for SGA memory exclusively through the setting USE_LARGE_PAGES=ONLY on all Exadata systems.


ortunately Oracle provide a handy script (search for hugepages_setting.sh) that can be used to work out what huge pages settings you have, so let’s start here:



Script-

hugepages_settings.sh


This script will show you what value to use for hugepages.

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
# on Oracle Linux
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com

# Welcome text echo " This script is provided by Doc ID 401749.1 from My Oracle Support (http://support.oracle.com) where it is intended to compute values for the recommended HugePages/HugeTLB configuration for the current shared memory segments on Oracle Linux. Before proceeding with the execution please note following: * For ASM instance, it needs to configure ASMM instead of AMM. * The 'pga_aggregate_target' is outside the SGA and you should accommodate this while calculating the overall size. * In case you changes the DB SGA size, as the new SGA will not fit in the previous HugePages configuration, it had better disable the whole HugePages, start the DB with new SGA size and run the script again. And make sure that: * Oracle Database instance(s) are up and running * Oracle Database 11g Automatic Memory Management (AMM) is not setup (See Doc ID 749851.1) * The shared memory segments can be listed by command: # ipcs -m Press Enter to proceed..." read # Check for the kernel version KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'` # Find out the HugePage size HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'` if [ -z "$HPG_SZ" ];then echo "The hugepages may not be supported in the system where the script is being executed." exit 1 fi # Initialize the counter NUM_PG=0 # Cumulative number of pages required to handle the running shared memory segments for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"` do MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q` if [ $MIN_PG -gt 0 ]; then NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q` fi done RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q` # An SGA less than 100MB does not make sense # Bail out if that is the case if [ $RES_BYTES -lt 100000000 ]; then echo "***********" echo "** ERROR **" echo "***********" echo "Sorry! There are not enough total of shared memory segments allocated for HugePages configuration. HugePages can only be used for shared memory segments that you can list by command: # ipcs -m of a size that can match an Oracle Database SGA. Please make sure that: * Oracle Database instance is up and running * Oracle Database 11g Automatic Memory Management (AMM) is not configured" exit 1 fi # Finish with results case $KERN in '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`; echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;; '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;; '3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;; '3.10') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;; '4.1') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;; '4.14') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;; '4.18') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;; '5.4') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;; *) echo "Kernel version $KERN is not supported by this script (yet). Exiting." ;; esac
# End 



Reference from - 

Oracle Blog

No comments: