Tuesday 11 March 2014

SAN storage basics

One of the customers I work with was experiencing poor storage performance on his database server. The customer uses the database to run a transaction host that processes tens of millions of transactions per month.

The transaction host's back end is an Oracle 11g database running on Windows Server 2003, and it uses a SAN for storage.

This customer's database performance reports have consistently highlighted poor I/O throughput and excessive virtual memory paging.

Therefore, I decided to take a closer look at SANs to understand how they work. This article puts together a few basic points about SAN storage technology collected during that study.

SANs are specialized high-performance, high-capacity network-based storage solutions, and they require specialized hardware and software to work.

Sometimes, incorrect configuration of a SAN or its components can cause performance and availability problems, so it's good for a DBA/solution architect to know a bit about how they work.

Below are some basic points related to SANs which are useful to keep in mind when faced with SAN storage performance problems.

# What is the manufacturer/model of the host bus adapter (HBA) cards installed on the database server?

HBAs are cards which slot into the database server. They translate the server's SCSI traffic into Fibre Channel frames, enabling the client (the database server) to talk to the SAN, which only understands the Fibre Channel protocol.

The Fibre Channel protocol basically consists of SCSI commands transmitted over the Fibre Channel network, on fiber-optic cables. Data is transmitted as laser light instead of electrical signals.


# What is the capacity of the installed HBA in Mbits/Gbits per second?

Perhaps this could be compared to the numbers in the AWR report's load profile section.
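As a rough illustration (assuming access to the AWR history views, which requires the Diagnostics Pack license), a query like the sketch below pulls the peak and average I/O throughput from the workload repository, for comparison against the HBA's rated bandwidth. Keep in mind that HBA ratings are quoted in bits per second, while these figures are in bytes per second.

-- a sketch: peak and average I/O rates recorded in AWR
select metric_name,
       round(max(maxval)/1024/1024)  peak_mb_per_sec,
       round(avg(average)/1024/1024) avg_mb_per_sec
  from dba_hist_sysmetric_summary
 where metric_name in ('Physical Read Total Bytes Per Sec',
                       'Physical Write Total Bytes Per Sec')
 group by metric_name;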

# Check how many such adapter cards are installed on the database server and whether all are functional.

If one HBA is not functional, the others may have to process extra traffic, depending on how the HBAs are configured.

# What is the model and capacity (in Mbits/Gbits per second) of the SAN Fibre Channel switch, if one is installed?

A Fibre Channel switch is much like an Ethernet switch, except that it links Fibre Channel devices and networks instead of Ethernet networks.

For example, the database server may be hooked up to a Fibre Channel switch, which in turn hooks up to the storage array.

# How many Fibre Channel ports are available on the switch?
More ports may mean more servers attached to those ports, each demanding I/O service from the SAN and adding to its load.


# Check whether the SAN environment is using a Fibre Channel hub rather than a switch; if so, what are the hub's Mbit/s numbers?

Fibre Channel hubs were used before switches were introduced, but they are rare today. They are analogous to Ethernet hubs: only one device can communicate at a time, while the others wait their turn.

# If using a Fibre Channel hub, check whether servers other than the database server also use the disk array - they would need to queue up to use the hub.

# Check the rated capacity of the disk array in Mbits per second.

# Get the number of disks in the disk array.

# Get the RAID configuration created, if any, on the disk array.

The RAID configuration will have an impact on read/write performance. For example, RAID 5 incurs a write penalty of roughly four physical I/Os per logical write (read data, read parity, write data, write parity), while RAID 10 needs only two.

# Get a network diagram showing all the connections.

Database server <--> any Ethernet switches <--> SAN switch/hub <--> disk array.

Try to determine how many network hops there are from the database server to the storage disks - the fewer, the better.

# Determine whether any patch panels are employed in the creation/configuration of the SAN network.

Patch panels join two fiber-optic cables together, and if the geometric centers of the cables are not aligned well, the panels can become a source of signal loss.

# Check the estimated length of fiber-optic cable running from the database server's host bus adapter to the SAN Fibre Channel switch, and from there on to the storage array.

Depending on the type of fiber-optic cable used (multimode versus single-mode, for example), there are limits on how far it can carry a signal without loss.
  
# Check whether there are any issues in the optic cabling, like bends sharper than 90 degrees or pinches in the wiring.

Fiber-optic cables are very sensitive to pinches and sharp bends, and will suffer signal loss if subjected to physical stress, so this is worth checking.

# Check whether the disk array comes with a cache, and what its size is.

# Check whether the SAN has continuous data protection or other high-availability features enabled.

# Check whether all the SAN components - the database server's host bus adapter, the Fibre Channel switch, and the disk array - communicate at the same speed.

If not, the storage network will only function as fast as the slowest component in it.
    
# Check whether the host bus adapter driver software on the database server is the correct and recommended version.


How the application uses the database is a far more frequent cause of performance problems. Still, the information above can help you understand the storage environment better and identify any bottlenecks in the SAN configuration.




Monday 10 March 2014

PDB duplication using storage snapshot



Oracle ASM supports the creation of a traditional file system on top of an ASM disk group. This file system is accessible to applications once mounted, supports all kinds of application files, and is known as ACFS (ASM Cluster File System).

The ACFS file system has a storage snapshot feature.  A storage snapshot is an image of a file system as it existed at some point in time. Storage snapshots are fast and efficient because they do not consume disk space or resources at creation time. They can be created as read-only or read-write snapshots.

Only when the file system contents are updated do the ACFS drivers do any work to maintain the snapshot.

This is called copy-on-write technology, and it enables the initial duplication of large file systems with very little effort.
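As a rough sketch of the idea (the mount point /acfs/testfs and file name here are hypothetical):

/sbin/acfsutil snap create -r snap1 /acfs/testfs   # instant; no data blocks are copied
echo "new data" >> /acfs/testfs/somefile.txt       # only now does ACFS preserve the old blocks
/sbin/acfsutil info fs /acfs/testfs                # snapshot space usage grows as changes accrue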

Starting with version 12.1, ACFS supports the storage of database datafiles in a Grid Infrastructure environment.

In this post, we will first create a container database on top of an ACFS file system.
Next, we will use the ACFS storage snapshot feature to rapidly duplicate one of its tenant (pluggable) databases.

We will use the SNAPSHOT COPY clause of the CREATE PLUGGABLE DATABASE statement.
The requirement for using the SNAPSHOT COPY clause is that the source PDB's datafiles be stored on a file system that supports storage snapshots.

When we duplicate an existing pluggable database with the SNAPSHOT COPY clause, the ACFS drivers automatically create a read-write storage snapshot and maintain it.

Duplicating an existing database otherwise requires copying the source datafiles to the new destination, which can take a long time and a lot of storage if the database is large.

The exercise will progress in stages.


Stage # 1 – Linux and Oracle Restart install

The ASM storage will use a single external-redundancy disk, exposed to ASM as an ASMLib disk. We take care to inspect the new disk group's attributes and make sure that COMPATIBLE.ADVM is set to at least 12.1.
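For reference, the attribute can be checked and, if needed, raised from SQL*Plus connected to the ASM instance - a sketch, assuming the disk group is named DATA as in this example (note that COMPATIBLE.ASM must already be at least as high as the value being set):

SQL> select name, value from v$asm_attribute
     where name like 'compatible%';

SQL> alter diskgroup DATA set attribute 'compatible.advm' = '12.1';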

Though not a supported configuration, this example uses Oracle Restart to establish the ASM environment, for testing purposes.

Stage # 2 - Creation of the ASM dynamic volume and ASM cluster file system.

We will use the ACFS file system to store the database datafiles.

Note that this is only a test; storing database datafiles on ACFS is NOT supported with Oracle Restart.

Oracle 12.1 supports database datafile storage on ACFS only with Grid Infrastructure.

Stage # 3 – Creation of a test database on the file system

We will create a test database called CDB2 containing a single PDB called TOOLS.

Stage # 4 – PDB duplication using the storage snapshot feature.

The PDB named TOOLS will be duplicated using the storage snapshot option of the CREATE PLUGGABLE DATABASE command.




Stage # 1 – Linux and Oracle Restart install

# Set up ASMLib by installing the Oracle ASMLib RPMs.
Next, we identify the Linux kernel version and download the matching kernel-specific ASMLib driver files, as recommended by the Oracle ASMLib download site.



[oracle@linux12c CDB2]$ uname -r
2.6.18-308.el5


These are the ASMLib packages listed for the above kernel:
oracleasm-2.6.18-308.el5-2.0.5-1.el5.x86_64.rpm
oracleasm-support-2.1.8-1.el5.x86_64.rpm
oracleasmlib-2.0.4-1.el5.x86_64.rpm

For this example, a single 6 GB ASMLib disk named DISK1 is configured before the Oracle Restart install.
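Roughly, the package installation and disk labeling look like this - a sketch, run as root; the partition /dev/sdb1 is a hypothetical example:

rpm -Uvh oracleasm-support-2.1.8-1.el5.x86_64.rpm \
         oracleasm-2.6.18-308.el5-2.0.5-1.el5.x86_64.rpm \
         oracleasmlib-2.0.4-1.el5.x86_64.rpm
/usr/sbin/oracleasm configure -i          # prompts for the driver owner/group
/usr/sbin/oracleasm init                  # loads the kernel driver
/usr/sbin/oracleasm createdisk DISK1 /dev/sdb1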



[oracle@linux12c CDB2]$ /usr/sbin/oracleasm listdisks
DISK1



# Set up Oracle Restart and configure GI for Oracle Restart
After configuring the grid and oracle users and the groups asmadmin, asmdba, and oinstall, the Grid Infrastructure installer GUI was started. Running the OUI as the grid user, the option to install and configure Grid Infrastructure for a standalone server was selected, and the Oracle Restart environment was created.

# Start up the ACFS drivers manually as shown (this must be done on each boot in an Oracle Restart environment)

The drivers are loaded as part of the Oracle Restart install, but they are not started automatically on boot.
If required, start the drivers using the acfsload executable found under the bin folder of the grid home, as shown.


[root@linux12c ~]# /u01/app/grid/product/12.1.0/grid/bin/acfsload start -s
ACFS-9391: Checking for existing ADVM/ACFS installation.
ACFS-9392: Validating ADVM/ACFS installation files for operating system.
ACFS-9393: Verifying ASM Administrator setup.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9322: completed


To check whether the drivers are loaded, run lsmod | grep oracle as root.
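For example (the module names follow from the acfsload output above):

# expect the oracleacfs, oracleadvm and oracleoks modules in the output
lsmod | grep oracle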

This completes the OS and software setup stage.

Stage # 2 - Creation of the ASM dynamic volume and ASM cluster file system.

First, an ADVM volume called VOL1 with 6 GB of space was carved out of the ASM disk group DATA. The screenshot below shows the details of the ADVM volume. This volume is exposed to the OS as a device file; in this case its name and path is /dev/asm/vol1-181.
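For reference, the same volume could also be created and inspected from the command line using asmcmd - a sketch, run as the grid user:

ASMCMD> volcreate -G DATA -s 6G VOL1
ASMCMD> volinfo -G DATA VOL1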
 

Next, we create an ACFS file system on top of this volume device. The file system was created using ASMCA, as shown in the screenshot. ASMCA also creates a mount point for the new file system.
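Alternatively, the file system could be created directly from the OS - a sketch using this example's volume device:

# as root
/sbin/mkfs -t acfs /dev/asm/vol1-181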



# Mount the ACFS file system as the root user.
The next step is to mount the file system so that applications can access it. (Additionally, create a folder called oradata under the mount point and give the oracle user read/write permissions on it, so that database files can be created there - see the sketch after the mount output below.)


# mount -t acfs /dev/asm/vol1-181 /u01/app/grid/acfsmounts/data_vol1



[oracle@linux12c ~]$ mount | grep acfs
/dev/asm/vol1-181 on /u01/app/grid/acfsmounts/data_vol1 type acfs (rw)
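A minimal sketch of the oradata folder setup mentioned above, assuming the oracle user's primary group is oinstall:

# as root
mkdir /u01/app/grid/acfsmounts/data_vol1/oradata
chown oracle:oinstall /u01/app/grid/acfsmounts/data_vol1/oradata
chmod 775 /u01/app/grid/acfsmounts/data_vol1/oradata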


# Optionally, do a test storage snapshot creation directly from the OS:
Storage snapshots are created using the acfsutil snap create command. Each snapshot must have a name, which must be a valid OS folder name.

Below, we specify the date of creation as the snapshot name (10mar2014).

[grid@linux12c bin]$ /sbin/acfsutil snap create -r 10mar2014 /u01/app/grid/acfsmounts/data_vol1
acfsutil snap create: Snapshot operation is complete.


# Viewing file system information using acfsutil from the OS.
We can see the file system details, with one snapshot listed.


$ /sbin/acfsutil info fs

/u01/app/grid/acfsmounts/data_vol1
    ACFS Version: 12.1.0.1.0
    flags:        MountPoint,Available
    mount time:   Mon Mar 10 13:18:06 2014
    volumes:      1
    total size:   6442450944
    total free:   1989275648
    primary volume: /dev/asm/vol1-181
        label:
        flags:                 Primary,Available,ADVM
        on-disk version:       39.0
        allocation unit:       4096
        major, minor:          252, 92673
        size:                  6442450944
        free:                  1989275648
        ADVM diskgroup         DATA
        ADVM resize increment: 33554432
        ADVM redundancy:       unprotected
        ADVM stripe columns:   4
        ADVM stripe width:     131072
    number of snapshots:  1
    snapshot space usage: 285282304
    replication status: DISABLED

This completes the environment set up stage.


Stage # 3 – Creation of a test database on the file system

Create the CDB using DBCA, specifying the ACFS mount point as the storage location.
Below, we list the PDBs and the datafiles contained within the new CDB. Note that all the datafiles are stored under the file system exposed by ACFS.

SQL> select name from v$pdbs;

NAME
------------------------------
PDB$SEED
TOOLS

SQL> select con_id, FILE_NAME from cdb_data_files order by con_id;

    CON_ID FILE_NAME
---------- --------------------------------------------------------------------------------
         1 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/system01.dbf
         1 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/sysaux01.dbf
         1 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/undotbs01.dbf
         1 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/users01.dbf
         2 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/pdbseed/system01.dbf
         2 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/pdbseed/sysaux01.dbf
         3 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLS/SAMPLE_SCHEMA_users01.dbf
         3 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLS/sysaux01.dbf
         3 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLS/example01.dbf
         3 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLS/system01.dbf

10 rows selected.


Stage # 4 – PDB duplication using the storage snapshot feature.

# Ensure that the source database - TOOLS - is in read-only mode before the clone.
This is required when creating a PDB from an existing PDB with the CREATE PLUGGABLE DATABASE ... FROM statement.


SQL> alter pluggable database tools close;

Pluggable database altered.

SQL> alter pluggable database tools open read only;

Pluggable database altered.

# Create the snapshot-based clone (note the SNAPSHOT COPY syntax below):
The CREATE PLUGGABLE DATABASE DDL is shown below. The command requests that a copy of TOOLS be created in the same CDB, under the new name TOOLSCOPY. The SNAPSHOT COPY clause specifies that the duplication should be done using storage snapshots. The storage snapshot function is provided by the underlying ACFS drivers.

create pluggable database TOOLSCOPY from TOOLS
 file_name_convert=('/u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLS',
                    '/u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLSCOPY') SNAPSHOT COPY
/


SQL> @create_snap.sql;
Pluggable database created.


# Open the new PDB in read-write mode at least once to integrate it with the CDB.

SQL> alter pluggable database toolscopy open read write;


Check the status of the cloned database to make sure it's integrated properly into the CDB.

SQL> select pdb_name,status from cdb_pdbs;

PDB_NAME             STATUS
-------------------- -------------
TOOLS                NORMAL
PDB$SEED             NORMAL
TOOLSCOPY            NORMAL


# Connect to the newly cloned database and create a table to test it.

To test the newly cloned database, we connect to it as the HR user, which it inherited from the source. We then create a copy of the HR.EMPLOYEES table to make sure that we are able to modify the database.


[oracle@linux12c ~]$ sqlplus hr/hr@localhost:1521/TOOLSCOPY

SQL*Plus: Release 12.1.0.1.0 Production on Mon Mar 10 15:15:28 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Last Successful login time: Mon Mar 10 2014 15:15:18 +08:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options


SQL> create table test as select * from employees;

Table created.

SQL> select sys_context('USERENV','CON_NAME') from dual;

SYS_CONTEXT('USERENV','CON_NAME')
--------------------------------------------------------------------------------
TOOLSCOPY

SQL> exit

# View the datafiles within the CDB after duplication

Below, we query the database datafiles; the CON_ID 4 files (under the TOOLSCOPY directory) are the ones backed by the storage snapshot.

SQL> select con_id, file_name from cdb_data_files order by 1;

CON_ID FILE_NAME
------ ------------------------------------------------------------------------------------------
     1 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/system01.dbf
     1 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/sysaux01.dbf
     1 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/undotbs01.dbf
     1 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/users01.dbf
     2 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/pdbseed/system01.dbf
     2 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/pdbseed/sysaux01.dbf
     3 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLS/SAMPLE_SCHEMA_users01.dbf
     3 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLS/example01.dbf
     3 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLS/sysaux01.dbf
     3 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLS/system01.dbf
     4 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLSCOPY/system01.dbf
     4 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLSCOPY/example01.dbf
     4 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLSCOPY/SAMPLE_SCHEMA_users01.dbf
     4 /u01/app/grid/acfsmounts/data_vol1/oradata/CDB2/TOOLSCOPY/sysaux01.dbf

14 rows selected.

 

# View ACFS snapshot information

From the OS, issue the acfsutil snap info command to display the snapshots. The command expects the mount point as its argument.
Below we can see the new storage snapshot created as a result of the CREATE PLUGGABLE DATABASE command.

When the TOOLSCOPY database is updated, the read-write snapshot is automatically maintained by ACFS. A possible use of this feature is the rapid creation of an independent copy of a source database.


[oracle@linux12c ~]$ /sbin/acfsutil snap info /u01/app/grid/acfsmounts/data_vol1
snapshot name:               F43B485145091279E0430100007F293D
RO snapshot or RW snapshot:  RW
parent name:                 /u01/app/grid/acfsmounts/data_vol1
snapshot creation time:      Mon Mar 10 13:58:37 2014

snapshot name:               10mar2014
RO snapshot or RW snapshot:  RO
parent name:                 /u01/app/grid/acfsmounts/data_vol1
snapshot creation time:      Mon Mar 10 13:20:26 2014

    number of snapshots:  2
    snapshot space usage: 979169280




This concludes the exercise.