Saturday, August 25, 2012

Best Practices for Creating an Oracle Database

Best Practice: Make the SYSTEM tablespace locally managed.
Reasoning: Doing this enforces that all tablespaces created in this database are locally managed.

Best Practice: Use the REUSE clause with caution. Normally, you should use it only when you're re-creating a database.
Reasoning: The REUSE clause instructs Oracle to overwrite existing files, regardless of whether they're in use. This is dangerous.

Best Practice: Create a default temporary tablespace with TEMP somewhere in the name.
Reasoning: Every user should be assigned a temporary tablespace of type TEMP, including the SYS user. If you don't specify a default temporary tablespace, the SYSTEM tablespace is used, and you never want a user to be assigned SYSTEM as its temporary tablespace. If your database doesn't have a default temporary tablespace, use the ALTER DATABASE DEFAULT TEMPORARY TABLESPACE statement to assign one.

Best Practice: Create a default permanent tablespace named USERS.
Reasoning: This ensures that users are assigned a default permanent tablespace other than SYSTEM. If your database doesn't have a default permanent tablespace, use the ALTER DATABASE DEFAULT TABLESPACE statement to assign one.

Best Practice: Use the USER SYS and USER SYSTEM clauses to specify nondefault passwords.
Reasoning: Doing this creates the database with nondefault passwords for the database accounts that are usually the first targets for hackers.

Best Practice: Create at least three redo log groups with two members each.
Reasoning: At least three redo log groups gives the archiver process time to write out archived redo logs between log switches. Two members mirror each online redo log, providing some fault tolerance.

Best Practice: Name the redo logs something like redoNA.rdo.
Reasoning: This deviates slightly from the OFA standard, but avoiding the .log extension makes it less likely that a redo log is accidentally deleted as if it were an ordinary log file.
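
Pulling these practices together, the following is a rough sketch of a CREATE DATABASE statement; the database name, passwords, file paths, and sizes are placeholders only, not recommendations:

  CREATE DATABASE orcl
    -- Nondefault passwords for the accounts hackers try first (placeholders; change them)
    USER SYS    IDENTIFIED BY ChangeMe_Sys01
    USER SYSTEM IDENTIFIED BY ChangeMe_System01
    -- Three redo log groups, two members each, named with the .rdo extension
    LOGFILE GROUP 1 ('/u05/oradata/orcl/redo1a.rdo', '/u06/oradata/orcl/redo1b.rdo') SIZE 100M,
            GROUP 2 ('/u07/oradata/orcl/redo2a.rdo', '/u08/oradata/orcl/redo2b.rdo') SIZE 100M,
            GROUP 3 ('/u05/oradata/orcl/redo3a.rdo', '/u06/oradata/orcl/redo3b.rdo') SIZE 100M
    -- Locally managed SYSTEM tablespace; note that REUSE is deliberately not used
    EXTENT MANAGEMENT LOCAL
    DATAFILE        '/u02/oradata/orcl/system01.dbf' SIZE 500M AUTOEXTEND ON
    SYSAUX DATAFILE '/u02/oradata/orcl/sysaux01.dbf' SIZE 500M AUTOEXTEND ON
    -- Default permanent tablespace named USERS, default temporary tablespace named TEMP
    DEFAULT TABLESPACE users
      DATAFILE '/u02/oradata/orcl/users01.dbf' SIZE 100M AUTOEXTEND ON
    DEFAULT TEMPORARY TABLESPACE temp
      TEMPFILE '/u03/oradata/orcl/temp01.dbf' SIZE 500M
    UNDO TABLESPACE undotbs1
      DATAFILE '/u03/oradata/orcl/undotbs01.dbf' SIZE 500M;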

cached I/O vs direct I/O


cached I/O:
  1. Application issues a read request
  2. Kernel looks for the requested data in the file buffer cache
  3. Requested data is not present in the file buffer cache
  4. Kernel reads the data from disk
  5. Read data is cached in the file buffer cache
  6. Read data is copied from the file buffer cache to the application buffer

direct I/O:
  1. Application issues a read request
  2. Kernel initiates a disk request
  3. Requested data is transferred from disk directly to the application buffer
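
On platforms and file systems that support it, Oracle can be asked to bypass the file buffer cache and use direct (and asynchronous) I/O for file system files. A minimal sketch, assuming a 10g/11g instance started from an spfile; whether SETALL, DIRECTIO, or ASYNCH actually takes effect depends on the platform and file system:

  -- Show the current settings
  SELECT name, value
  FROM   v$parameter
  WHERE  name IN ('filesystemio_options', 'disk_asynch_io');

  -- Request both direct I/O and asynchronous I/O where supported; takes effect after a restart
  ALTER SYSTEM SET filesystemio_options = 'SETALL' SCOPE = SPFILE;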

AIO Tuning Recommendations

Parameters for tuning asynchronous I/O
  • minservers: The minimum number of aioservers started for asynchronous disk I/O. The default value is 1
  • maxservers: The maximum number of aioservers started for asynchronous disk I/O. The default value is 10. Since each aioserver uses memory, this number should not be much larger than the
    expected number of simultaneous asynchronous disk I/O requests
  • maxreqs: The maximum number of asynchronous disk I/O requests that can be held in the queue. The default value is 4096
Tuning Recommendations
  • minservers: 2, or the number of CPUs - 1, whichever is larger
  • maxservers: two times the number of datafiles
  • maxreqs: 12288
  • On large systems the number of aioserver processes can become very large, so the maximum number of processes per user has to be raised accordingly
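
As a quick helper for the "two times the number of datafiles" rule above, a candidate maxservers value can be derived from the data dictionary (a sketch only; the final value still needs to respect available memory):

  -- Candidate maxservers: 2 x number of datafiles
  SELECT 2 * COUNT(*) AS suggested_maxservers
  FROM   v$datafile;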

File system layout to minimize disk drive contention

This example configuration contains eight data areas, including disk drives, striped sets,
RAID sets, and placeholders for other new technologies to be developed in the future.
Separate the eight data areas as completely as possible. Ideally, place them on different
device controllers or channels to maximize throughput. The more disk drive heads are
moving at one time, the faster the database performs. To minimize disk drive contention, lay out
the file system disk drives as follows:
  • AREA 1 – Oracle executables and a control file
  • AREA 2 – Data datafiles, index datafiles, system datafiles, tool datafiles, user datafiles,
    and a control file
  • AREA 3 – Data datafiles, index datafiles, temporary datafiles, undo datafiles, and a
    control file
  • AREA 4 – Archive log files, export files, backup staging area, and a control file
  • AREA 5 – Redo log files
  • AREA 6 – Redo log files
  • AREA 7 – Redo log files
  • AREA 8 – Redo log files
Online redo log files exist on four separate disk drives. Online redo log files are
multiplexed, and Oracle creates these log files in a circular fashion:
redo log 1 => redo log 2 => redo log 3 => redo log 4 => redo log 1
As a result, the I/O is evenly distributed. Therefore, when Oracle switches log file
groups, writing to the new redo log files does not impact reading the old redo log file
to create a new archive log file.
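
To compare an existing database against a layout like this, the data dictionary already records where every file lives; a quick inventory using standard v$ views (no assumptions beyond a running instance):

  -- Current physical placement of control files, redo members, datafiles, and tempfiles
  SELECT 'control file' AS file_type, name AS file_name FROM v$controlfile
  UNION ALL
  SELECT 'redo member', member FROM v$logfile
  UNION ALL
  SELECT 'datafile', name FROM v$datafile
  UNION ALL
  SELECT 'tempfile', name FROM v$tempfile
  ORDER BY 1, 2;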


Friday, August 24, 2012

How PCTFREE and PCTUSED Work Together

PCTFREE and PCTUSED work together to optimize the use of space in the data blocks of the extents within a data segment. PCTFREE reserves a percentage of each block for future updates to existing rows; once a block fills to that threshold it is taken off the free list, and PCTUSED determines how far the used space must drop (through deletes or updates) before the block goes back on the free list and accepts inserts again.
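
As an illustration (the table name and percentages below are made up, not recommendations), both settings are specified per segment at creation time; note that PCTUSED is ignored for segments in tablespaces using automatic segment space management (ASSM):

  -- Inserts stop using a block once it is 80% full (PCTFREE 20), leaving 20% free for row growth;
  -- the block only becomes available for inserts again when used space drops below 40% (PCTUSED 40)
  CREATE TABLE orders_demo (
    order_id  NUMBER,
    status    VARCHAR2(30)
  )
  PCTFREE 20
  PCTUSED 40;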



Strategies and techniques to resolve 'log file sync' waits

A commit is not complete until LGWR writes the log buffers, including the commit redo record, to the
log files. In a nutshell, after posting LGWR to write, user or background processes wait for LGWR to
signal back, with a 1-second timeout. The user process charges this wait time as the 'log file sync' event.

Root causes of 'log file sync' essentially boil down to a few scenarios:
  • Disk I/O performance to the log files is not good enough. Even though LGWR can use
    asynchronous I/O, redo log files are opened with the DSYNC flag, and buffers must be
    flushed to disk (or at least written to the disk array cache, in the case of a SAN) before
    LGWR can mark the commit as complete.
  • LGWR is starved of CPU resources. If the server is very busy, then LGWR can starve
    for CPU too. This leads to slower response from LGWR, increasing 'log file sync'
    waits. After all, these system calls and I/O calls must use CPU. In this case, 'log file
    sync' is a secondary symptom, and resolving the root cause of the high CPU usage will reduce
    'log file sync' waits.
  • Due to memory starvation, LGWR can be paged out. This can lead to slower
    response from LGWR too.
  • LGWR is unable to complete writes fast enough due to file system or UNIX buffer
    cache limitations.
  • LGWR is unable to post the processes fast enough due to excessive commits. It is quite
    possible that there is no starvation for CPU or memory and that I/O performance is decent enough. Still, if
    there are excessive commits, then LGWR has to perform many write and semctl calls, and this can
    increase 'log file sync' waits. It can also result in a sharp increase in the 'redo wastage' statistic.
  • With private strands, a process can generate a few megabytes of redo
    before committing. LGWR must write all the redo generated so far, and processes must wait on 'log file sync',
    even if the redo generated by other processes is small.
  • LGWR is suffering from other database contention, such as enqueue waits or latch contention.
    For example, we have seen LGWR freeze due to CF enqueue contention.
Finding and understanding the root cause is essential to resolving a performance issue.
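
A common first check (a sketch; thresholds vary by system) is to compare 'log file sync', which foreground sessions wait on, with 'log file parallel write', which is LGWR's own I/O: if both average waits are high, redo I/O is the likely culprit, while a high 'log file sync' with a low 'log file parallel write' points at commit rate, CPU starvation, or post/wait overhead instead:

  -- Average waits (in milliseconds) since instance startup
  SELECT event,
         total_waits,
         ROUND(time_waited_micro / 1000 / NULLIF(total_waits, 0), 2) AS avg_wait_ms
  FROM   v$system_event
  WHERE  event IN ('log file sync', 'log file parallel write');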

  • If I/O bandwidth is an issue, then doing anything other than improving I/O bandwidth is not
    useful. Switching to file systems that provide better write throughput is one option. Raw devices are
    another option. Reducing the number of log file members in a group is another option, as it reduces the number
    of write calls. But this option comes with a cost.
  • If CPU starvation is an issue, then reducing CPU starvation is the correct step to resolve it.
    Increasing the priority of LGWR is a workaround.
  • If the commit rate is high, then decreasing commits is the correct step (see the sketch after this list);
    in the few cases where that is not possible, increasing the priority of LGWR (using nice) or raising the
    priority class of LGWR to RT might provide some relief.
  • Solid state disk devices can also be used if the redo size is extreme. In that case, it is also
    preferable to decrease the redo size.
  • If excessive redo size is the root cause, the redo size can be reduced using various techniques.
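
Where the application can tolerate it, 10gR2 and later also offer asynchronous commits, which stop the foreground process from waiting on LGWR at all. A sketch of the trade-off (the table and loop are invented for illustration; COMMIT WRITE BATCH NOWAIT means a crash can silently lose the most recently "committed" rows):

  -- Classic pattern that drives 'log file sync': one synchronous commit per row
  BEGIN
    FOR i IN 1 .. 10000 LOOP
      INSERT INTO audit_demo (id, msg) VALUES (i, 'row ' || i);
      COMMIT;                       -- foreground waits for LGWR on every iteration
    END LOOP;
  END;
  /

  -- Mitigation: commit per batch, and/or commit asynchronously when durability of the
  -- last few transactions is not critical
  BEGIN
    FOR i IN 1 .. 10000 LOOP
      INSERT INTO audit_demo (id, msg) VALUES (i, 'row ' || i);
      IF MOD(i, 1000) = 0 THEN
        COMMIT WRITE BATCH NOWAIT;  -- no 'log file sync' wait for the foreground
      END IF;
    END LOOP;
    COMMIT;                         -- final durable commit
  END;
  /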

What RAID levels are most appropriate for which Oracle file types

Use RAID 0 arrays only for high traffic data that does not need any redundancy protection for device failures. RAID 0 is the least used RAID format but provides for high speed I/O without the additional redundant disk drives for protection. RAID 1 offers the best performance while providing data protection by mirroring each physical disk drive. Create RAID 1 arrays with the most disk drives possible (30 maximum) to achieve the highest performance.
File Type               RAID Level
Redo logs               RAID 10
Control files           RAID 10
Temp datafiles          RAID 10, RAID 5
Archive logs            RAID 10, RAID 5
Undo/Rollback           RAID 10, RAID 5
Datafiles               RAID 10, RAID 5
Oracle executables      RAID 5
Export files            RAID 10, RAID 5
Backup staging          RAID 10, RAID 5

Oracle Initialization Parameters Best Practices

Best Practice: Oracle recommends that you use a binary server parameter file (spfile).
Reasoning: Use whichever type of initialization parameter file you're comfortable with. If you have a requirement to use an spfile, then by all means implement one.

Best Practice: Don't set initialization parameters if you're not sure of their intended purpose. When in doubt, use the default.
Reasoning: Setting initialization parameters can have far-reaching consequences for database performance. Only modify parameters if you know what the resulting behavior will be.

Best Practice: For 11g, set the memory_target and memory_max_target initialization parameters.
Reasoning: Doing this allows Oracle to manage all memory components for you.

Best Practice: For 10g, set the sga_target and sga_max_size initialization parameters.
Reasoning: Doing this lets Oracle manage most memory components for you.

Best Practice: For 10g, set the pga_aggregate_target and workarea_size_policy initialization parameters.
Reasoning: Doing this allows Oracle to manage the memory used for sort space.

Best Practice: Starting with 10g, use the automatic UNDO feature. This is set using the undo_management and undo_tablespace parameters.
Reasoning: Doing this allows Oracle to manage most features of the UNDO tablespace.

Best Practice: Set open_cursors to a higher value than the default; typically set it to 500. Active online transaction processing (OLTP) databases may need a much higher value.
Reasoning: The default value of 50 is almost never enough. Even a small one-user application can exceed the default of 50 open cursors.

Best Practice: Use at least two control files, preferably in different locations on different disks.
Reasoning: If one control file becomes corrupt, it's always a good idea to have at least one other control file available.
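
A hedged sketch of how several of these recommendations might be applied to an 11g instance running from an spfile; the sizes, undo tablespace name, and file paths are placeholders only:

  -- Memory: let Oracle manage all memory components (11g automatic memory management)
  ALTER SYSTEM SET memory_max_target = 2G SCOPE = SPFILE;
  ALTER SYSTEM SET memory_target     = 2G SCOPE = SPFILE;

  -- Cursors: raise open_cursors well above the default of 50
  ALTER SYSTEM SET open_cursors = 500 SCOPE = BOTH;

  -- Undo: automatic undo management
  ALTER SYSTEM SET undo_management = AUTO     SCOPE = SPFILE;
  ALTER SYSTEM SET undo_tablespace = UNDOTBS1 SCOPE = BOTH;

  -- Control files: at least two copies, preferably on different disks (placeholder paths)
  ALTER SYSTEM SET control_files = '/u01/oradata/orcl/control01.ctl',
                                   '/u04/oradata/orcl/control02.ctl' SCOPE = SPFILE;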