Troubleshoot Wait Event 'log file sync'
 Case Study Summary

This example describes troubleshooting ‘log file sync’ waits on a database. The troubleshooting steps were based on Oracle MetaLink note 1376916.1.

The steps were as follows:

 Step 1 : 'log file sync' was shown as one of AWR top 5 events

The AWR top 5 events showed ‘log file sync’ as the #1 event on the database.
Also, for the time frame we investigated, there was a noticeable increase in the waits for ‘log file sync’.

 Step 2 : Compare the average wait time for 'log file sync' with the average wait time for 'log file parallel write'

LGWR waits on the event 'log file parallel write' while the actual write to the redo log files is occurring. Its average wait time shows how much of the ‘log file sync’ operation is spent on I/O and, by inference, how much processing time is spent on the CPU.

In this example, the average wait time for ‘log file parallel write’ was small compared with the average wait time for ‘log file sync’. This indicates that I/O was not the bottleneck for the LGWR process.
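The comparison in Step 2 can be sketched in a few lines of Python. This is only an illustration, not part of the original analysis: the function names and the sample figures are hypothetical, and the inputs are assumed to be the wait count and total wait time in microseconds for each event over the same interval (for example, deltas of TOTAL_WAITS and TIME_WAITED_MICRO from V$SYSTEM_EVENT).

```python
def avg_wait_ms(total_waits, time_waited_micro):
    """Return the average wait in milliseconds (0.0 if there were no waits)."""
    if total_waits == 0:
        return 0.0
    return time_waited_micro / total_waits / 1000.0


def io_share_of_sync(sync_avg_ms, write_avg_ms):
    """Return the fraction of the average 'log file sync' wait that is
    accounted for by the average 'log file parallel write' wait."""
    if sync_avg_ms == 0:
        return 0.0
    return write_avg_ms / sync_avg_ms


# Hypothetical figures for one interval, not taken from the case study.
sync_avg = avg_wait_ms(total_waits=200_000, time_waited_micro=2_400_000_000)
write_avg = avg_wait_ms(total_waits=180_000, time_waited_micro=270_000_000)
share = io_share_of_sync(sync_avg, write_avg)

print(f"log file sync avg:           {sync_avg:.1f} ms")
print(f"log file parallel write avg: {write_avg:.1f} ms")
print(f"I/O share of log file sync:  {share:.1%}")
```

A small share, as in this made-up sample, points away from redo I/O. A share close to 100% would suggest that the redo write itself accounts for most of the wait.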

 Step 3 : Check whether there was an issue with redo log size

A 'log file sync' operation is performed every time the redo logs switch to the next log, to ensure that everything is written before the next log is started. If the redo logs are too small, a burst of redo generation will lead to excessive log switches and, therefore, ‘log file sync’ waits.

In this example, the change in redo generation did not match the change in 'log file sync' waits, so redo log size was unlikely to be the cause.

 Step 4 : Check whether there was an issue with application commits

Excessive commits in the application can also be the root cause of ‘log file sync’ waits.

As shown in the instance load profile, the increase in transaction rate matched the increase in ‘log file sync’ waits. This indicated that the ‘log file sync’ waits were caused by the commit logic in the application.
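Steps 3 and 4 both compare the change in a load profile metric with the change in ‘log file sync’ waits. A minimal Python sketch of that comparison follows. The function names, the 20-point tolerance, and all sample numbers are assumptions made for illustration. They are not values from this case study.

```python
def pct_change(baseline, current):
    """Return the percentage change from baseline to current."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline * 100.0


def tracks(metric_change_pct, wait_change_pct, tolerance_pct=20.0):
    """Return True if the two changes are within tolerance_pct percentage
    points of each other. The tolerance is an arbitrary assumption."""
    return abs(metric_change_pct - wait_change_pct) <= tolerance_pct


# Hypothetical baseline vs. problem-period figures.
redo_change = pct_change(baseline=1_000_000, current=1_050_000)  # redo bytes/sec
txn_change = pct_change(baseline=100, current=180)               # transactions/sec
wait_change = pct_change(baseline=100_000, current=185_000)      # 'log file sync' waits

print("redo generation tracks waits:", tracks(redo_change, wait_change))
print("transaction rate tracks waits:", tracks(txn_change, wait_change))
```

With these made-up figures, redo generation barely moves while the transaction rate rises roughly in step with the waits, which is the pattern described in Steps 3 and 4.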

 Step 5 : Identify the most significant wait event for LGWR

At this point, we wanted to double-check LGWR to confirm its performance. First, we needed to identify the session ID for LGWR. This can be done by checking the blocking session for the ‘log file sync’ waits.

Remember that LGWR is a background process, so we need to check ‘Top Background Session’ in the ASH data.

 Step 6 : Check the Data Guard related wait events

This database uses Data Guard, so a considerable amount of time was spent on Data Guard related wait events such as ‘LGWR-LNS wait on channel’.

The ‘LGWR-LNS wait on channel’ event covers the time LGWR or LNS spends waiting to receive messages, and it can be considered an idle wait event. Its average wait time changed very little, which means LGWR’s performance was stable in that respect.

Copyright © 2011 Actrace. All Rights Reserved.