Tuesday, April 27, 2010

Solaris Tuning

A few of the things I learned while load testing on a Solaris box were quite interesting and informative. Though they may be old news to some, it was refreshing to look at an application's performance limits from a different angle, rather than from code-related issues (inefficient queries, haphazardly used synchronized blocks, memory hogs, etc.).
I'll list the ones I used on the Solaris box to get additional information. The first was pretty obvious: when the load test ran, the WebLogic server started coughing up IOException (too many open files).
Running the ulimit command showed that the open files setting on the box was too low for a high load.
> ulimit -a
...
file size (blocks)      unlimited
open files                256
stack size (kbytes)    8192
...
This number should be adjusted based on the load (# of users) and other processes running on the system.
> ulimit -n 1024
This number applies only to the current session. To set it permanently, set rlim_fd_max (the default hard limit) to that number in /etc/system and reboot the system for the change to take effect.
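As a rough sketch of the session-level change, the helper below raises the soft limit but never past the hard limit. The name raise_nofile is made up for illustration, and the code assumes a shell whose ulimit accepts the -S and -H flags (bash and ksh do).

```shell
# Hypothetical helper (not a system command): raise the soft open-files
# limit for this shell session, capped at the hard limit.
raise_nofile() {
    rn_wanted=$1
    rn_hard=$(ulimit -Hn)
    # A non-root user cannot push the soft limit above the hard limit,
    # so fall back to the hard limit when the request is too large.
    if [ "$rn_hard" != "unlimited" ] && [ "$rn_wanted" -gt "$rn_hard" ]; then
        rn_wanted=$rn_hard
    fi
    ulimit -Sn "$rn_wanted"
}

# Example: raise_nofile 1024 && ulimit -Sn
```

Because the limit is per session, this has to run in the same shell that later starts the server.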
To count the file descriptors currently in use by a specific process (PID is the process ID), use:
> ls /proc/PID/fd | wc -l
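To keep an eye on this during a load test, the count can be wrapped in a small check. This is only a sketch: fd_count and fd_check are hypothetical names, and the limit is passed in by hand because the current shell's ulimit is not necessarily the limit of the process being inspected.

```shell
# Hypothetical helper: print how many descriptors a process has open.
# Each entry under /proc/PID/fd is one open descriptor.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical helper: report usage against a limit supplied by the caller.
# Returns non-zero once the process has reached the limit.
fd_check() {
    fc_used=$(fd_count "$1")
    echo "pid $1: $fc_used of $2 descriptors in use"
    [ "$fc_used" -lt "$2" ]
}

# Example: fd_check 12345 1024 || echo "out of file descriptors"
```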

Another helpful command is prstat (or top, if available), which displays the processes on the server that are using the most CPU.
> prstat -a (displays the CPU-intensive processes along with a per-user summary)
> prstat -n 3 -c (limits output to the top 3 processes and prints each new report below the previous one instead of overwriting it)

Next comes sar (System Activity Reporter). This command lets you view system activity, and the most interesting option for me was
> sar -u (which displays the CPU utilization activity)

Time   %usr (user)   %sys (system)   %wio (waiting for I/O)   %idle (inactive)
To sample CPU utilization once a minute for the next 10 minutes, use
> sar -u 60 10
If idle time is consistently 0 or very low, the CPU is saturated. If the wait time (%wio) is consistently high, processes are spending their time waiting on I/O, which can point to stuck or blocked threads.
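Reading those two columns by eye gets tedious over a long run, so here is a minimal sketch that flags the suspicious samples. The function name flag_cpu_pressure and the default thresholds (10 for %idle, 20 for %wio) are assumptions for illustration, and it expects the five-column layout shown above.

```shell
# Hypothetical filter: read "sar -u" output on stdin and flag samples
# where %idle drops below $1 or %wio rises above $2.
flag_cpu_pressure() {
    awk -v min_idle="${1:-10}" -v max_wio="${2:-20}" '
        # Data rows have a HH:MM:SS timestamp followed by four numbers;
        # the header row and the Average row fail this test and are skipped.
        NF == 5 && $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $2 ~ /^[0-9]+(\.[0-9]+)?$/ {
            if ($5 + 0 < min_idle) print $1, "low idle:", $5
            if ($4 + 0 > max_wio)  print $1, "high wio:", $4
        }'
}

# Example: sar -u 60 10 | flag_cpu_pressure 10 20
```

A single flagged sample means little; it is the run of consecutive flagged samples that is worth investigating.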

vmstat is another command; it reports virtual memory, disk, paging and CPU information.
It shows the number of runnable, blocked and swapped processes, and you should typically not see a high number in the 'blocked' column across consecutive samples.
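The "consecutive samples" rule is easy to automate. The sketch below is one way to do it, with assumed names and defaults: watch_blocked is a hypothetical function, the limit of 2 blocked processes and the run length of 3 samples are arbitrary, and the blocked column is located by looking for the "b" heading in the vmstat header row.

```shell
# Hypothetical filter: read vmstat output on stdin and report when the
# blocked ("b") column stays above $1 for at least $2 samples in a row.
watch_blocked() {
    awk -v limit="${1:-2}" -v runs="${2:-3}" '
        # Header row: remember which column is labelled "b".
        { for (i = 1; i <= NF; i++) if ($i == "b") { col = i; next } }
        # Data rows start with a number; count the current streak.
        col && $1 ~ /^[0-9]+$/ {
            if ($col + 0 > limit) streak++; else streak = 0
            if (streak >= runs)
                print "blocked queue above " limit " for " streak " consecutive samples"
        }'
}

# Example: vmstat 5 | watch_blocked 2 3
```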

iostat displays input/output statistics for each disk. If r/s and w/s (reads and writes per second) are consistently high along with %b (% of time the disk is busy with transactions), then the application needs to be tuned to use I/O more efficiently.
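To pick the busy disks out of a long iostat run, something like the following can help. It is a sketch under assumptions: busy_disks is a hypothetical name, the 60% threshold is arbitrary, and it expects extended output (iostat -x) whose header row contains a %b column, with the device name in the first column.

```shell
# Hypothetical filter: read "iostat -x" output on stdin and list devices
# whose %b value is above the threshold given in $1.
busy_disks() {
    awk -v limit="${1:-60}" '
        # Header row: remember which column is labelled %b.
        { for (i = 1; i <= NF; i++) if ($i == "%b") { col = i; next } }
        # Data rows: print the device name when it is busier than the limit.
        col && NF >= col && $col + 0 > limit { print $1, "busy:", $col "%" }'
}

# Example: iostat -x 60 10 | busy_disks 60
```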

To summarize, these are the commands that helped me gain additional insight into my app:
  • ulimit
  • prstat
  • sar
  • vmstat
  • iostat
