The documentation required to diagnose high CPU problems with web server processes includes
The ServerDoc tool provided with ihsdiag automates much of the work of gathering this information. The user runs ServerDoc and provides the IHS installation directory and other information; ServerDoc creates a new directory to hold the required documentation and stores the information there. This collector has also supported IBM Edge Caching Proxy (ibmproxy, WTE) since version 1.4.19.
Once the ServerDoc tool has completed, the user should copy any remaining log files and configuration files used by the web server and the plug-in into the new directory, and send in the directory to IBM support.
This tool uses native tools such as strace and truss to obtain system call traces, which include the contents of buffers used to read and write data from the network.
There are known issues in the ICC (IBM Crypto for C-Language) that is part of the GSKit bundled with earlier versions of IBM HTTP Server 8.0 and 8.5.5 that can result in high CPU, delays, or (frontend) timeouts. This is typically noticed by users migrating from IHS 7.0 on AIX or Linux/PPC.
The issues are resolved by GSKit 8.0.50.17 provided by PI09443, which is shipped in IHS 8.0.0.9 or 8.5.5.2. So, the problem no longer exists in those and higher versions, and the additional configuration described below is not required for those.
The recommended solution for earlier versions of IBM HTTP Server is either to upgrade to IHS fixpack 8.0.0.9 or 8.5.5.2 at a minimum, or to upgrade the GSKit to at least version 8.0.50.17 using the PI09443 interim fix.
NOTE: It is always recommended to use the latest available GSKit. To determine the latest recommended GSKit, refer to the 'Comments' section for your fixpack version in the Recommended fixes for IBM HTTP Server document.
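As a rough aid for checking an installed level against the 8.0.50.17 minimum named above, the following sketch compares two dotted version strings field by field. The function name version_at_least is hypothetical, and the sketch assumes you already know your GSKit version string; how to obtain it is not covered here.

```shell
# Return success (0) when version $1 is at least version $2.
# Fields are compared numerically, so 8.0.50.9 is lower than 8.0.50.17.
version_at_least() {
    awk -v have="$1" -v need="$2" 'BEGIN {
        n = split(have, h, ".")
        m = split(need, r, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = (i <= n) ? h[i] + 0 : 0   # missing fields count as 0
            b = (i <= m) ? r[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}

# Example: warn when a (hypothetical) installed level is below the minimum.
# version_at_least "8.0.50.13" "8.0.50.17" || echo "GSKit is older than 8.0.50.17"
```

A plain string comparison would rank 8.0.50.9 above 8.0.50.17, which is why the fields are compared as numbers.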
If you are experiencing the problem symptoms described above, but cannot apply IHS fixpacks 8.0.0.9, 8.5.5.2 or newer, nor published interim fix PI09443 or newer providing at least GSKit 8.0.50.17, then you can complete the steps below:
Add the following to $IHSROOT/bin/envvars and perform a full stop and start:
ICC_IGNORE_FIPS=YES
export ICC_IGNORE_FIPS
Add the following to $IHSROOT/bin/envvars and perform a full stop and start:
ICC_TRNG=ALT2
export ICC_TRNG
Note: 'ICC_TRNG=ALT' had been previously recommended for this setting, but 'ALT2' is now the recommended value to try. The only difference between the two is in the PRNG used, but the one used by 'ALT2' has been determined to resolve the problem in a wider range of circumstances, so is now the suggested value.
Customers who have previously encountered this problem and have set this to 'ALT' can leave it as-is if it resolved the problem for them, or change it to 'ALT2'.
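The envvars settings above can also be applied with a small helper. This is only a sketch: the function name set_envvar is hypothetical, and it assumes envvars is a plain shell fragment in which each variable is assigned at the start of a line. It refuses to touch a variable that is already assigned, so an existing 'ALT' value is left for you to review by hand.

```shell
# Append NAME=VALUE and "export NAME" to an envvars file.
# Returns 1 without changing the file when NAME is already assigned there.
set_envvar() {
    file=$1; name=$2; value=$3
    if grep -q "^[[:space:]]*${name}=" "$file"; then
        echo "$name is already set in $file; review it by hand" >&2
        return 1
    fi
    printf '%s=%s\nexport %s\n' "$name" "$value" "$name" >> "$file"
}

# Example (IHSROOT must point at your installation):
# set_envvar "$IHSROOT/bin/envvars" ICC_TRNG ALT2
```

The change only takes effect after a full stop and start of the web server.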
If FIPSEnable is required by your configuration, the above fixes will use a cryptographic library that is NOT FIPS certified. The protocols and ciphers will however still be restricted to the protocols and ciphers deemed acceptable under FIPS140-2, but the implementation has not (yet) been certified.
We currently have no timeframe for when a FIPS certified GSKit with the fixes will be available in an IHS fixpack, as it depends on external FIPS certification. When it is available, this document will be updated with the APAR number.
Disable TLS compression: ssl_questions.html#compress
If you're using this level of the WebSphere Plugin, and have HTTPS transports configured, upgrade to a level including PK85105 before proceeding.
GSKit v7 is inefficient on the Niagara architecture, but small RSA key sizes may help performance. This is a permanent limitation in IBM HTTP Server 7.0 and earlier. IHS 8.0 and later are more efficient on this platform, but PKCS11 off may still be necessary for suitable performance.
Hardware cryptographic offload, a part of the hardware platform, is documented here. Symmetric cryptographic offload should additionally be enabled with SSLAttributeSet 417 549 on IBM HTTP Server 7.0 and later if supported by the particular level of Niagara hardware/software.
A very rare (usually accidental) pattern used in regular expressions can use excessive CPU. These are most easily identified by searching the configuration file for a repetition character immediately following a closing parenthesis. If a regex matches this pattern, refactor it to not repeat the capture itself.
grep -F ')+' httpd.conf
grep -F ')*' httpd.conf
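The two searches above can be combined into a single pass that also reports line numbers. This is a sketch only; the function name find_repeated_groups is hypothetical, and a match is merely a candidate to review, not proof of a problem.

```shell
# Print "line-number:text" for every line in which a closing parenthesis
# is immediately followed by a repetition character (+ or *).
find_repeated_groups() {
    grep -nE '\)[+*]' "$1"
}

# Example:
# find_repeated_groups httpd.conf
```

Files pulled in through Include directives, and any .htaccess files, are not covered by a search of httpd.conf alone and need to be checked separately.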
Having many hundreds or thousands of mod_rewrite directives evaluated for common requests can be detrimental to performance, even more so if they are specified in .htaccess.
Limit evaluation by defining them in <VirtualHost> context, using the 'L' flag in early rules to avoid further processing for frequent requests, and making matches in the first parameter of RewriteRule "fail fast".
ThreadsPerChild has not been increased far beyond the default of 25.
%{HTTPS_CIPHER}e as described in the ihsdiag ciphers documentation.
Special concerns for threaded processes on older Linux distributions
Customers may be using single-threaded IHS 1.3 on older Linux distributions such as RedHat 7.x. If third-party modules are loaded which create threads in the IHS processes, this high CPU tool may not be able to analyze those processes due to limitations of the system tools gdb and strace. In addition to this high CPU tool, also run the hang documentation tool, as it may be able to retrieve additional information. Submit both the high CPU doc and hang doc to IHS support.
Operating systems provide special programs for displaying the CPU usage of processes running on the system. These programs include ps, top, topas, nmon, and others. When a high CPU problem is suspected, one of these programs should be used to determine which processes are actively consuming high CPU. If a web server process is the highest consumer of CPU, documentation about the suspected problem should be submitted to IBM HTTP Server support for analysis.
While ps is universally available, it averages CPU usage across the lifetime of the process. A very old process that begins to use high CPU will often not show high CPU in ps output. Using "top" or "topas" to show processes with high CPU over some short (1 second) interval is usually best.
If you cannot find a high CPU process via something like "top" or "topas", the example below demonstrates how to identify the high CPU process with ps.
ps invocations which show percent CPU
Platform | ps invocation
AIX, z/OS | ps -A -o pid,ppid,pcpu,time,args
Linux | ps -eo pid,ppid,pcpu,time,fname,cmd
Solaris | ps -A -o pid,ppid,pcpu,time,comm
HP-UX | UNIX95=1 ps -A -o pid,ppid,pcpu,time,args
Using ps to find high CPU processes
Substitute the appropriate ps command for your platform.
# ps -A -o pid,ppid,pcpu,time,args
  PID  PPID %CPU        TIME COMMAND
    0     0  0.0    00:07:56 swapper
    1     0  0.0    00:01:18 /etc/init
  516     0 96.4 89-10:46:02 wait
  774     0  0.0    00:00:09 reaper
 1032     0  0.0    00:00:15 xmgc
 1290     0  0.0    00:00:27 netm
 1548     0  0.0    00:24:25 gil
 1806     0  0.0    00:00:00 wlmsched
 2850     0  0.0    00:00:00 lvmbb
 3170  8258  0.0    00:00:00 /usr/sbin/sshd -D
 3410     1  0.9    19:02:01 /usr/sbin/syncd 60
 3650     1  0.0    00:00:00 /usr/dt/bin/dtlogin -daemon
 3964     1  0.0    00:00:00 /usr/lib/errdemon
 4148  3650  0.0    00:00:00 dtlogin <:0> -daemon
 4420  3650  0.0    00:07:31 /usr/lpp/X11/bin/X -D /usr/lib/X11//rgb -T -force :0 -auth /var/dt/A:0-svd
 5160     1  0.0    00:00:00 /usr/ccs/bin/shlap
 5422     1  0.0    00:00:00 /usr/sbin/srcmstr
 5962  5422  0.0    00:05:03 /opt/freeware/sbin/prngd -f /dev/egd-pool -m 666
 6210  5962         00:00:00 <defunct>
 6456  5962         00:00:00 <defunct>
 6714  5962         00:00:00 <defunct>
 7004     1  0.0    00:00:00 /usr/sbin/biod 8
 7226  5422  0.0    00:00:03 /usr/sbin/syslogd
 7482  5422  0.0    00:00:32 sendmail: accepting connections
 7740  5422  0.0    00:00:22 /usr/sbin/portmap
 7998  5422  0.0    00:00:00 /usr/sbin/inetd
 8258  5422  0.0    00:03:50 /usr/sbin/sshd -D
 8516     0  0.0    00:00:01 rtcmd
 8772     0  0.0    00:00:00 kbiod
 9064     1  0.0    00:00:00 /usr/sbin/uprintfd
 9300     1  0.0    00:00:00 /usr/sbin/rpc.statd
 9564     1  0.0    00:00:00 /usr/sbin/rpc.lockd
 9820     1  0.0    00:00:20 /usr/sbin/cron
10078  5422  0.0    00:00:01 /usr/sbin/qdaemon
10320     1  0.0    00:06:52 rpc.lockd
10578     0  0.0    00:00:00 cash
10836     1  0.0    00:00:26 aump
11094  5422  0.0    00:00:14 /usr/sbin/automountd
11384  5422  0.0    00:00:00 /usr/sbin/writesrv
12400     1  0.0    00:00:00 /usr/IMNSearch/httpdlite/httpdlite -r /etc/IMNSearch/httpdlite/httpdlite.c
13434  5422  0.0    00:00:08 /usr/sbin/rsct/bin/IBM.AuditRMd
13692     1  0.0    00:00:00 /usr/sbin/getty /dev/console
13946  5422  0.0    00:00:01 /bin/sh /etc/gsa_updmn
14450     1  0.0    00:00:00 /usr/sbin/getty /dev/lft0
14988  5422  0.0    00:00:20 /usr/sbin/rsct/bin/rmcd -r
15224  5422  0.0    00:22:29 /usr/opt/ifor/bin/i4llmd -b -n wcclwts -l /var/ifor/llmlg
15484 15224  0.0    00:00:00 /usr/opt/ifor/bin/i4llmd -b -n wcclwts -l /var/ifor/llmlg
15968 13946  0.0    00:00:00 sleep 14400
16278  7998  0.0    00:00:00 rpc.ttdbserver 100083 1
16888  5422  0.0    00:00:08 /usr/sbin/rsct/bin/IBM.ERrmd
17050  5422  0.0    00:00:08 /usr/sbin/rsct/bin/IBM.ServiceRMd
19404     1  0.0    00:00:00 frca
20298 33448  0.0    00:00:00 /usr/IBMIHS-1.3.28/bin/httpd -d /usr/IBMIHS-1.3.28
21772  4148  0.2    02:23:00 dtgreet
23438     1  0.0    00:00:00 /usr/lpp/diagnostics/bin/diagd
23920  5422  0.0    00:00:08 /usr/sbin/rsct/bin/IBM.CSMAgentRMd
24432  5422  0.0    00:00:05 /usr/sbin/rsct/bin/ctcasd
25988 45448  0.0    00:00:00 emacs ph/ihsdiag/ServerDoc/Platform.java
28506 25988  0.0    00:00:00 /opt/freeware/lib/emacs/20.7/powerpc-ibm-aix4.3.3.0/emacsserver
31872 33448  0.0    00:00:00 /usr/IBMIHS-1.3.28/bin/httpd -d /usr/IBMIHS-1.3.28
32442 33448  0.0    00:00:00 /usr/IBMIHS-1.3.28/bin/httpd -d /usr/IBMIHS-1.3.28
32662  3170  0.0    00:00:00 /usr/sbin/sshd -D
33448     1  0.0    00:00:00 /usr/IBMIHS-1.3.28/bin/httpd -d /usr/IBMIHS-1.3.28
35482 33448 40.0    00:09:12 /usr/IBMIHS-1.3.28/bin/httpd -d /usr/IBMIHS-1.3.28
37564 38136  0.0    00:00:00 -bash
37854 33448  0.0    00:00:00 /usr/IBMIHS-1.3.28/bin/httpd -d /usr/IBMIHS-1.3.28
38136 42606  0.0    00:00:00 /usr/sbin/sshd -D
39222 37564  0.0    00:00:00 ps -A -o pid,ppid,pcpu,time,args
42606  8258  0.0    00:00:00 /usr/sbin/sshd -D
45270 33448  0.0    00:00:00 /usr/IBMIHS-1.3.28/bin/httpd -d /usr/IBMIHS-1.3.28
45448 32662  0.0    00:00:00 -bash
Two processes, 516 (wait) and 35482, are the only ones with a high current CPU utilization (third column in the ps output). This example is from AIX, where wait is a special process which represents all idle CPU on the system, so the wait process can be ignored. This leaves process 35482, which is an IBM HTTP Server process.
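Scanning a long ps listing by eye is error prone, so the selection described above can be sketched as a filter. The function name filter_high_cpu and the default threshold of 10 percent are assumptions made for illustration; the filter expects the column order pid, ppid, pcpu, time, args used in this document.

```shell
# Read ps output (pid, ppid, pcpu, time, args) on stdin and print the rows
# whose %CPU is at least the threshold passed as $1 (default 10).
# The header, rows without a numeric %CPU (such as <defunct> entries), and
# the AIX idle pseudo-process "wait" are skipped.
filter_high_cpu() {
    awk -v min="${1:-10}" '
        $3 ~ /^[0-9]+(\.[0-9]+)?$/ && $3 + 0 >= min + 0 && $5 != "wait"
    '
}

# Example (AIX form of the ps command):
# ps -A -o pid,ppid,pcpu,time,args | filter_high_cpu 20
```

Because ps averages CPU over the lifetime of a process, a long-running process that has only recently become busy may fall below the threshold; lower the threshold or use top or topas in that case.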
To confirm that 35482 is actively consuming high amounts of CPU, examine the fourth column in the ps output over time.
# ps -A -o pid,ppid,pcpu,time,args | grep 35482
35482 33448 96.2 00:10:28 /usr/IBMIHS-1.3.28/bin/httpd -d /usr/IBMIHS-1.3.28
(wait 10 seconds or so)
# ps -A -o pid,ppid,pcpu,time,args | grep 35482
35482 33448 96.1 00:10:35 /usr/IBMIHS-1.3.28/bin/httpd -d /usr/IBMIHS-1.3.28
(wait 10 seconds or so)
[trawick@b80-2 trawick]$ ps -A -o pid,ppid,pcpu,time,args | grep 35482
35482 33448 96.6 00:10:44 /usr/IBMIHS-1.3.28/bin/httpd -d /usr/IBMIHS-1.3.28
The subsequent checks for process 35482 show that cumulative CPU time (the fourth column) continues to increase. At this point, the suspect process is clearly identified, and we should proceed to collect information about the process using the instructions below.
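The repeated check of cumulative CPU time can be sketched as two small functions. The names cpu_time_to_seconds and cpu_time_delta are hypothetical, and the sketch assumes a ps that accepts the POSIX options -o time= and -p, and that reports TIME as [[dd-]hh:]mm:ss.

```shell
# Convert a ps TIME value ([[dd-]hh:]mm:ss) to seconds.
cpu_time_to_seconds() {
    printf '%s\n' "$1" | awk '{
        days = 0; t = $1
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        n = split(t, p, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
        print days * 86400 + secs
    }'
}

# Sample the cumulative CPU time of a process twice and report the growth.
# Usage: cpu_time_delta PID [INTERVAL_SECONDS]
cpu_time_delta() {
    pid=$1; interval=${2:-10}
    first=$(cpu_time_to_seconds "$(ps -o time= -p "$pid" | tr -d ' ')")
    sleep "$interval"
    second=$(cpu_time_to_seconds "$(ps -o time= -p "$pid" | tr -d ' ')")
    echo "CPU seconds used in ${interval}s: $((second - first))"
}
```

In the session above, TIME moves from 00:10:28 to 00:10:35, which is 7 CPU seconds over roughly 10 seconds of wall time, consistent with a process that is busy most of the time.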
Please refer to these instructions for verifying that required support programs are installed.
Note: This executable mustgather is not used on Windows nor on z/OS.
procdump captures of the High CPU process.
Run the tool as root to avoid any permissions problems with obtaining backtraces or reading files, such as log files and configuration files. (More information about the requirement to run this tool as root is available here.)
ServerDoc takes four parameters when gathering high CPU documentation:
GatherHighCpuDoc
# java -jar ServerDoc.jar GatherHighCpuDoc /path/to/IHS 35482 127.0.0.1:80
Red Hat Enterprise Linux 3.0 and SuSE SLES 9 - Some performance tools on these platforms may identify specific thread "pids" for high CPU. The high CPU tool cannot currently accept thread pids on these levels of Linux. Be sure to find the true process id for the entire process with high CPU, and specify that true process id on the invocation of the high CPU tool. The sample
The tool creates a new directory which contains a timestamp in the name, and the high CPU documentation will be saved in that directory.
If the IHS installation only supports SSL, then use - for this parameter. Otherwise, specify an IP address and port which can be used to reach the server from the local machine without using SSL.
Use the following table to determine the value of the non-SSL address parameter based on the form of a non-SSL Listen directive used in your configuration:
Listen directive looks like this | use this for address parameter
(no non-SSL ports) | -
Listen 80 | 127.0.0.1:80
Listen port | 127.0.0.1:port
Listen 192.168.1.15:80 | 192.168.1.15:80
Listen ipaddress:port | ipaddress:port
Listen myhostname:80 | myhostname:80
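The mapping in the table can be expressed as a short function. This is a sketch only; the name listen_to_address is hypothetical, and it takes the argument of a single non-SSL Listen directive (or nothing, when there are no non-SSL ports).

```shell
# Map the argument of a non-SSL Listen directive to the address parameter.
listen_to_address() {
    case "$1" in
        "")  echo "-" ;;               # no non-SSL ports
        *:*) echo "$1" ;;              # ipaddress:port or hostname:port, used as-is
        *)   echo "127.0.0.1:$1" ;;    # bare port, reached over loopback
    esac
}

# Examples:
# listen_to_address 80                 prints 127.0.0.1:80
# listen_to_address 192.168.1.15:80    prints 192.168.1.15:80
```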
For this example, IHS is installed in /usr/IBMIHS-1.3.28, the high CPU process is 35482, the non-SSL port can be reached from the web server machine on address 127.0.0.1:8080, and ihsdiag was unpacked into directory /root/ihsdiag-1.3.4.
# java -jar /root/ihsdiag-1.3.4/ServerDoc.jar GatherHighCpuDoc /usr/IBMIHS-1.3.28 \
    35482 127.0.0.1:8080
Tracing process for 10 seconds...
Seconds remaining before gathering information again:
60...54...48...42...36...30...24...18...12...6...
Tracing process for 10 seconds...
Seconds remaining before gathering information again:
30...27...24...21...18...15...12...9...6...3...
Tracing process for 10 seconds...
Reports, log files, and configuration files have been saved to directory HighCpuDoc.200501201122
If you have additional log files or configuration files, copy them there before packing up the directory.
Web server log and conf files other than the default will have to be copied manually.
WebSphere plug-in conf and log files will have to be copied manually.
Hint for packing up the directory:
tar -cf HighCpuDoc.200501201122.tar HighCpuDoc.200501201122
gzip HighCpuDoc.200501201122.tar
[trawick@b80-2 ServerDoc]$ ls -l HighCpuDoc.200501201122/
total 11479
-rw-r--r-- 1 root system   32986 Jan 20 11:22 access_log
-rw-r--r-- 1 root system    7129 Jan 20 11:22 apachectl
-rw-r--r-- 1 root system    2251 Jan 20 11:22 error_log
-rw-r--r-- 1 root system  662246 Jan 20 11:22 httpd
-rw-r--r-- 1 root system   22835 Jan 20 11:22 httpd.conf
-rw-r--r-- 1 root system  197962 Jan 20 11:24 log
-rw-r--r-- 1 root system    1624 Jan 20 11:24 report
-rw-r--r-- 1 root system 1641481 Jan 20 11:22 trace.0
-rw-r--r-- 1 root system 1642385 Jan 20 11:24 trace.1
-rw-r--r-- 1 root system 1650769 Jan 20 11:24 trace.2
-rwxr-xr-x 1 root system    1442 Jan 20 11:22 traceprocess.sh
The next step is to copy any other web server or plug-in configuration files and logs into the new HighCpuDoc directory. Here is a list of files to copy if they are being used:
The last step is to pack up and compress the documentation directory using zip, tar followed by gzip, or pax followed by compress. The easiest way is to cut and paste the messages displayed by ServerDoc previously which showed the commands to use. The suggested commands will vary by platform. On z/OS, for example, pax and compress will be suggested instead of tar and gzip.
# tar -cf HighCpuDoc.200501201122.tar HighCpuDoc.200501201122
# gzip HighCpuDoc.200501201122.tar
The resulting compressed file is the file to send to IBM support.
root requirement
When gathering information on high CPU problems, the tool must attach to a live web server process to obtain information about the state of that process.
If the web server is started as root, then that process will be owned by root or by the web server user id (e.g., nobody or www). Only root has the authority to attach to any of the web server processes, so it is easiest if the tool itself is run as root. If the web server administrator does not have authority to log in or switch user to root, a simple script can be created to gather the high CPU documentation, and the system administrator can give the web server administrator sudo access to that script. sudo is a third-party tool available without cost for all appropriate platforms.
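One possible shape for such a script is sketched below. Everything specific in it is an assumption for illustration: the function name gather_high_cpu_doc, the default paths (taken from the example earlier in this document), and the DRY_RUN switch. The process id is validated because the script would run with root authority.

```shell
# Hypothetical wrapper around ServerDoc, intended to be run through sudo.
# IHS_ROOT and SERVERDOC_JAR are assumptions; adjust them to your installation.
IHS_ROOT=${IHS_ROOT:-/usr/IBMIHS-1.3.28}
SERVERDOC_JAR=${SERVERDOC_JAR:-/root/ihsdiag-1.3.4/ServerDoc.jar}

# Usage: gather_high_cpu_doc PID [ADDRESS]
# ADDRESS defaults to "-", the value used when only SSL ports are configured.
gather_high_cpu_doc() {
    pid=$1; addr=${2:--}
    case "$pid" in
        ""|*[!0-9]*)
            echo "usage: gather_high_cpu_doc PID [ADDRESS]" >&2
            return 2 ;;
    esac
    # With DRY_RUN set to a non-empty value, print the command instead of running it.
    ${DRY_RUN:+echo} java -jar "$SERVERDOC_JAR" GatherHighCpuDoc "$IHS_ROOT" "$pid" "$addr"
}

# When saved as a script, the last line would pass the arguments through:
# gather_high_cpu_doc "$@"
```

The system administrator would then grant sudo access to that one script only, rather than to java or to a shell.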
If the web server is not started as root, there are no such concerns, and the high CPU documentation tool may be run by the user id which starts the web server.
If the tool is run as non-root and it is unable to gather the required information, the problem will have to be recreated. It may not be possible to determine if this problem occurred until the documentation has been analyzed by IBM HTTP Server support.