Diagnosing IBM HTTP Server problem symptoms in environments with third-party modules

1.0 Introduction

This documentation applies to catastrophic symptoms such as crashes, hangs, and high CPU conditions which are not reproducible without a third-party module loaded.

Customers frequently open PMRs with IBM HTTP Server support to investigate problem symptoms encountered in configurations which include third-party modules. These problem symptoms include crashes, hangs, high CPU, incorrect output, and other issues. In most cases, the problem symptom is not caused by IBM HTTP Server and is not something that IBM can diagnose.

If you need IBM to investigate an IBM HTTP Server problem which is encountered in an environment with a third-party module, you need to apply our latest applicable fixes and recreate the problem before a PMR can be investigated. In addition, you need to utilize applicable IBM HTTP Server serviceability features to collect the necessary documentation. There are several reasons for this requirement:

Even when all available serviceability features are utilized, it may still be impossible for IBM HTTP Server support to diagnose the cause of a problem with third-party modules loaded. All possible attempts should be made to reproduce the problem with only IBM code loaded into the web server.

Other limitations of third party modules in IBM HTTP Server

2.0 Required minimum maintenance levels

| IBM HTTP Server release | Required minimum maintenance level |
|---|---|
| 1.3.26 | Fix pack 1.3.26.2 plus e-fix PK05084 or later |
| 1.3.28 | Fix pack 1.3.28.1 plus e-fix PK05084 or later |
| 2.0.42 | Full-install e-fix PQ85834 plus e-fix PK07831 or later |
| 2.0.47 | Fix pack 2.0.47.1 plus e-fix PK07831 or later. If the customer is currently at 2.0.47-PQ85834, the e-fix can be applied over it. If the customer is on an earlier level of 2.0.47, they need to apply fix pack 2.0.47.1 before the e-fix. |
| 6.0 | 6.0.2 or later |
| 6.1 or later | GA |

Current maintenance for the different IBM HTTP Server releases may be found at the Recommended Updates for IBM HTTP Server page.

If the problem can be reproduced in an environment without third-party modules in the configuration, the above minimum software levels do not apply.

3.0 Prepare for hangs, crashes, and high CPU problems ahead of time

Getting the best documentation for problem diagnosis requires setup in advance of encountering the problem. You may choose to be proactive and be ready for multiple types of problem symptoms, or you may wish to set up only for a type of problem symptom currently experienced.

Preparation steps, by platform:

| Preparation step | Type of problem to which it applies | Windows | AIX | Linux | Solaris | HP-UX | z/OS |
|---|---|---|---|---|---|---|---|
| Enable mod_whatkilledus for IBM HTTP Server 1.3 or IBM HTTP Server 2.0/6.0/6.1 and later. | crashes | | X | X | X | X | X |
| Enable mod_status. | hangs, crashes, high CPU | X | X | X | X | X | X |
| Set ExtendedStatus to On. | crashes, hangs, high CPU, unexpected request failures | X | X | X | X | X | X |
| Configure IBM HTTP Server to log the name of the plug-in module which failed, or successfully handled, the request. | unexpected request failures | X | X | X | X | X | X |
| Configure Windows to save a USER.DMP file when a crash occurs. | crashes | X | | | | | |
| Configure IBM HTTP Server and the operating system to obtain core dumps when a crash occurs. | crashes | | X | X | X | X | |
| Configure IBM HTTP Server and the operating system to obtain a system dump or CEEDUMP when a crash occurs. | crashes | | | | | | X |
| Enable mod_backtrace for IBM HTTP Server 1.3 or IBM HTTP Server 2.0/6.0/6.1 and later. | crashes | | | X | | | X |
| 2.0 and above: Enable mod_mpmstats. | hangs | X | X | X | X | X | X |
| Be able to run the crash, hang, and high CPU tools for Unix and Linux platforms. | crashes, hangs, high CPU | | X | X | X | X | |
| Be able to use the crash and hang or high CPU instructions for Windows. | crashes, hangs, high CPU | X | | | | | |

3.1 Log the module which handled the request

This feature is available with

Occasionally, third-party modules fail requests without reporting the reason for the failure to the log files, which makes the problem difficult to diagnose. Newer levels of IBM HTTP Server can track the name of the module which failed, or successfully handled, the request. This information can be reported to the access log.

  1. Enable mod_status (including the ExtendedStatus step).
  2. Add %{RH}e to the format string used for the access log. Here is an example LogFormat directive:
    LogFormat "%h %l %u %t \"%r\" %>s %b %D %{RH}e" common
    

A new field will be written to the end of every access log record.
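
Before relying on the new field, it can help to confirm that every LogFormat directive in the configuration actually contains %{RH}e. The following is a minimal Python sketch of such a check, not an IBM-supplied tool. The function name logformats_missing_rh and the sample configuration text are illustrative assumptions, and the sketch ignores directives continued across lines with a trailing backslash.

```python
# Sample configuration text (hypothetical): one LogFormat without %{RH}e,
# one with it, and one commented-out directive that should be ignored.
SAMPLE_CONF = r'''
LogFormat "%h %l %u %t \"%r\" %>s %b" plain
LogFormat "%h %l %u %t \"%r\" %>s %b %D %{RH}e" common
# LogFormat "%h %l %u" disabled
'''


def logformats_missing_rh(conf_text):
    """Return the LogFormat directives that do not contain %{RH}e."""
    missing = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Directive names are case-insensitive; commented lines start
        # with "#" and therefore never match.
        if line.lower().startswith("logformat") and "%{RH}e" not in line:
            missing.append(line)
    return missing


if __name__ == "__main__":
    for directive in logformats_missing_rh(SAMPLE_CONF):
        print("missing %{RH}e:", directive)
```

In practice the text would be read from the active configuration file rather than from a sample string.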

4.0 Perform analysis when the problem occurs

4.1 Web server crashes

4.1.1 IBM HTTP Server 1.3, 2.0, 6.0, and 6.1 and later on Unix and Linux platforms

  1. Find the mod_whatkilledus report in the IBM HTTP Server error log. Here is an example:
    [Tue May  3 22:03:41 2005] pid 15534 mod_whatkilledus sig 11 crash
    [Tue May  3 22:03:41 2005] pid 15534 mod_whatkilledus active module: mod_silly.c
    [Tue May  3 22:03:41 2005] pid 15534 mod_whatkilledus active connection: 127.0.0.1:42776->127.0.0.1:8080 (conn_rec 20068090)
    [Tue May  3 22:03:41 2005] pid 15534 mod_whatkilledus active request (request_rec 200690a0):
    GET /silly/?sigsegv
    [Tue May  3 22:03:41 2005] pid 15534 mod_whatkilledus end of report
    

    If the active module is identified and it is a third-party module, the vendor of that module needs to investigate further. For example, if the report says "active module: mod_sm.c", further investigation needs to be done by Computer Associates. Other third-party modules will show up with different module names. No further documentation needs to be sent to IBM when the active module is from a non-IBM source.

    Crashes caused by third-party modules often occur on threads created by the module. IBM HTTP Server is not aware of these threads, and mod_whatkilledus cannot identify the active module in these cases.

  2. If no active module is identified, or the active module is from IBM, use the crash documentation tool to gather information about the crash, and send the documentation to IBM HTTP Server support for analysis.
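
Finding the report in step 1 can be scripted when the error log is large. The following is a minimal Python sketch, not an IBM-supplied tool, which assumes that report lines follow the layout of the example above. The function name crash_reports and the SAMPLE text (copied from that example) are illustrative.

```python
import re

# Patterns based on the example mod_whatkilledus report; other releases
# may format these lines differently (assumption).
CRASH = re.compile(r"pid (?P<pid>\d+) mod_whatkilledus sig (?P<sig>\d+) crash")
ACTIVE_MODULE = re.compile(
    r"pid (?P<pid>\d+) mod_whatkilledus active module: (?P<module>\S+)"
)

SAMPLE = """\
[Tue May  3 22:03:41 2005] pid 15534 mod_whatkilledus sig 11 crash
[Tue May  3 22:03:41 2005] pid 15534 mod_whatkilledus active module: mod_silly.c
[Tue May  3 22:03:41 2005] pid 15534 mod_whatkilledus active request (request_rec 200690a0):
GET /silly/?sigsegv
[Tue May  3 22:03:41 2005] pid 15534 mod_whatkilledus end of report
"""


def crash_reports(lines):
    """Map each crashed pid to its signal number and active module.

    The module is None when the report does not name an active module,
    for example when the crash occurred on a thread created by a module.
    """
    reports = {}
    for line in lines:
        m = CRASH.search(line)
        if m:
            reports[int(m["pid"])] = {"signal": int(m["sig"]), "module": None}
            continue
        m = ACTIVE_MODULE.search(line)
        if m and int(m["pid"]) in reports:
            reports[int(m["pid"])]["module"] = m["module"]
    return reports


if __name__ == "__main__":
    for pid, info in crash_reports(SAMPLE.splitlines()).items():
        print(pid, info["signal"], info["module"])
```

A pid whose module comes back as None, or as an IBM module, corresponds to the case in step 2.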

4.1.2 IBM HTTP Server 1.3, 2.0, 6.0, 6.1, and 7.0 on Windows

There is no automatic mechanism to determine the component which caused the crash. Follow the crash instructions for Windows.

4.2 Web server hangs or other unresponsive behavior

  1. Try a request for the server status report with active module display enabled (http://www.example.com/server-status/?showmodule) and view the module names listed in the extended status portion of the report.
    1. If a third-party module is listed as the active module for requests which are hanging, report the problem to the vendor and follow their recommended diagnostic steps.
    2. If mod_was_ap20_http.c is listed as the active module for requests which are hanging, the most likely cause is that the application running in WebSphere is not responding. Follow applicable WebSphere MustGather steps.
    3. Otherwise, use the hang documentation tool for Unix, or the hang instructions for Windows, to gather information about the state of the server.
  2. If you are unable to view the server status page, check the IBM HTTP Server error log for reports from mod_mpmstats. There should be reports in the following format, appearing at intervals.
    [Fri Mar 18 13:16:25 2005] [notice] mpmstats: rdy 50 bsy 100 rd 0 wr 100 ka 0 log 0 dns 0 cls 0
    [Fri Mar 18 13:16:25 2005] [notice] mpmstats: bsy: 100 in mod_XXX.c
    
    1. If a third-party module is listed as the module where most or all IBM HTTP Server worker threads are busy at the time of the hang, report the problem to the vendor and follow their recommended diagnostic steps.
    2. If mod_was_ap20_http.c is listed as the active module for requests which are hanging, the most likely cause is that the application running in WebSphere is not responding. Follow applicable WebSphere MustGather steps.
    3. Otherwise, proceed to the next step.
  3. If you are unable to view the server status page or mod_mpmstats reports, or that information does not indicate that any specific module is where requests are hung, use the hang documentation tool for Unix, or the hang instructions for Windows, to gather information about the state of the server.
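
The mod_mpmstats check in step 2 can be sketched in Python as follows. This is a minimal illustration, not an IBM-supplied tool. It assumes the "bsy: N in module" line layout shown in the example above, and that all lines of one report share the same timestamp. The function name busiest_module is hypothetical.

```python
import re

# Pattern based on the example "bsy: 100 in mod_XXX.c" line (assumption
# that other levels of mod_mpmstats use the same layout).
BUSY_IN = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[notice\] mpmstats: bsy: (?P<count>\d+) in (?P<module>\S+)"
)

SAMPLE = """\
[Fri Mar 18 13:16:25 2005] [notice] mpmstats: rdy 50 bsy 100 rd 0 wr 100 ka 0 log 0 dns 0 cls 0
[Fri Mar 18 13:16:25 2005] [notice] mpmstats: bsy: 100 in mod_XXX.c
"""


def busiest_module(lines):
    """Return (module, busy_count) from the most recent report, or None.

    Lines are grouped by timestamp; a new timestamp starts a new report,
    so only the last report in the log is considered.
    """
    latest_ts = None
    counts = {}
    for line in lines:
        m = BUSY_IN.match(line.strip())
        if not m:
            continue
        if m["ts"] != latest_ts:
            latest_ts = m["ts"]
            counts = {}
        counts[m["module"]] = int(m["count"])
    if not counts:
        return None
    module = max(counts, key=counts.get)
    return module, counts[module]


if __name__ == "__main__":
    print(busiest_module(SAMPLE.splitlines()))
```

The module name returned is the one to compare against the cases in step 2: a third-party module, mod_was_ap20_http.c, or something else.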

4.3 Web server high CPU conditions

4.3.1 Unix and Linux platforms

Follow the high CPU instructions for Unix and Linux.

4.3.2 Windows platforms

Follow the hang or high CPU instructions for Windows.

4.4 Unexpected request failures

This requires the %{RH}e feature to be enabled (described above in section 3.1).

Here are some example access log records showing failures:

127.0.0.1 - - [11/Aug/2005:19:38:47 -0400] "GET /cgi-bin/foo" 404 326 774 (mod_cgid.c/404/handler)
127.0.0.1 - - [11/Aug/2005:19:39:04 -0400] "GET /cgi-bin/foo" 500 653 15515 (mod_cgid.c/500/handler)

The name of the module which failed the request in both cases is mod_cgid. The name of a third-party module would appear when the third-party module failed the request.
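
When many requests fail, the module names can be tallied from the access log. The following minimal Python sketch is not an IBM-supplied tool. It assumes the %{RH}e field is the last field of each record and has the form (module/status/phase) seen in the examples above. Treating the third component as a processing phase is an assumption, and the function name failures_by_module is illustrative.

```python
import re
from collections import Counter

# Trailing "(module/status/phase)" field, as in the example records.
RH_FIELD = re.compile(
    r"\((?P<module>[^/()\s]+)/(?P<status>\d{3})/(?P<phase>[^/()\s]+)\)\s*$"
)

SAMPLE = '''\
127.0.0.1 - - [11/Aug/2005:19:38:47 -0400] "GET /cgi-bin/foo" 404 326 774 (mod_cgid.c/404/handler)
127.0.0.1 - - [11/Aug/2005:19:39:04 -0400] "GET /cgi-bin/foo" 500 653 15515 (mod_cgid.c/500/handler)
'''


def failures_by_module(lines):
    """Count records with a 4xx or 5xx status, keyed by module name."""
    tally = Counter()
    for line in lines:
        m = RH_FIELD.search(line)
        if m and int(m["status"]) >= 400:
            tally[m["module"]] += 1
    return tally


if __name__ == "__main__":
    for module, count in failures_by_module(SAMPLE.splitlines()).most_common():
        print(module, count)
```

A third-party module at the top of the tally points to the vendor of that module for further investigation.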