Troubleshooting Guide

Using the DB2 Trace Facility (db2trc)

The DB2 trace facility (db2trc command) lets you trace events, dump the trace data to a file, and format the data into a readable form. You may be requested by DB2 Customer Support to take a trace if the db2diag.log is not enough to diagnose a problem.

DB2 trace information is stored either in memory or on disk. When it is being collected, the information is recorded in chronological order.

The amount of information gathered by a trace grows rapidly. Thus, you want to capture only the error situation and avoid any other activities (for example, starting the database manager instance with db2start or connecting to a database with db2 CONNECT). When taking a trace, reproduce the smallest scenario that can be re-created and capture it for further analysis.

The process of performing a trace has a global effect on the behavior of a DB2 instance. The degree of performance degradation is dependent on the type of problem and on how many resources are being used to gather the trace information.

Depending on your platform, there are various ways to perform a trace. For example, on UNIX-based systems, you must use the db2trc command.

Before you Begin

Before attempting to use the DB2 trace facility, consider the following:

Using Trace Facility (db2trc) Commands

The DB2 trace facility tracing is activated by the db2trc on command. This command has various parameters you can use to control the amount and type of data collected.

The syntax for the db2trc command is shown below. Explanations of the parameters follow.

>>- db2trc--+--------------------------------------------------------+->
            |      .-----------------------------------------------. |
            |      V                                               | |
            +- on-----+-----------------------------------------+--+-+
            |         |      .-.------------------------------. |    |
            |         |      V                                | |    |
            |         +- -m----+- *------------------------+--+-+    |
            |         |        |  .-,--------------------. |    |    |
            |         |        |  V                      | |    |    |
            |         |        '------ num--+-------+----+-'    |    |
            |         |                     '- -num-'           |    |
            |         +- -e-- max_sys_errors--------------------+    |
            |         +- -r-- max_record_size-------------------+    |
            |         +- -f-- filename--------------------------+    |
            |         +-+- -l--+---------------+-+--------------+    |
            |         | |      '- buffer_size--' |              |    |
            |         | '- -i--+---------------+-'              |    |
            |         |        '- buffer_size--'                |    |
            |         '- -t-------------------------------------'    |
            +- off---------------------------------------------------+
            +- dump-- filename---------------------------------------+
            +- flw-- dump_file-- output_file-------------------------+
            '- fmt-- dump_file-- output_file-------------------------'
 
>--------------------------------------------------------------><
 

Parameters

You must issue the db2trc command several times to turn tracing on, produce a dump file, format the dump file, and turn tracing off. The following list represents the order in which you use the parameters:

on
Use this parameter to start the DB2 trace facility. See Starting a DB2 Trace, for information about the options for this parameter.

dump
If you are tracing to memory, use this parameter to dump the trace information out to a file once you reproduce the error. The following command puts the information in the current directory in a file called db2trc.dmp:
db2trc dump db2trc.dmp

You must specify a file name with this parameter. The file is saved in the current directory unless you explicitly specify a path.

off
After you dump the trace to a file, turn the trace off by typing:
db2trc off

flw | fmt
After you dump the trace to a binary file, confirm that the trace was successful by formatting it into an ASCII file. Use either the flw option (to sort by process or thread), or the fmt option (to list every event chronologically). For either option, you must specify the name of the dump file and should specify the name of an output file that will be generated. For example:
db2trc flw db2trc.dmp db2trc.flw
db2trc fmt db2trc.dmp db2trc.fmt

These options are provided for you to verify the trace. However, these files may also need to be sent to DB2 Customer Service and Support at their request.

Starting a DB2 Trace

To start the trace facility, type db2trc on. The default trace option values are:

-m *.*.*.*
Trace everything

-e -1
Collect all errors

-r 16000
Maximum record size is 16 KB

-s
Trace destination to shared memory (instead of to a file)

-l 2000000
Last trace records are retained, with a buffer size of 2 MB (the default buffer size is 512 KB instead for UNIX-based systems other than AIX)

Occasionally, you may be instructed to specify options to tailor the trace. Use the following options only as directed by DB2 Customer Support:

-m mask
Specifies trace record types to focus the search. The mask variable consists of four one-byte masks separated by periods. These masks correspond to products, event types, components, and functions, respectively, and act as a filter to accept or reject the trace record sent by DB2 for each event based on its ID.

If DB2 Customer Support requires mask values different from the default values, you will be instructed on which values to use.

-e max_sys_errors
Limits the number of DB2 internal system errors that the trace will hold to max_sys_errors. The default value is -1 (collect all errors).

-r max_record_size
Limits the size of trace records to max_record_size bytes. Longer trace records are truncated.

-s | -f filename
By default, trace output is stored in shared memory (the -s option). If the workstation is being suspended preventing you from accessing this output, you can trace to a file with the -f option. When using this output, you must specify a file where the trace output will be stored. See Example of Tracing to a File for an example.

-l [ buffer_size] | -i [buffer_size]
The option -l (lowercase "L") specifies that the last trace records are retained (the first records are overwritten when the buffer is full). The option -i specifies that the initial trace records are retained (no more records are written to the buffer once it is full).

Use either of these options to specify the buffer size.

-t
Includes timestamps. Applicable to UNIX-based environments only, where the logging of timestamps severely affects performance.

Verifying a DB2 Trace

Trace information may or may not be helpful in diagnosing an error. For example, it may not capture the error condition in the following situations:

To verify that a trace file can be read, format the binary trace file to show the flow control and send the formatted output to a null device. The following example shows the command to perform this task:

db2trc flw example.trc nul

The output for this command will explicitly tell you if there is a problem reading the file, and whether or not the trace was wrapped.

Examples of Using db2trc

The following examples provide several scenarios to show you how to use db2trc.

Example of Tracing to Memory

Tracing to memory is the preferred method of tracing. The size of the trace buffer is specified as 4 MB in this example.

  1. Shut down all other applications.
  2. Turn the trace on, using the following command:
    db2trc on -l 4000000 -e -1
    
  3. Reproduce the problem scenario. There should be as little DB2 activity as possible, to minimize event entries.
  4. Dump the trace to a file, using the following command:
    db2trc dump db2trc.dmp
    

    (Dump the trace immediately after the problem occurs. Otherwise an SQL error that occurs after the error is captured may cause important information to be lost.)

  5. Turn the trace off, using the following command:
    db2trc off
    

Example of Tracing to a File

Tracing to a file is useful when the problem being re-created suspends the workstation, preventing you from dumping the trace to a file. When you trace to a file, each trace entry is written to disk when it happens. The information in the file captures those events leading up to (but not including) the suspension.

In this example, the size of the file is specified as 4 MB and the name of the file is db2trc.dmp:

  1. Shut down all other applications.
  2. Turn the trace on, using the following command:
    db2trc on -l 4000000 -e -1 -f db2trc.dmp
    
  3. Reproduce the problem scenario.

    The db2trc.dmp file will be present when you reboot your workstation. It will contain the events leading up to the event that caused the system to crash.

Example of Using db2diag.log and the DB2 Trace

Problem: The SQL1042C error message is received when trying to connect to a database from a client. Access to the database is not possible.

Approach: To find the solution to the problem, first find the location of the error. Because the error occurs on the client, start by looking for the error on the client. To do this, look at the db2diag.log on the client machine. For more information, see Interpreting the db2diag.log.

Assume that the db2diag.log on the client machine contains no entries. Look at the db2diag.log on the DB2 server machine to see if an error has occurred on the server.

Assume that the db2diag.log on the server contains the following entries:
1997-03-16-08:54:37.001160    Instance:payroll   Node:000
PID:74467(db2syscs (SAMPLE))     Appid:*LOCAL.payroll.970317140834
buffer_pool_services  sqlbStartPools    Probe:0   Database:SAMPLE
Starting the database. (1)
1997-03-16-08:54:38.001160    Instance:payroll   Node:000
PID:74467(db2syscs (SAMPLE))     Appid:*LOCAL.payroll.970317140834
data_protection       sqlpresr    Probe:0   Database:SAMPLE
Crash Recovery has been initiated. (2)
1997-03-16-08:54:38.001160    Instance:payroll   Node:000
PID:74467(db2syscs (SAMPLE))     Appid:*LOCAL.payroll.970317140834
data_protection       sqlpresr    Probe:0   Database:SAMPLE
Low transaction lsn: 0000 005d c00c            
1997-03-16-08:54:38.001160    Instance:payroll   Node:000
PID:74467(db2syscs (SAMPLE))     Appid:*LOCAL.payroll.970317140834
data_protection       sqlpresr    Probe:0   Database:SAMPLE
Minimum buffer lsn: 0000 005d c00c                                         
1997-03-16-08:54:38.001160    Instance:payroll   Node:000
PID:74467(db2syscs (SAMPLE))     Appid:*LOCAL.payroll.970317140834
data_management (5)     sqldmund    Probe:375   Database:SAMPLE
Error during undo. (3) 0ae6 ffff 0ae6 ffff 0000 005e efa2 6363  (4)

Interpretation:

(1)
An initial request to connect to database SAMPLE was received. This indicates that the client request could reach the database server and communications are working.

(2)
The database is in a state where restart is required. That is, the database was not shut down cleanly the last time it was used.

(3)
An error was encountered when a transaction in the database had not been committed and was being rolled back ("undone") to make the database consistent.

(4)
The first dumped error code is ffff e60a (remember to byte-reverse any error codes that are in the form nnnn ffff). Converting this code to decimal does not yield a valid SQL code. Looking up return code E60A in Appendix A, DB2 Internal Return Codes, we see that a file does not exist.

(5)
Since the subcomponent that encountered the error is data management, we can predict that a data object is missing.

Approach: Unfortunately, the db2diag.log does not contain information to help us determine which file is missing. We must proceed by taking a trace of the error condition and scanning for the E60A error code in the trace file.

Assume that the trace file looks like the following:
3478    DB2 non-fatal_err oper_system_services sqloopenp (1.4.15.140)
        pid 55; tid 38; cpid 112; time 365535; trace_point 6
        433a 5c44 4232 5c53 514c 3030 3030 315c   /DB2/SQL00001/
        5351 4c54 3030 3032 2e30 5c53 514c 3030 SQLT0002.0/SQL00
        3031 302e 4441 54                       010.DAT (2)
3479    DB2 cei_data      oper_system_services sqloopenp (1.25.15.140)
        pid 55; tid 38; cpid 112; time 365535; trace_point 7
        ffff ffff                               
3480    DB2 cei_errcode   oper_system_services sqloopenp (1.6.15.140)
        pid 55; tid 38; cpid 112; time 365535; trace_point 254
        return_code = 0xffffe60a (1) = -6646 = SQLO_FNEX

Interpretation:

(1)
Because trace file entries are ordered chronologically, we search backwards from the bottom of the trace file. We find the return code E60A in trace entry 3480.

(2)
Scanning backwards from trace entry 3480, we see that the file being opened was SQL00010.DAT in the table space SQLT0002.0

Solution: Restore from a backup known to be good, and roll forward to the end of the logs to ensure that no data is lost.


[ Top of Page | Previous Page | Next Page ]