Important! The best way to get this information from a core dump is by using the ServerDoc tool, described here. Unless there is a problem running the automated tool, that should be used instead of these manual steps.
Platform | Preferred tool | Alternate tool |
AIX | dbx | (none) |
Linux | gdb | (none) |
Solaris | pstack | adb |
HP-UX | gdb | (none) |
gdb instructions
Example:
# gdb /opt/IBMHTTPD/bin/httpd /tmp/core.13587
(gdb) where
(gdb) thread apply all bt
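When the backtrace needs to be saved to a file rather than read interactively, the same two commands (where and thread apply all bt) can be run without a gdb prompt. This is a minimal sketch, assuming a gdb version that supports the -batch and -ex options; the function name collect_gdb_bt and the GDB override variable are hypothetical names chosen for illustration.

```shell
# collect_gdb_bt EXECUTABLE COREFILE
# Runs "where" and "thread apply all bt" in gdb batch mode and writes
# everything gdb prints (stdout and stderr) to standard output.
# GDB is a hypothetical override for the path to the gdb binary.
collect_gdb_bt() {
    exe=$1
    core=$2
    "${GDB:-gdb}" -batch -ex 'where' -ex 'thread apply all bt' "$exe" "$core" 2>&1
}
```

For the example above, this could be called as collect_gdb_bt /opt/IBMHTTPD/bin/httpd /tmp/core.13587 > /tmp/backtrace.txt (the output path is an arbitrary choice).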
gdb for various platforms
dbx instructions
# dbx /usr/HTTPServer/bin/httpd /usr/HTTPServer/core
Type 'help' for help.
warning: The core file is truncated. You may need to increase
the ulimit for file and coredump, or free some space on the filesystem.
reading symbolic information ...
[using memory image in /usr/HTTPServer/core]
Segmentation fault in sig_coredump at 0x10003f3c ($t1)
0x10003f3c (sig_coredump+0x3c) 80410014 lwz r2,0x14(r1)
(dbx) where
sig_coredump() at 0x10003f3c
wait_or_timeout() at 0x10004564
standalone_main() at 0x10001a00
main() at 0x1000134c
Be aware of a limitation when analyzing illegal instruction (SIGILL) coredumps on AIX: no backtrace is possible. Here is a typical encounter with this limitation:
# dbx /opt/HTTPServer/bin/httpd /IBPP/logs/core
Type 'help' for help.
reading symbolic information ...warning: no source compiled with -g
[using memory image in /IBPP/logs/core]
Illegal instruction (illegal opcode) in sig_coredump at 0x10003ee0 ($t1)
0x10003ee0 (sig_coredump+0x3c) 80410014 lwz r2,0x14(r1)
(dbx) where
sig_coredump() at 0x10003ee0
warning: Unable to access address 0x0 from core
warning: could not locate trace table from starting address 0x0
(dbx)
In the event of an illegal instruction (SIGILL) coredump on AIX, it is best to send the actual core dump file, along with the dbx output, to IBM, where some information can be extracted. This is very problematic, and the best we can expect is to get some hints about which module might have crashed.
dbx is available when the bos.adt.debug fileset is installed.
pstack
if target system doesn't
have this installed)
gdb
adb instructions
# adb /opt/IBMHTTPD/bin/httpd /opt/IBMHTTPD/core
core file = /opt/IBMHTTPD/core -- program ``httpd.emerson'' on --> -- platform SUNW,Ultra-250
SIGBUS: Bus Error
$c
send_silly(18e938,fe4f0e81,ffffffff,7efefeff,79,79) + f4
ap_invoke_handler(18e938,fde20148,0,0,6,6) + 174
process_request_internal(18e938,1,40,ffbef99c,4,1) + 61c
ap_process_request(18e938,4,18e938,ffbefa24,ffbefa34,2) + 30
child_main(2,2ce00,ff37f6a8,ff37e000,0,0) + 720
make_child(9e7a8,2,3ddea716,ffffffc0,10,fde7a0f4) + 158
startup_children(3,9e7a8,93d1c,9e7a8,80790,7a58c) + 88
standalone_main(1,ffbefc94,93d1c,ff23a000,ff23cfec,807d8) + 1dc
main(1,ffbefc94,ffbefc9c,93800,0,0) + 574
^D
#
Note that the command to get the backtrace is $c (a dollar sign followed by c). The command to get out is the EOF character, usually ^D (control-D).
pstack instructions
Run the pstack command against the coredump.
pstack is part of the base operating system, so it does not have to be installed separately. This is the recommended way to get backtraces on Solaris, especially when Sun's dbx tool is not available, since pstack can display function arguments for programs built without symbolic information (like official product builds) whereas gdb can't. Also, there have been circumstances where gdb didn't display the complete backtrace for a segfaulting thread but pstack did.
Note that pstack doesn't know how many arguments there are, so it always displays six. If you know that a function has only two arguments, ignore whatever pstack displays after the second argument.
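When the real argument count of a function is known, the surplus values can be trimmed from pstack output mechanically. This is a minimal sketch, assuming frame lines of the form used in this document (address, function name, parenthesized argument list, offset); the function name trim_pstack_args is hypothetical.

```shell
# trim_pstack_args FUNCTION COUNT
# Reads pstack output on stdin. For frames whose function name equals
# FUNCTION, keeps only the first COUNT displayed arguments; all other
# lines pass through unchanged.
trim_pstack_args() {
    awk -v name="$1" -v keep="$2" '
    {
        lp = index($0, "(")        # position of the opening parenthesis
        rp = index($0, ")")        # position of the closing parenthesis
        if ($2 == name && lp > 0 && rp > lp) {
            n = split(substr($0, lp + 1, rp - lp - 1), args, /, */)
            out = ""
            for (i = 1; i <= n && i <= keep; i++) {
                sep = (i > 1) ? ", " : ""
                out = out sep args[i]
            }
            $0 = substr($0, 1, lp) out substr($0, rp)
        }
        print
    }'
}
```

For instance, pstack core.httpd.1008 | trim_pstack_args main 2 would leave only the first two values on frames named main, on the assumption that main takes two arguments in the program being examined.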
Example:
# pstack core.httpd.1008
core 'core.httpd.1008' of 1008: /opt/IBMHTTPD/bin/httpd
----------------- lwp# 1 / thread# 1 --------------------
 0002e3e8 ???????? (ffbeee7c, 1425, d, a16f0, 82b68, 9b098)
 00031188 main (1, ffbeef94, 96408, ff238018, ff23b03c, 82cf0) + 478
 00031bec parse_byterange (1, ffbeef94, ffbeef9c, 96000, 0, 0) + 484
 00017308 load_module (0, 0, 0, 0, 0, 0) + 140
----------------- lwp# 2 / thread# 2 --------------------
 ff21ad54 _signotifywait (ff16e000, 0, 0, ff23b540, 0, 0) + 8
 ff151ae4 thr_yield (0, 0, 0, 0, 0, 0) + 8c
----------------- lwp# 3 / thread# 3 --------------------
 ff21b3e0 _lwp_sema_wait (fe30de30, ff16e000, 0, fe30dd78, 250c4, 0) + c
 ff14944c _swtch (fe30dd78, fe30dd78, ff16e000, 5, 1000, 1) + 424
 ff14d8a4 _reap_wait (ff172a08, 20a38, 0, ff16e000, 0, 0) + 38
 ff14d5fc _reaper (ff16ee30, ff255d18, ff172a08, ff16ee08, 0, fe400000) + 38
 ff15ba1c _thread_start (0, 0, 0, 0, 0, 0) + 40
#
The pstack command automatically displays the backtrace for each thread.
The pflags command displays information about the various threads in a coredump or live process. Here is some output showing how it labels the thread that did something bad:
$ pflags core
core 'core' of 20897: /export/home/trawick/ph/2.0.42/built/bin/httpd -k start
        data model = _ILP32
  /1:   flags = PR_PCINVAL
    sigmask = 0xffffbefc,0x00001fff  cursig = SIGSEGV
  /2:   flags = PR_STOPPED|PR_ASLWP
    why = PR_SUSPENDED
    sigmask = 0xffbffeff,0x00001fff
  /5:   flags = PR_STOPPED
    why = PR_SUSPENDED
  /4:   flags = PR_STOPPED
    why = PR_SUSPENDED
  /6:   flags = PR_STOPPED
    why = PR_SUSPENDED
  /7:   flags = PR_STOPPED
    why = PR_SUSPENDED
(rest of output omitted)
Note that thread 1 has cursig = SIGSEGV next to it. That is the thread that Solaris thinks did the dirty deed. This is often correct. (Note: For other types of problems it may say SIGILL, SIGABRT, or something else.)
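For cores with many threads, picking out the cursig entry by eye is tedious, so the search can be scripted. This is a rough sketch, assuming pflags output laid out as in the sample above (each thread introduced by a /N: label, with cursig = SIGNAL appearing somewhere in that thread's lines); faulting_lwp is a hypothetical helper name.

```shell
# faulting_lwp
# Reads pflags output on stdin and prints one line for every thread
# entry that carries a "cursig = SIGNAL" field.
faulting_lwp() {
    awk '
    # Remember the most recent "/N:" label; strip the slash and colon.
    $1 ~ /^\/[0-9]+:$/ { lwp = substr($1, 2, length($1) - 2) }
    {
        # Look for the three-field sequence: cursig = SIGNAL
        for (i = 1; i + 2 <= NF; i++)
            if ($i == "cursig" && $(i + 1) == "=")
                print "lwp " lwp " received " $(i + 2)
    }'
}
```

It would be used as pflags core | faulting_lwp. The backtrace for the reported thread can then be looked up under the matching lwp# heading in the pstack output.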