Commit graph

45 commits

Author SHA1 Message Date
Peter Collingbourne
f3d542fe9f Create a debugger_process_info data structure with the process info pointers.
Similar to r.android.com/1247247 I'll be adding more of them for MTE.

Also, change the protocol between the crasher and crash_dump to make
it easier to add new fields and change the referenced data structures
without needing to worry about versioning. The version number for
static executables is now always 1 (where the protocol will never
change), while the version number for dynamic executables is always
4 (where the protocol can change, because the linker and crash_dump
are version locked).

Bug: 135772972
Change-Id: Ib4696d0544d7c87cb429aaaa15f18c3640059e16
2020-03-24 17:23:15 -07:00
Mitch Phillips
e0b4bb1b2e [GWP-ASan] Add GWP-ASan information to tombstones.
GWP-ASan can provide information about a crash that it caused. Grab the
GWP-ASan regions from the globals shared by the linker for crash-handler
purpopses, pull the information from GWP-ASan, and display it.

This adds two regions:
 1. Causality tracking by GWP-ASan. We now print a cause header about
 the crash, like `Cause: [GWP-ASan]: Use After Free on a 1-byte
 allocation at 0x7365bb3ff8`
 2. Allocation and deallocation stack traces.

Bug: 135634846
Test: atest debuggerd_test

Change-Id: Id28d5400c9a9a053fcde83a4788f971e677d4643
2020-02-18 16:49:50 -08:00
Josh Gao
55c7ed4e2e debuggerd_handler: increase thread stack size.
1 page isn't enough to log on AArch64, and clean pages are free, so
increase the stack size to 8 pages.

Bug: http://b/144887737
Test: treehugger
Change-Id: I731b3bc27ab37f4b830a9478a04cd34d4f7648d3
2020-01-17 17:25:30 -08:00
Josh Gao
a48b41bcb8 debuggerd: switch to using platform headers for DEBUGGER_SIGNAL.
Test: treehugger
Change-Id: Ie9736c4a077dba1029d2352bd94d47ce07323aec
2019-12-17 16:36:05 -08:00
Nick Desaulniers
67d52aa0f6 [debuggerd] fix -Wreorder-init-list
C++20 wants members to be ordered unlike C99.

Bug: 139945549
Test: mm
Change-Id: I3cbca589511c1e0bbc10c691949e18de77e16031
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
2019-10-10 14:54:35 -07:00
Josh Gao
18cb681247 debuggerd: call setsid in our children.
There appears to be a kernel bug that causes SIGHUP and SIGCONT to be
sent to the parent process group we spawn from if the process group
contains stopped jobs (e.g. the parent itself, because of wait_for_gdb).

Call setsid in all of our children to prevent this from happening.

Bug: http://b/31124563
Test: adb shell 'setprop debug.debuggerd.wait_for_gdb 1; killall -ABRT surfaceflinger'
Change-Id: I1a48d70886880a5bfbe2deb80d48deece55faf09
2019-04-16 13:17:08 -07:00
Josh Gao
5e8d68c2b2 debuggerd_handler: demote abort on exec failure to log.
If a process is ptraced already, we might not be able to exec crash_dump
due to selinux. Since we can be called for non-fatal events, we
shouldn't abort in that case.

Bug: http://b/128054996
Test: treehugger
Change-Id: I1442041caa7af908df2ab87b9e010c44082e7587
2019-03-18 14:39:47 -07:00
Josh Gao
6f9eeecd2b Fix multithreaded backtraces for seccomp processes.
Add threads to the existing seccomp backtrace test to prevent
regressing this.

Bug: http://b/114139908
Bug: http://b/115349586
Test: debuggerd_test32
Test: debuggerd_test64
Change-Id: I07fbe1619b60f0008deb045a249f9045404478c2
2018-09-12 18:12:13 -07:00
Josh Gao
4843c18634 debuggerd_handler: receive abort messages via sigqueue(DEBUGGER_SIGNAL).
Make it possible for code such as fdsan that generates debugging
tombstones via raise(DEBUGGER_SIGNAL) to pass an abort message as well.

Bug: http://b/112770187
Test: debuggerd_test
Change-Id: Idc34263241c18033573e466da3a45aa6f716ddb3
2018-08-27 16:55:07 -07:00
Josh Gao
9da1f51c10 crash_dump: pass the address of the fdsan table.
Pass the address of the fdsan table down to crash_dump so that we can
dump the fdsan table along with the open file descriptor list.

Test: debuggerd_test
Test: manually ran an old static_crasher
Change-Id: Icbac5487109f2db1e1061c4d46de11b016b299e3
2018-08-06 18:50:10 -07:00
Josh Gao
c954ec09c5 debuggerd_handler: use syscall(__NR_close) instead of close.
Avoid bionic's file descriptor ownership checks by calling the close
syscall manually.

Test: debuggerd_test
Change-Id: I10af6aca0e66fe030fd7a53506ae61c87695641d
2018-07-18 18:11:46 -07:00
Elliott Hughes
70d8f28945 Show signal sender for SI_FROMUSER signals.
Suicide doesn't change:

  signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------

But homicide now looks like this (this is `sleep 666` killed by
`kill -SEGV` as root:

  signal 11 (SIGSEGV), code 0 (SI_USER from pid 4446, uid 0), fault addr --------

Bug: http://b/78594105
Test: manual
Change-Id: I8c2feafba8cc5a3db85e8250004d428a464c5d9e
2018-04-26 08:19:17 -07:00
Luis Hector Chavez
4841e744c2 debuggerd_handler: set PR_SET_PTRACER before running crash_dump.
Set and restore PR_SET_PTRACER when performing a dump, so that when
Android is running on a kernel that has the Yama LSM enabled (and the
value of ptrace_scope is > 0), crash_dump can attach to processes and
print nice, symbolized stack traces.

Bug: 70992745
Test: kill -6 `pidof surfaceflinger` && logcat -d -b crash
      # in both sailfish and Chrome OS

Change-Id: If4646442c6000fdcc69cf4ab95fdc71ae74baaaf
2017-12-27 13:19:31 -08:00
Josh Gao
7302097e77 debuggerd: wait for dump completion on crashes.
When a process crashes, both ActivityManager and init will try to kill
its process group when they notice. The recent change to minimize the
amount of time a process is paused results in crash dumps being killed
before they finish as a result of this. Since anything that needs to be
low-latency is probably not going to be too happy if it crashes, just
wait for completion whenever we're processing a real crash.

Bug: http://b/70343110
Test: debuggerd_test
Change-Id: I894bb06efd264b1ba005df06f7326a72f4b767bb
2017-12-22 14:20:12 -08:00
Josh Gao
2b2ae0c88e crash_dump: fork a copy of the target's address space.
Reduce the amount of time that a process remains paused by pausing its
threads, fetching their registers, and then performing unwinding on a
copy of its address space. This also works around a kernel change
that's in 4.9 that prevents ptrace from reading memory of processes
that we don't have immediate permissions to ptrace (even if we
previously ptraced them).

Bug: http://b/62112103
Bug: http://b/63989615
Test: treehugger
Change-Id: I7b9cc5dd8f54a354bc61f1bda0d2b7a8a55733c4
2017-12-15 14:11:12 -08:00
Christopher Ferris
664d2a9093 Force call the fallback handler.
Always check to see if the fallback handler has been called and is
not trying to dump a specific thread.

Bug: 69110957

Test: Verified on a system where the prctl value changes, that before the
Test: change it dumps multiple tombstones, and after the change it
Test: works as expected.
Test: Ran debuggerd unit tests.
Test: Dumped process using debuggerd -b <PID> and debuggerd <PID>.
Change-Id: Id98bbe96cced9335f7c3e17088bb4ab2ad2e7a64
2017-11-16 20:07:13 -08:00
Josh Gao
cdea750576 crash_dump: don't inherit environment from parent.
Bug: http://b/68381717
Test: debuggerd_test
Change-Id: Ie1b342bc9901cb9ae9b79147899928a19052cbad
2017-11-03 16:57:56 -07:00
Josh Gao
fdf832dfd3 base: add Pipe and Socketpair wrappers.
Also, switch debuggerd_handler over to using android::base::unique_fd.

Test: treehugger
Change-Id: I97b2ce22f1795ce1c4370f95d00d769846cc54b8
2017-08-28 14:51:07 -07:00
Josh Gao
81e6c0b613 debuggerd_handler: print pid and process name.
Bug: http://b/64483618
Test: manual
Change-Id: Ie772324895a8ffcd41d919a4a6113862a6468d12
2017-08-11 15:38:51 -07:00
Narayan Kamath
a73df601b7 tombstoned: allow intercepts for java traces.
All intercept requests and crash dump requests must now specify a
dump_type, which can be one of kDebuggerdNativeBacktrace,
kDebuggerdTombstone or kDebuggerdJavaBacktrace. Each process can have
only one outstanding intercept registered at a time.

There's only one non-trivial change in this changeset; and that is
to crash_dump. We now pass the type of dump via a command line
argument instead of inferring it from the (resent) signal, this allows
us to connect to tombstoned before we wait for the signal as the
protocol requires.

Test: debuggerd_test

Change-Id: I189b215acfecd08ac52ab29117e3465da00e3a37
2017-05-31 10:35:32 +01:00
Josh Gao
2e7b8e2d1a debuggerd_handler: use syscall(__NR_get[pt]id) instead of get[pt]id.
bionic's cached values for getpid/gettid can be invalid if the crashing
process manually invoked clone to create a thread or process, which
will lead the crash_dump refusing to do anything, because it sees the
actual values.

Use the getpid/gettid syscalls directly to ensure correct values on
this end.

Bug: http://b/37769298
Test: debuggerd_test
Change-Id: I0b1e652beb1a66e564a48b88ed7fa971d61c6ff9
2017-05-05 14:58:12 -07:00
Christopher Ferris
ac225780dd Move libc_logging to libasync_safe.
Move the name of the "private/libc_logging.h" header to <async_safe/log.h>.

For use of libc_malloc_debug_backtrace, remove the libc_logging library.
The library now includes the async safe log functions.

Remove the references to libc_logging.cpp in liblog, it isn't needed because
the code is already protected by a check of the __ANDROID__ define.

Test: Compiled and boot bullhead device.
Test: Run debuggerd unit tests.
Test: Run liblog unit tests on target and host.
Test: Run libmemunreachable unit tests (these tests are flaky though).
Change-Id: Ie79d7274febc31f210b610a2c4da958b5304e402
2017-05-02 18:38:46 -07:00
Josh Gao
e06f2a4886 debuggerd_handler: don't assume that abort message implies fatal.
Applications can set abort messages via android_set_abort_message
without actually aborting. This leads to following non-fatal dumps
printing their output to logcat in the same format as a regular crash.

Bug: http://b/37754992
Test: debuggerd_test
Change-Id: I9c5e942984dfda36448860202b0ff1c2950bdd07
2017-04-27 17:28:05 -07:00
Elliott Hughes
561da6aa82 "Requested dump for tid XXX" message shouldn't be fatal.
This just means we were asked to dump, not that something necessarily went
wrong.

Bug: http://b/36191903
Test: builds
Change-Id: I5638b38f3a13081b1e971512f43238010febb59c
2017-03-23 23:04:27 -07:00
Josh Gao
ec91809dae debuggerd_handler: restore errno.
Bug: http://b/31448909
Test: mma
Change-Id: I737d66e8bed5fb31c2558f68608d3df460fa73c9
2017-03-10 14:44:54 -08:00
Josh Gao
e1aa0ca58a debuggerd_handler: implement missing fallback functionality.
Allow the fallback implementation to dump traces and create tombstones
in seccomped processes.

Bug: http://b/35858739
Test: debuggerd -b `pidof media.codec`; killall -ABRT media.codec
Change-Id: I381b283de39a66d8900f1c320d32497d6f2b4ec4
2017-03-09 11:26:05 -08:00
Josh Gao
5ad965bf41 crash_dump: fix overflow.
`1 << 32` overflows, resulting in bogus PR_CAP_AMBIENT_RAISE attempts,
and breaking dumping for processes with capabilities in the top 32 bits.

Bug: http://b/35241370
Test: debuggerd -b `pidof com.android.bluetooth`
Change-Id: I29c45a8bd36bdeb3492c9f74599993c139821088
2017-02-16 20:16:58 -08:00
Josh Gao
e73c932373 libdebuggerd_handler: in-process crash dumping for seccomped processes.
Do an in-process unwind for processes that have PR_SET_NO_NEW_PRIVS
enabled.

Bug: http://b/34684590
Test: debuggerd_test, killall -ABRT media.codec
Change-Id: I62562ec2c419d6643970100ab1cc0288982a1eed
2017-02-15 17:03:44 -08:00
Josh Gao
60515bf9f1 debuggerd_handler: don't use snprintf in handler.
snprintf isn't safe to call in the linker after initialization, because
it uses MB_CUR_MAX which is implemented via pthread_getspecific, which
uses TLS slots shared with libc. If the TLS slots are assigned in a
different order between libc.so and the linker, MB_CUR_MAX will
evaluate to an incorrect value, and lead to snprintf doing bad things.

Switch to __libc_format_buffer.

Bug: http://b/35367169
Test: debuggerd -b `pidof zygote`
Change-Id: I9d315cf63e5f3fd2f4545d6e3f707cdbe94ec606
2017-02-15 12:24:09 -08:00
Josh Gao
2f11a25a48 debuggerd_handler: set PR_SET_DUMPABLE before running crash_dump.
Set and restore PR_SET_DUMPABLE when performing a dump, so that
processes that have it implicitly cleared (e.g. services that acquire
filesystem capabilities) still get crash dumps.

Bug: http://b/35174939
Test: debuggerd -b `pidof surfaceflinger`
Change-Id: Ife933c10086e546726dec12a7efa3f9cedfeea60
2017-02-14 21:19:38 -08:00
Josh Gao
d2069632bd debuggerd_handler: raise capabilities before running crash_dump.
Raise CapInh and CapAmb after forking to exec crash_dump, so that it
can ptrace us.

Bug: http://b/35174939
Test: debuggerd -b `pidof surfaceflinger`
Change-Id: I32567010a3603cfa494aae9dc0e3ce73fb86b590
2017-02-14 14:40:47 -08:00
Josh Gao
c3c8c029ec debuggerd_handler: don't use waitpid(..., __WCLONE).
waitpid(..., __WCLONE) fails with ECHILD when passed an explicit PID to
wait for. __WALL and __WCLONE don't seem to be necessary when waiting
for a specific pid, so just pass 0 in the flags instead.

Bug: http://b/35327712
Test: /data/nativetest/debuggerd_test/debuggerd_test32 --gtest_filter="*zombie*"
Change-Id: I3dd7a1bdf7ff35fdfbf631429c089ef4e3172855
2017-02-13 17:01:24 -08:00
Josh Gao
54ef57d0b8 debuggerd_handler: fix prctl return value check.
Fixed this when I tested on internal, but failed to copy the fix over
when submitting to AOSP.

Bug: http://b/35070339
Test: `adb bugreport` on angler
Change-Id: Ib84d212e5f890958cd21f5c018fbc6f368138d1e
2017-02-06 21:10:48 -08:00
Josh Gao
279cb8b39a Merge changes from topic 'debuggerd_ambient'
* changes:
  debuggerd_handler: don't use clone(..., SIGCHLD, ...)
  crash_dump: drop capabilities after we ptrace attach.
  crash_dump: use /proc/<pid> fd to check tid process membership.
  debuggerd_handler: raise ambient capset before execing.
  Revert "Give crash_dump CAP_SYS_PTRACE."
2017-02-06 18:37:55 +00:00
Josh Gao
b3ee52e4d0 debuggerd_handler: don't use clone(..., SIGCHLD, ...)
Processes that handle SIGCHLD can race with the crash handler to wait
on the crash_dump process. Use clone flags that cause the forked
child's death to not be reported via SIGCHLD, and don't bail out of
dumping when waitpid returns ECHILD (in case another thread is already
in a waitpid(..., __WALL))

Note that the use of waitid was switched to waitpid, because waitid
doesn't support __WCLONE until kernel version 4.7.

Bug: none
Test: "debuggerd -b `pidof zygote64`" a few times (failed roughly 50%
      of the time previously)
Change-Id: Ia41a26a61f13c6f9aa85c4c2f88aef8d279d35ad
2017-02-02 13:54:39 -08:00
Josh Gao
7ae426c731 debuggerd_handler: raise ambient capset before execing.
Raise the ambient capability set to match CapEff so that crash_dump can
inherit all of the capabilities of the dumped process to be able to
ptrace. Note that selinux will prevent crash_dump from actually use
any of the capabilities.

Bug: http://b/34853272
Test: debuggerd -b `pidof system_server`
Test: debuggerd -b `pidof zygote`
Change-Id: I1fe69eff54c1c0a5b3ec63f6fa504b2681c47a88
2017-02-02 13:54:38 -08:00
Josh Gao
6462bb41e0 debuggerd_handler: add and use fatal_errno.
Bug: none
Test: mma
Change-Id: I24d913abdbe74f9463feda78f7817ca8b92af9cc
2017-01-31 14:59:05 -08:00
Josh Gao
4ed00c8d73 debuggerd_handler: improve nonfatal signal message.
"Fatal signal 35 (???)" -> "Requested dump for"

Bug: http://b/34809044
Test: debuggerd -b $$
Change-Id: I9ece0ee1117203d30142b843973ed7e5435e21da
2017-01-30 17:58:04 -08:00
Josh Gao
e5288f292a debuggerd_handler: remove PR_SET_DUMPABLE check.
crash_dump has CAP_SYS_PTRACE and this was never obeyed by debuggerd.

Change-Id: Ifee5e94b97b1f6440ad0be79758f0db2d2aaba2e
2017-01-26 15:08:18 -08:00
Josh Gao
7e14d020f1 debuggerd_handler: don't dump PR_NO_NEW_PRIVS processes.
We can't do an selinux transition when this is on.

Bug: http://b/34472671
Test: logcat -c; debuggerd `pidof media.codec`; logcat
Change-Id: Ie6c1832ab838df48879c32a86126862de9a15420
2017-01-25 11:16:03 -08:00
Josh Gao
529b3066d5 debuggerd_handler: don't resend nonfatal signals when not dumping.
Bug: http://b/34516140
Test: debuggerd -b `pidof surfaceflinger`
Change-Id: I0275ffca24bf4840e264eaa4b79611e2404edfb0
2017-01-25 11:15:01 -08:00
Josh Gao
fca7ca3585 debuggerd_handler: properly crash when PR_GET_DUMPABLE is 0.
Actually exit when receiving a signal via kill(2) or raise(2) and
PR_GET_DUMPABLE is 0.

Bug: none
Test: /data/nativetest/debuggerd_test/debuggerd_test32
Test: /data/nativetest64/bionic-unit-tests/bionic-unit-tests --gtest_filter=pthread_DeathTest.pthread_mutex_lock_null_64
Change-Id: I833a2a34238129237bd9f953959ebda51d8d04d7
2017-01-23 14:13:36 -08:00
Josh Gao
575941115e crash_dump: clear the default crash handlers.
crash_dump is a dynamic executable that gets the default crash dumping
handlers set by the linker. Turn them off to prevent crash_dump from
dumping itself.

Bug: http://b/34472671
Test: inserted an abort into crash_dump
Change-Id: Ic9d708805ad47afbb2a9ff37e2ca059f23f421de
2017-01-23 11:34:49 -08:00
Josh Gao
b64dd85c94 debuggerd_handler: actually wait for pseudothread to exit.
Occasionally, the pseudothread wouldn't exit in time after unlocking
the mutex to get crash_dump to proceed, resulting in spurious error
messages. Instead of using a mutex to emulate pthread_join, just
implement it correctly.

Bug: http://b/34472671
Test: debuggerd_test
Change-Id: I5c2658a84e9407ed8cc0ef2ad0fb648c388b7ad1
2017-01-23 11:34:49 -08:00
Josh Gao
cbe70cb0a8 debuggerd: advance our amazing bet.
Remove debuggerd in favor of a helper process that gets execed by
crashing processes.

Bug: http://b/30705528
Test: debuggerd_test
Change-Id: I9906c69473989cbf7fe5ea6cccf9a9c563d75906
2017-01-17 13:57:57 -08:00