Some testing environments can have a test that is sending many
thousands of messages to the log. When this type of process crashes
all of these log messages are captured and can cause OOM errors
while creating the tombstone.
Added a test to verify the log messages are truncated. Leaving this
test disabled for now since it is inherently flaky due to having to
assume that 500 messages are in the log.
Added a test for a newline in a log message since it's somewhat
related to this change.
NOTE: The total number of messages is capped at 500, but if a message
contains multiple newlines, the total messages will exceed 500.
Counting messages this way seems to be in the spirit of the cap,
that a process logging a large message with multiple newlines does
not completely fill the tombstone log data.
Bug: 269182937
Bug: 282661754
Test: All unit tests pass.
Test: The disabled max_log_messages test passes.
Change-Id: If18e62b29f899c2c4670101b402e37762bffbec6
Also add new unit tests to verify this behavior.
Bug: 276934420
Test: New unit tests pass.
Test: Ran new unit tests without pthread_setname_np call and verified
Test: the tests fail.
Test: Force crash logd and verify log messages are not gathered.
Test: Force crash a logd thread and verify log messages are not gathered.
Change-Id: If8effef68f629432923cdc89e57d28ef5b8b4ce2
Signed-off-by: Liu Cunyuan <liucunyuan.lcy@linux.alibaba.com>
Signed-off-by: Mao Han <han_mao@linux.alibaba.com>
Change-Id: Ie22c2895fc30fab68eddc18713c80e403f44b203
r.android.com/2108505 was intended to fix a crash in Scudo in
the case where the stack depot, region info or ring buffer were
unreadable. However, it also ended up introducing a number of bugs into
the code. It failed to call __scudo_get_error_info if the page at the
fault address was unreadable. This can happen in legitimate crash cases
if a primary allocation was close to the boundary of a mapped region,
or if the allocation was a secondary allocation with guard pages. It
also used long as the type for tags, whereas Scudo expects it to be
char. In combination this ended up causing most of the MTE tests to
fail. Therefore, mostly revert that change.
Fix the original crash by null checking the pointers returned by
AllocAndReadFully before proceeding with the rest of the function.
Bug: 233720136
Change-Id: I04d70d2abffaa35fe315d15d9224f9b412a9825d
The code doesn't properly check if data is not read properly, so
make it fail if reads fail. Also, change the algorithm so that
first try and read the faulting page then 16 pages before and 16
pages after. Rather than trying to read every one of these pages,
stop as soon as one is unreadable. This means that the total memory
passed to the scudo error function is all valid data, rather than
potentially being some uninitialized memory.
Added new unit tests to cover scudo address processing.
Bug: 233720136
Test: All unit tests pass.
Test: atest CtsIncidentHostTestCases
Change-Id: I18a97bdee9a0c44075c1c31ccd1b546d10895be9
This simplifies most of the calls to avoid doing any Android
specific code.
Bug: 120606663
Test: All unit tests pass.
Change-Id: I511e637b9459a1f052a01e501b134e31d65b5fbe
The functionality moved from the Unwinder object to the MapInfo
object and means that the individual unreadable files can be
displayed now.
Included adding the unreadable elfs per thread in the protobuf.
Updated the unwinder test.
Test: All unit tests pass.
Change-Id: I7140bde16938736da005f926e10bbdb3dbc0f6f5
When dumping a tombstone using the fallback path, only the main
thread was showing up. Modify the code to dump the threads using
a slightly different path for the tombstone generation code.
In addition, while looking at this code, two MTE variables were
not set in the tombstone fallback code. Added those variables
so MTE devices will work properly in this fallback path.
Modified the tombstone unit tests for seccomp to have
multiple threads and verify those threads show up in the tombstone.
Bug: 208933016
Test: Ran unit tests.
Test: Ran debuggerd <PID> on a privileged process and verified
Test: all threads dumped. Also verified that the tagged_addr_ctrl
Test: variable is present on the raven device.
Change-Id: I16eadb0cc2c37a7dbc5cac16af9b5051008b5127
Hard to get otherwise if you're trying to debug PAC issues.
Bug: http://b/214314197
Test: treehugger
Change-Id: I2e5502809f84579bf287364e59d6e7ff67770919
The frame data no longer contains map_XXX fields which represent
the map data. Now there is only a shared pointer to the MapInfo
object with which this frame is associated.
Bug: 120606663
Test: Unit tests pass.
Change-Id: I89282963f742f6fcc07e48533da4108dc16bdce9
On the main thread, the siginfo pointer will never be nullptr.
Add a CHECK to make sure this is true.
Test: Unit tests pass both 32 bit and 64 bit.
Test: Ran with debug.debuggerd.translate_proto_to_text set to 0
Test: to exercise old path.
Change-Id: I9d5ed0de5d652de8a4f9cd85eb57cbb1ec676404
This code was added, but a svelte config still tries to use scudo
related code that doesn't exist.
Bug: 201007100
Test: Ran unit tests on normal config.
Test: Ran unit tests on svelte config.
Change-Id: Ic84bae37717d213121aef182bac2f82dbee25213
I was here because we have a case where timeout(1) kills logcat, but
debuggerd alleges that the process that was killed had started less than
a second ago. I'm not sure this is the problem there, but I did notice
that far too many tombstones were claiming improbably short process
uptimes. It turns out that the code was measuring the *thread* uptime,
not the *process* uptime.
Also simplify the code a bit by switching to sysinfo(2) rather than
reading a file.
Test: manual, plus the existing unit test
Change-Id: Ie2810b1d5777ad9182be92bfb3f60795dc978b24
The tombstone will add a newline after the abort message, so remove
any trailing newlines before saving/printing.
Bug: 196414062
Test: Unit tests pass.
Test: Set system property debug.debuggerd.translate_proto_to_text to 0
test: and unit tests still pass.
Change-Id: I0d3dc215eb5d8be93d99e5b9d4f0a14b1d61396d
When moving to the proto-ized tombstones, the note about unreadable
elf files in a backtrace got lost. This re-adds it and adds a test
to verify that the note properly shows up.
Bug: 185428454
Test: Ran unit tests.
Change-Id: I1150cc737772e1b79fd73ec5c782caadc4629421
Proto tombstones were missing tagged fault addresses, tagged_addr_ctrl,
tags in memory dumps and Scudo and GWP-ASan error reports. Since text
tombstones now go via protos, all of these features broke when we
switched to text tombstones generated from protos by default. Fix
the features by adding support for them to the proto format,
tombstone_proto and tombstone_proto_to_text.
Bug: 135772972
Bug: 182489365
Change-Id: I3ca854546c38755b1f6410a1f6198a44d25ed1c5
Application developers would like to know how long their process has
been alive for to distinguish between crashes that happen immediately
upon startup and crashes in regular operation.
Test: manual
Change-Id: Ia31eeadfcced358b478c7a7c7bb2e8a0252e30f4
Otherwise we can fail to find map entries for tagged addresses,
such as those of heap objects.
Bug: 135772972
Change-Id: Ia626b0587c8461eb575b2de5c08562c73ba4a66e
libbase logging uses getprogname() to get the default tag, which breaks
for the fallback handler which is statically linked into the dynamic
linker. Switch to libasync_safe for logging.
Test: atest -c CtsSeccompHostTestCases:android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls
Change-Id: Ieeaf33fb26cff4ba7e1589d1d883ac2fcc74cf47
Revert "Let crash_dump read /proc/$PID."
Revert submission 1556807-tombstone_proto
Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug
Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.
Change-Id: Ia0a1ee57e7630e01c495dc166218f665340aad7f
This reverts commit 675cb30f05.
Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug
Change-Id: I82d228f2bc3e6b426d4703732e1c8766815ccc97
This commit implements protobuf output for tombstones, along with a
translator that should emit bytewise identical output to the existing
tombstone dumping code, except for ancillary data from GWP-ASan and
Scudo, which haven't been implemented yet.
Test: setprop debug.debuggerd.translate.translate_proto_to_text 1 &&
/data/nativetest64/debuggerd_test/debuggerd_test
Test: for TOMBSTONE in /data/tombstones/tombstone_??; do
pbtombstone $TOMBSTONE.pb | diff $TOMBSTONE -
done
Change-Id: Ieeece6e6d1c26eb608b00ec24e2e725e161c8c92