llkd: Add __get_user_pages stack symbol checking

Add ro.llk.stack to list a set of kernel symbols that should rarely
be seen on a stack but, if persistently present across multiple checks,
indicate a live lock condition.  At ro.llk.stack.timeout_ms the process
is sent a kill; if it remains, panic the kernel.
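
A minimal sketch of the escalation described above, not llkd's actual
code.  All names (StackWatch, Update, Action) are hypothetical, and it
assumes the panic decision is taken on the first check after the kill
on which the symbol is still present.

```cpp
#include <chrono>

// Hypothetical names throughout; illustrative only.
enum class Action { kNone, kKill, kPanic };

struct StackWatch {
    std::chrono::milliseconds seen{0};  // how long the symbol has been observed on every check
    bool killed = false;                // whether a kill has already been sent
};

// Called once per check interval for a single thread.
// symbol_present: a listed symbol was found in the kernel stack on this check.
Action Update(StackWatch& w, bool symbol_present,
              std::chrono::milliseconds check_ms,
              std::chrono::milliseconds timeout_ms) {
    if (!symbol_present) {
        w = StackWatch{};  // symbol no longer seen: start over
        return Action::kNone;
    }
    w.seen += check_ms;
    if (w.seen < timeout_ms) return Action::kNone;
    if (!w.killed) {
        w.killed = true;
        return Action::kKill;  // first response at the timeout
    }
    return Action::kPanic;     // still present after the kill (assumed: next check)
}
```

Because only samples are compared, a symbol that leaves and re-enters
the stack between two checks is indistinguishable from one that never
left.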

There is no ABA detection in the paths, so a symbol qualifies only if
an instantaneous sample of the stack rarely catches it.  If a livelock
occurs in the path of the symbol, more than one path could be stuck in
that state, but the best candidate symbols are found underneath a lock,
so that only one process is the culprit and the best target.  Some
processes may induce a look of persistence; if so, the symbol is not a
candidate for checking.

The current candidate is __get_user_pages which, after mm is locked,
should be a very short reference to look up a page, but can run
longer if starved or if a condition causes a conflicting loop.

Test: compile
Bug: 33808187
Change-Id: I946e85641e59229b7491e929fcab5f1240794254
Mark Salyzyn 2018-08-07 08:13:13 -07:00
parent 96505fad80
commit a9afe5933d
2 changed files with 2 additions and 2 deletions

@@ -127,7 +127,7 @@ Only active on userdebug and eng builds.
 default 2 minutes samples of threads for D or Z.
 #### ro.llk.stack
-default *empty* or false, comma separated list of kernel symbols.
+default __get_user_pages, comma separated list of kernel symbols.
 The string "*false*" is the equivalent to an *empty* list.
 Look for kernel stack symbols that if ever persistently present can
 indicate a subsystem is locked up.
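
The property format documented in this hunk (a comma separated list,
with the string "false" treated as an empty list) could be parsed
roughly as follows.  This is a sketch only; ParseSymbolList is a
hypothetical helper and not the parser llkd uses.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: turn a ro.llk.stack style value into a set of symbols.
std::set<std::string> ParseSymbolList(const std::string& value) {
    std::set<std::string> symbols;
    if (value == "false") return symbols;  // "false" is equivalent to an empty list
    std::istringstream in(value);
    std::string item;
    while (std::getline(in, item, ',')) {
        if (!item.empty()) symbols.insert(item);  // skip empty entries such as "a,,b"
    }
    return symbols;
}
```

With the new default, ParseSymbolList("__get_user_pages") yields a
single entry, while "" and "false" both yield an empty set, which
would disable the check.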

@@ -48,7 +48,7 @@ unsigned llkCheckMilliseconds(void);
 /* LLK_CHECK_MS_DEFAULT = actual timeout_ms / LLK_CHECKS_PER_TIMEOUT_DEFAULT */
 #define LLK_CHECKS_PER_TIMEOUT_DEFAULT 5
 #define LLK_CHECK_STACK_PROPERTY "ro.llk.stack"
-#define LLK_CHECK_STACK_DEFAULT ""
+#define LLK_CHECK_STACK_DEFAULT "__get_user_pages"
 #define LLK_BLACKLIST_PROCESS_PROPERTY "ro.llk.blacklist.process"
 #define LLK_BLACKLIST_PROCESS_DEFAULT \
     "0,1,2,init,[kthreadd],[khungtaskd],lmkd,lmkd.llkd,llkd,watchdogd,[watchdogd],[watchdogd/0]"
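
The comment at the top of this hunk relates the check interval to the
timeout.  A small sketch of that arithmetic follows; kChecksPerTimeout
and CheckInterval are hypothetical names, and the 10 minute timeout in
the usage note is only an example value, not taken from this change.

```cpp
#include <chrono>

// Mirrors LLK_CHECKS_PER_TIMEOUT_DEFAULT from the header above.
constexpr int kChecksPerTimeout = 5;

// Hypothetical helper: derive the check interval from a timeout.
constexpr std::chrono::milliseconds CheckInterval(std::chrono::milliseconds timeout_ms) {
    return timeout_ms / kChecksPerTimeout;
}
```

For an assumed timeout of 600000 ms (10 minutes), CheckInterval gives
120000 ms, so a stuck symbol would be sampled five times before the
timeout is reached.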