Commit graph

264 commits

Author SHA1 Message Date
Bart Van Assche
97047b54e9 init: Combine two global sigchld_fd variables into one
Remove the Service::SetSigchldFd() method. Make the Service::GetSigchldFd()
create a signalfd for SIGCHLD. This makes it possible to use a SIGCHLD
signalfd in unit tests.

Change-Id: I0b41caa8f46c79f4d400e49aaba5227fad53c251
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-11-20 09:32:59 -08:00
Kelvin Zhang
109932146d Attempt process kill even if cgroup is already removed
Test: th
Bug: 308900853
Change-Id: I21ae5bacf4a25cc06a1fd47e2aadbf5ae22661a7
2023-11-14 11:13:28 -08:00
Treehugger Robot
7d1f582d36 Merge changes I5e259fdd,I5b9ab456 into main
* changes:
  init: Make WaitToBeReaped() wait less long
  init: Create different file descriptors for SIGCHLD and SIGTERM
2023-11-08 01:35:58 +00:00
Bart Van Assche
a75f210398 init: Make WaitToBeReaped() wait less long
Reduce the time spent in WaitToBeReaped() by waiting for SIGCHLD instead
of waiting for 50 ms.

Bug: 308687042
Change-Id: I5e259fdd22dec68e45d27205def2fc6463c06ca3
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2023-11-07 10:52:26 -08:00
T.J. Mercier
599d979126 libprocessgroup: Remove max_processes from KillProcessGroup API
The max_processes calculation is incorrect for KillProcessGroup because
the set of processes in cgroup.procs can differ between the multiple
reads in the implementation. Luckily the exact value isn't very
important because it's just logged. Remove max_processes from the API
and remove the warning about the new behavior in Android 11.

Note that we still always LOG(INFO) that any cgroup is being killed.

Bug: 301871933
Change-Id: I8e449f5089d4a48dbc1797b6d979539e87026f43
2023-10-31 16:31:44 +00:00
Jooyung Han
646e001bb5 init: clean up DelayService()
ServiceList's services_update_finished flag was overlapped with the
global flag: is_default_mount_namespace_ready. Now DelayService() relies
on the is_default_mount_namespace_ready flag.

Add a service description with 'updatable' flag and invoke 'start
<name>' in 'on init' block (which comes before APEX activation).
See the log for "Cannot start an updatable service".

Bug: 293535323
Test: see the comment
Change-Id: I9341ba1a95d9b3b7c6081b530850d61f105f0a56
2023-08-21 18:37:23 +09:00
Jooyung Han
918971c69e No need to read ro.apex.updatable now
Bug: 288202251
Test: m
Test: device boots
Change-Id: I97a3c2fab69489cdfbb5103b148194d7e2ee4d1a
2023-06-23 14:22:44 +09:00
Jiyong Park
0d277d777f init: non-crashing service can restart immediately
This CL allows restart_period to be set to a value shorter than 5s.
Previously this was prohibited to rate limit crashing services. That
behavior is considered to be a bit too conservative because some
services don't crash, but exit deliverately.

adbd is the motivating example. When adb root or adb unroot is
requested, it changes its mode of operation (via sysprop), exits itself,
and restarts (by init) to enter into the mode. However, due to the 5s
delay, the mode change can complete no earlier than 5 seconds after adbd
was started last time. This can slow the mode change when it is
requested right after the boot.

With this CL, restart_period can be set to a value smaller than 5. And
services like adbd can make use of it. However, in ordef to rate limit
crashing service, the default is enforced if the service was crashed
last time. In addition, such intended restart is not counted as crashes
when monitoring successive crashes during booting.

Bug: 286061817
Test: /packages/modules/Virtualization/vm/vm_shell.sh start-microdroid \
 --auto-connect -- --protected
* with this change: within 2s
* without this change: over 6s

Change-Id: I1b3f0c92d349e8c8760821cf50fb69997b67b242
2023-06-09 13:06:06 +09:00
Steven Moreland
f5d22ef7cd init: log when 'user' is unspecified
NOTE: in master, but should be submitted in AOSP.
Waiting to hear from security folks. Also might
need cleanup.

Not currently done. Seems errorprone.

Bug: 276813155
Test: boot, check logs
Change-Id: I7cbc39b282889dd582f06a8eedc38ae637c8edec
2023-04-17 20:18:00 +00:00
Alistair Delva
f9bfe0d16d Stop respawning serial console if disabled
After introducing ro.boot.serialconsole=0, the console will no longer be
spawned, but a step was missed to disable the service to prevent
respawns.

Bug: 266982931
Bug: 223797063
Bug: 267428635
Change-Id: I12b159eaa1999781aec31c05ce431b55e2ba4017
2023-03-13 16:09:36 -07:00
Treehugger Robot
8dab2ef586 Merge "Use ro.boot.serialconsole to disable console services" 2023-03-02 20:04:33 +00:00
Steven Moreland
8e25d9c5b0 init: add log w/ service PID
We could combine this with the existing log, but I
wouldn't want to make that appear later.

Ironically, adding this log to try to reduce logs.

Bug: 36785118
Test: :) adb logcat -d | grep "started service" | wc -l
131

Change-Id: I38f4e9740871aa256eef0c62e897038eb46871a5
2023-02-28 01:42:24 +00:00
Alistair Delva
5591f12834 Use ro.boot.serialconsole to disable console services
For many years, services declaring "console" would only be started if the
console device specified by androidboot.console= was present under /dev.
However, they would also be started if the /dev/console node existed.

This fallback causes problems with newer GKI kernel images which now
hard-code "console=ttynull" via CONFIG_CMDLINE, which essentially means
/dev/console always exists, even though this console points nowhere.

It also causes problems on devices where the androidboot.console was not
the same as the kernel dmesg console ("console="), such as cuttlefish,
because those platforms could not simultaneously enable kernel logging
but disable the interactive serial console feature. The framework just
assumed both would be muxed on the same serial port. Cuttlefish had a
workaround, to use "androidboot.console=invalid" to avoid the fallback,
but this doesn't work on devices which still want to mux the kernel logs
and interactive serial console.

This change resolves the issue in a better way, by introducing a new
boolean property called "androidboot.serialconsole". Setting this to "0"
will disable the console services, regardless of whether the
/dev/console or /dev/${ro.boot.console} devices exist. Older kernels
and bootloaders don't need to set this and can rely on the old behavior
in init, but bootloaders booting newer kernels must set it to avoid the
"performance is impacted" message due to console services being started.

Bug: 266982931
Bug: 223797063
Bug: 267428635
Test: "launch_cvd" with "androidboot.console=invalid" removed;
      See the "performance is impacted" message.
Test: "launch_cvd" with "androidboot.serialconsole=0";
      The "performance is impacted" message is gone.
Change-Id: Iaad4d27ffe4df74ed49606d3cabe83483c350df4
2023-02-22 14:31:24 -08:00
Daniel Rosenberg
de76688e40 init: Add gentle_kill service property
If a service specifies gentle_kill, attempt to stop it will send SIGTERM
instead of SIGKILL. After 200ms, it will issue a SIGKILL.

Bug: 249043036
Test: atest CtsInitTestCases:init#GentleKill
      Added in next patch
Change-Id: Ieb0e4e24d31780aca1cf291f9d21d49cee181cf2
2023-01-10 18:29:46 -08:00
Bart Van Assche
f85317fb43 Make an error message more informative
From
https://android-build.googleplex.com/builds/tests/view?testResultId=TR66328435937757440&invocationId=I00700010119503421:

system/core/init/init_test.cpp:219: Failure
Failed
Value of: service-&gt;Start()
  Actual: createProcessGroup(0, 15611) failed for service 'console'
  Expected: is ok

The above error message does not contain enough information to
root-cause the test failure. Hence this CL that makes an error message
more informative.

Bug: 262090304
Change-Id: I09929b2f2aabf1eec4d90ec93234a9e968888da4
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-12-31 23:41:29 +00:00
Steven Moreland
bb1ee3c689 Merge "ignore error -> log" 2022-12-14 01:29:20 +00:00
Steven Moreland
507209ba55 ignore error -> log
Current code ignores an error, which is a code
rot risk.

Bug: 261700511
Change-Id: I04ca7046dc42d761ecaaf56f6100c96cc8298ec5
Test: N/A
2022-12-13 22:43:58 +00:00
Treehugger Robot
5c3e24816d Merge "Kill services even when cgroups is disabled" 2022-12-12 01:20:45 +00:00
Inseob Kim
a049a9928b Kill services even when cgroups is disabled
process_cgroup_empty_ is used to indicate that a service is already
killed or not. If cgroup support lacks, services cannot be killed
because process_cgroup_empty_ is always true.

This change fixes it by not assigning process_cgroup_empty_ as true.
Instead, make KillProcessGroup send signals even when cgroup is
disabled. Also DoKillProcessGroupOnce() is updated so it returns a number of killed processes, excluding already dead processes. This behavior agrees with its name (DoKillProcessOnce), and it prevents regression upon missing cgroups, because kill(-pgid) will always
"succeed" so KillProcessGroup will loop even when all processes are
already dead.

Bug: 257264124
Test: boot microdroid, see services are terminated
Change-Id: I19abf19ff1b70c666cd6f12d0a12956765174aaa
2022-12-12 01:19:26 +00:00
Bart Van Assche
42764c4e3e init: Make an error message more informative
Make it easier to diagnose service failures.

Bug: 213617178
Change-Id: I27135cb32b6a98b2fe24ab2324dffbf5b591fdd5
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-12-05 11:41:10 -08:00
Bart Van Assche
29d8a42d14 Revert "init: Add more diagnostics for signalfd hangs."
Revert commit 14f9c15e05 ("init: Add more diagnostics for signalfd
hangs") because:
* That commit was intented to help with root-causing b/223076262.
* The root cause of b/223076262 has been fixed (not blocking SIGCHLD
  in all threads in the init process).

Test: Treehugger
Change-Id: I586663ec0588e74a9d58512f7f31155398cf4f52
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-30 09:17:16 -08:00
Bart Van Assche
01e6669c66 init: Fix a race condition in KillProcessGroup()
Multiple tests in CtsInitTestCases, e.g. RebootTest#StopServicesSIGKILL,
can trigger the following race condition:
* A service is started. This involves calling fork() and also to call
  RunService() in the child process. RunService() calls setpgid().
* Service::Stop() is called and calls KillProcessGroup().
  KillProcessGroup() calls kill(-pgid, SIGKILL) before the child process
  has called setpgid(). pgid is the process ID of the child process. The
  kill() call fails because setpgid() has not yet been called.

Fix this race condition by adding a setpgid() call in the parent process
and by waiting from the parent until the child has called setsid() if a
console is attached.

Bug: 213617178
Change-Id: Ieb9e6908df725447e3695ed66bb8bd30e4e38aa9
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-21 11:42:44 -08:00
Bart Van Assche
c8f34254b8 init: Introduce symbolic names for certain constants
Make the code easier to read by introducing symbolic names for the
constants used by Service::Start() for communication between the parent
and child processes.

Bug: 213617178
Change-Id: I3e735e149682fa9df2ed57f75eb5a67d7c68bd92
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-18 09:49:01 -08:00
Bart Van Assche
41787239ec Revert "init: Rename 'cgroups_activated' into 'fifo'"
Revert commit 9c61dad67e in preparation of
introducing a second interprocess communication channel.

Bug: 213617178
Change-Id: I2959a3902a1b994cca2ac99855be1fc60d63bcbb
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-18 09:42:14 -08:00
Bart Van Assche
f26e59ebba Revert "init: Fix a race condition in KillProcessGroup()"
This reverts commit 15e5ecdcd7.

Reason for revert: breaks console support.
Bug: 213617178
Bug: 258754901
Change-Id: Iffe213e2cd295461a427621f2b84933f1bebd39f
2022-11-15 00:55:45 +00:00
Bart Van Assche
15e5ecdcd7 init: Fix a race condition in KillProcessGroup()
Multiple tests in CtsInitTestCases, e.g. RebootTest#StopServicesSIGKILL,
can trigger the following race condition:
* A service is started. This involves calling fork() and also to call
  RunService() in the child process. RunService() calls setpgid().
* Service::Stop() is called and calls KillProcessGroup().
  KillProcessGroup() calls kill(-pgid, SIGKILL) before the child process
  has called setpgid(). pgid is the process ID of the child process. The
  kill() call fails because setpgid() has not yet been called.

Fix this race condition by adding a setpgid() call in the parent process
and by waiting from the parent until the child has called setsid() if a
console is attached.

Bug: 213617178
Test: Cuttlefish + atest 'CtsInitTestCases'
Change-Id: I6931cd579e607c247b4f79a5b375455ca3d52e29
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-10 09:08:21 -08:00
Bart Van Assche
9c61dad67e init: Rename 'cgroups_activated' into 'fifo'
Prepare for using the interprocess communication channel in two
directions.

Bug: 213617178
Change-Id: Ic78a3d8a2ec1f808fa5b4c4b198051655ee1b0ec
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-04 14:56:43 -07:00
Bart Van Assche
dcc378e53c Revert "init: Fix a race condition in KillProcessGroup()"
This reverts commit d8ef6f84d4.

Reason for revert: b/256874349

Change-Id: I86a1e03a0d2979db1c54abd3e78c32591fda98a1
2022-11-03 15:15:25 +00:00
Bart Van Assche
d8ef6f84d4 init: Fix a race condition in KillProcessGroup()
Multiple tests in CtsInitTestCases, e.g. RebootTest#StopServicesSIGKILL,
can trigger the following race condition:
* A service is started. This involves calling fork() and also to call
  RunService() in the child process. RunService() calls setpgid().
* Service::Stop() is called and calls KillProcessGroup().
  KillProcessGroup() calls kill(-pgid, SIGKILL) before the child process
  has called setpgid(). pgid is the process ID of the child process. The
  kill() call fails because setpgid() has not yet been called.

Fix this race condition by adding a setpgid() call in the parent process
and by waiting from the parent until the child has called setsid() if a
console is attached.

Bug: 213617178
Test: Cuttlefish + atest 'CtsInitTestCases'
Change-Id: I4c55790c2dcde8716b860aecd57708d51a081086
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-10-27 14:29:35 -07:00
Bart Van Assche
1693f420d4 init: Introduce class InterprocessFifo
Prepare for introducing a second interprocess communication channel by
introducing the class InterprocessFifo. Stop using std::unique_ptr<> for
holding the pipe file descriptors. Handle EOF consistently.

Bug: 213617178
Change-Id: Ic0cf18d3d8ea61b8ee17e64de8a9df2736e26728
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-10-25 06:56:50 -07:00
Nikita Ioffe
537ab23872 Merge "init: skip cgroup/task_profiles configuration if cgroups are disabled" 2022-10-24 07:14:12 +00:00
Bart Van Assche
77f3fe5e68 init: Fix the implementation of the task_profiles keyword
The documentation added by commit c9c0bbac53 ("init: Add task_profiles
init command") mentions that the task_profiles keyword sets process
attributes. Make the implementation of that keyword match the
documentation.

Change-Id: Ia080132f16bfc2488f8c25176d6aed37a2c42780
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-10-21 15:34:19 -07:00
Nikita Ioffe
c2b1654c11 init: skip cgroup/task_profiles configuration if cgroups are disabled
We are planning to remove cgroups from the Micrdroid kernel, since the
entire VM belongs exclusively to a single owner, and is in the control
of the cgroups on the host side.

This patch expoxes CgroupAvailable API from libprocessgroup, and changes
init to query the CgroupAvailable API before doing any
cgroups/task_profiles related work.

Bug: 239367015
Test: run MicrodroidDemoApp
Test: atest --test-mapping packages/modules/Virtualization:avf-presubmit
Change-Id: I82787141cd2a7f9309a4e9b24acbd92ca21c145b
2022-10-21 13:14:23 +01:00
Florian Mayer
84a30c8526 Merge "[MTE] Add device config to control upgrade time" 2022-10-03 17:47:12 +00:00
Florian Mayer
caa7a60e2d [MTE] Add device config to control upgrade time
Bug: 169277947
Change-Id: I67eb94a668e60a2970bb086f82cc69396275340a
2022-09-16 09:49:38 -07:00
Florian Mayer
d705c2dbcd [MTE] only upgrade to SYNC mode for MTE crashes
Bug: 244471804
Test: atest mte_ugprade_test on emulator
Change-Id: Ie974cf2dec96267012f1b01b9a40dad86551b1be
2022-09-13 15:35:07 -07:00
Treehugger Robot
c113dc3a95 Merge "Upgrade MTE to SYNC after ASYNC crash." 2022-09-06 21:29:14 +00:00
Florian Mayer
2ef47f8f6d Upgrade MTE to SYNC after ASYNC crash.
Bug: 169277947
Test: atest mte_ugprade_test on emulator.
Test: ASSUMPTION_FAILED on non-MTE
Test: ASSUMPTION_FAILED on HWASan
Change-Id: I5328d094ffb106abaa548feb76058c9ebd11d745
2022-09-06 20:10:57 +00:00
Bart Van Assche
fcf047113f init: Apply the NormalIoProfile when creating a service
Prepare for migration of the blkio controller to the v2 cgroup hierarchy
by applying the NormalIoProfile when starting a service. While the
NormalIoProfile is automatically applied when the blkio controller is
mounted in the v1 hierarchy, this is not the case for the v2 hierarchy.

Bug: 213617178
Change-Id: I3cad288a31aa2692e10c778ae1e5fdd04acd66d7
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-08-25 14:01:01 -07:00
Deyao Ren
aebf88191b Merge "Add apex name to service" am: ec73481e58
Original change: https://android-review.googlesource.com/c/platform/system/core/+/2155014

Change-Id: I3c44c321568173fa11588c2d6c69a43ad48c63f9
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-07-22 23:09:19 +00:00
Deyao Ren
df40ed1be1 Add apex name to service
Passed apex file name to service. The file name will be parsed
to determine 1) whether the service is from an apex; 2) apex name

Bug: 236090201

Change-Id: I2c292c0c067f4bf44bb25b1f80e4f972b94f7258
2022-07-22 04:00:30 +00:00
Jooyung Han
000b85449c Merge "init starts servicemanagers in "default" mount ns" am: e89c457157
Original change: https://android-review.googlesource.com/c/platform/system/core/+/2153354

Change-Id: I9fcb98938403626697ea5b515e1f5d2c82fbefd8
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-07-15 00:34:25 +00:00
Jooyung Han
c5fa15e08c init starts servicemanagers in "default" mount ns
servicemanager/hwservicemanager are pre-apexd services but still wants
to see VINTF fragments from APEXes, especially from /data.

Like ueventd, these services need to be started in "default" mount
namespace.

Bug: 237672865
Test: m && boot
Change-Id: I0266c5be5530a1a07f8ffa23a26186d45a55613f
2022-07-14 18:31:21 +09:00
Treehugger Robot
af4e6561d7 Merge "init: log services requested restart" am: 0ddcf6d2f1 am: 4d3bf512b0
Original change: https://android-review.googlesource.com/c/platform/system/core/+/2099238

Change-Id: I8ef99df0a8ecb38f14d5fdf12374f240f0439f37
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-05-18 06:38:58 +00:00
Steven Moreland
61169c76dd init: log services requested restart
We have a case where a service is requested to be started and does
not appear to be running, but we see no indication that it is
actually starting. This log should be enough information to see
if init is in a bad state.

Bug: 232297944
Test: doesn't add too much spam
    ~/android/aosp/system/core/init :) adb logcat -d | grep "requested start" | wc -l
    42
Change-Id: Ic07f250c98b200b9e5b4432200c3668c6ca0ff35
2022-05-17 22:54:55 +00:00
Suren Baghdasaryan
d53a8ed83d Merge changes from topic "228160715_fix" am: 25f0c1c457 am: 42bab74623
Original change: https://android-review.googlesource.com/c/platform/system/core/+/2080619

Change-Id: Ifcb16ff2c2cf9889e6765c4a3abbf68354fe1e2b
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-05-03 20:32:46 +00:00
Suren Baghdasaryan
af64077f83 init: Purge empty process groups on zygote restart
When system_server crashes or gets killed, it causes zygote to kill
itself, which in turn leads to killing all processes in the same
process group (all apps). This leaves empty process groups because
system_server is not there to remove them.
Purge empty process groups when init detects zygote death.

Bug: 228160715
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I0ce27eea28f8713e52033bbec2d5363a7b8ff5db
2022-04-29 17:17:51 +00:00
David Anderson
61b06cc84f Merge "init: Add more diagnostics for b/223076262." into tm-dev 2022-04-05 18:13:09 +00:00
David Anderson
d7f2bfba54 init: Add more diagnostics for b/223076262.
This adds three more diagnostics to stuck exec services:

1. /proc/pid/fds is dumped
2. /proc/pid/status is dumped
3. HandleSignalFd is called to see if a SIGCHLD got stuck somewhere

Bug: 223076262
Test: while (1) in linkerconfig
Ignore-AOSP-First: diagnostics
Change-Id: Ida601d86e18be9d49b143fb88b418cbc171ecac6
2022-04-05 07:16:27 +00:00
Suren Baghdasaryan
1bd1746447 init: Treat failure to create a process group as fatal
During process startup, system creates a process group and places the
new process in it. If process group creation fails for some reason, the
new child process will stay in its parent's group. This poses danger
when the child is being frozen because the whole group is affected and
its parent is being frozen as well.
Fix this by treating group creation failure as a fatal error which would
prevent the app from starting.

Bug: 227395690
Test: fake group creation failure and confirm service failure to start
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I83261bef803751759c7fd709bf1ccd33ccad3a0b
2022-04-01 23:32:47 +00:00