Commit graph

243 commits

Author SHA1 Message Date
Bart Van Assche
01e6669c66 init: Fix a race condition in KillProcessGroup()
Multiple tests in CtsInitTestCases, e.g. RebootTest#StopServicesSIGKILL,
can trigger the following race condition:
* A service is started. This involves calling fork() and also to call
  RunService() in the child process. RunService() calls setpgid().
* Service::Stop() is called and calls KillProcessGroup().
  KillProcessGroup() calls kill(-pgid, SIGKILL) before the child process
  has called setpgid(). pgid is the process ID of the child process. The
  kill() call fails because setpgid() has not yet been called.

Fix this race condition by adding a setpgid() call in the parent process
and by waiting from the parent until the child has called setsid() if a
console is attached.

Bug: 213617178
Change-Id: Ieb9e6908df725447e3695ed66bb8bd30e4e38aa9
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-21 11:42:44 -08:00
Bart Van Assche
c8f34254b8 init: Introduce symbolic names for certain constants
Make the code easier to read by introducing symbolic names for the
constants used by Service::Start() for communication between the parent
and child processes.

Bug: 213617178
Change-Id: I3e735e149682fa9df2ed57f75eb5a67d7c68bd92
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-18 09:49:01 -08:00
Bart Van Assche
41787239ec Revert "init: Rename 'cgroups_activated' into 'fifo'"
Revert commit 9c61dad67e in preparation of
introducing a second interprocess communication channel.

Bug: 213617178
Change-Id: I2959a3902a1b994cca2ac99855be1fc60d63bcbb
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-18 09:42:14 -08:00
Bart Van Assche
f26e59ebba Revert "init: Fix a race condition in KillProcessGroup()"
This reverts commit 15e5ecdcd7.

Reason for revert: breaks console support.
Bug: 213617178
Bug: 258754901
Change-Id: Iffe213e2cd295461a427621f2b84933f1bebd39f
2022-11-15 00:55:45 +00:00
Bart Van Assche
15e5ecdcd7 init: Fix a race condition in KillProcessGroup()
Multiple tests in CtsInitTestCases, e.g. RebootTest#StopServicesSIGKILL,
can trigger the following race condition:
* A service is started. This involves calling fork() and also to call
  RunService() in the child process. RunService() calls setpgid().
* Service::Stop() is called and calls KillProcessGroup().
  KillProcessGroup() calls kill(-pgid, SIGKILL) before the child process
  has called setpgid(). pgid is the process ID of the child process. The
  kill() call fails because setpgid() has not yet been called.

Fix this race condition by adding a setpgid() call in the parent process
and by waiting from the parent until the child has called setsid() if a
console is attached.

Bug: 213617178
Test: Cuttlefish + atest 'CtsInitTestCases'
Change-Id: I6931cd579e607c247b4f79a5b375455ca3d52e29
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-10 09:08:21 -08:00
Bart Van Assche
9c61dad67e init: Rename 'cgroups_activated' into 'fifo'
Prepare for using the interprocess communication channel in two
directions.

Bug: 213617178
Change-Id: Ic78a3d8a2ec1f808fa5b4c4b198051655ee1b0ec
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-11-04 14:56:43 -07:00
Bart Van Assche
dcc378e53c Revert "init: Fix a race condition in KillProcessGroup()"
This reverts commit d8ef6f84d4.

Reason for revert: b/256874349

Change-Id: I86a1e03a0d2979db1c54abd3e78c32591fda98a1
2022-11-03 15:15:25 +00:00
Bart Van Assche
d8ef6f84d4 init: Fix a race condition in KillProcessGroup()
Multiple tests in CtsInitTestCases, e.g. RebootTest#StopServicesSIGKILL,
can trigger the following race condition:
* A service is started. This involves calling fork() and also to call
  RunService() in the child process. RunService() calls setpgid().
* Service::Stop() is called and calls KillProcessGroup().
  KillProcessGroup() calls kill(-pgid, SIGKILL) before the child process
  has called setpgid(). pgid is the process ID of the child process. The
  kill() call fails because setpgid() has not yet been called.

Fix this race condition by adding a setpgid() call in the parent process
and by waiting from the parent until the child has called setsid() if a
console is attached.

Bug: 213617178
Test: Cuttlefish + atest 'CtsInitTestCases'
Change-Id: I4c55790c2dcde8716b860aecd57708d51a081086
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-10-27 14:29:35 -07:00
Bart Van Assche
1693f420d4 init: Introduce class InterprocessFifo
Prepare for introducing a second interprocess communication channel by
introducing the class InterprocessFifo. Stop using std::unique_ptr<> for
holding the pipe file descriptors. Handle EOF consistently.

Bug: 213617178
Change-Id: Ic0cf18d3d8ea61b8ee17e64de8a9df2736e26728
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-10-25 06:56:50 -07:00
Nikita Ioffe
537ab23872 Merge "init: skip cgroup/task_profiles configuration if cgroups are disabled" 2022-10-24 07:14:12 +00:00
Bart Van Assche
77f3fe5e68 init: Fix the implementation of the task_profiles keyword
The documentation added by commit c9c0bbac53 ("init: Add task_profiles
init command") mentions that the task_profiles keyword sets process
attributes. Make the implementation of that keyword match the
documentation.

Change-Id: Ia080132f16bfc2488f8c25176d6aed37a2c42780
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-10-21 15:34:19 -07:00
Nikita Ioffe
c2b1654c11 init: skip cgroup/task_profiles configuration if cgroups are disabled
We are planning to remove cgroups from the Micrdroid kernel, since the
entire VM belongs exclusively to a single owner, and is in the control
of the cgroups on the host side.

This patch expoxes CgroupAvailable API from libprocessgroup, and changes
init to query the CgroupAvailable API before doing any
cgroups/task_profiles related work.

Bug: 239367015
Test: run MicrodroidDemoApp
Test: atest --test-mapping packages/modules/Virtualization:avf-presubmit
Change-Id: I82787141cd2a7f9309a4e9b24acbd92ca21c145b
2022-10-21 13:14:23 +01:00
Florian Mayer
84a30c8526 Merge "[MTE] Add device config to control upgrade time" 2022-10-03 17:47:12 +00:00
Florian Mayer
caa7a60e2d [MTE] Add device config to control upgrade time
Bug: 169277947
Change-Id: I67eb94a668e60a2970bb086f82cc69396275340a
2022-09-16 09:49:38 -07:00
Florian Mayer
d705c2dbcd [MTE] only upgrade to SYNC mode for MTE crashes
Bug: 244471804
Test: atest mte_ugprade_test on emulator
Change-Id: Ie974cf2dec96267012f1b01b9a40dad86551b1be
2022-09-13 15:35:07 -07:00
Treehugger Robot
c113dc3a95 Merge "Upgrade MTE to SYNC after ASYNC crash." 2022-09-06 21:29:14 +00:00
Florian Mayer
2ef47f8f6d Upgrade MTE to SYNC after ASYNC crash.
Bug: 169277947
Test: atest mte_ugprade_test on emulator.
Test: ASSUMPTION_FAILED on non-MTE
Test: ASSUMPTION_FAILED on HWASan
Change-Id: I5328d094ffb106abaa548feb76058c9ebd11d745
2022-09-06 20:10:57 +00:00
Bart Van Assche
fcf047113f init: Apply the NormalIoProfile when creating a service
Prepare for migration of the blkio controller to the v2 cgroup hierarchy
by applying the NormalIoProfile when starting a service. While the
NormalIoProfile is automatically applied when the blkio controller is
mounted in the v1 hierarchy, this is not the case for the v2 hierarchy.

Bug: 213617178
Change-Id: I3cad288a31aa2692e10c778ae1e5fdd04acd66d7
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-08-25 14:01:01 -07:00
Deyao Ren
aebf88191b Merge "Add apex name to service" am: ec73481e58
Original change: https://android-review.googlesource.com/c/platform/system/core/+/2155014

Change-Id: I3c44c321568173fa11588c2d6c69a43ad48c63f9
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-07-22 23:09:19 +00:00
Deyao Ren
df40ed1be1 Add apex name to service
Passed apex file name to service. The file name will be parsed
to determine 1) whether the service is from an apex; 2) apex name

Bug: 236090201

Change-Id: I2c292c0c067f4bf44bb25b1f80e4f972b94f7258
2022-07-22 04:00:30 +00:00
Jooyung Han
000b85449c Merge "init starts servicemanagers in "default" mount ns" am: e89c457157
Original change: https://android-review.googlesource.com/c/platform/system/core/+/2153354

Change-Id: I9fcb98938403626697ea5b515e1f5d2c82fbefd8
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-07-15 00:34:25 +00:00
Jooyung Han
c5fa15e08c init starts servicemanagers in "default" mount ns
servicemanager/hwservicemanager are pre-apexd services but still wants
to see VINTF fragments from APEXes, especially from /data.

Like ueventd, these services need to be started in "default" mount
namespace.

Bug: 237672865
Test: m && boot
Change-Id: I0266c5be5530a1a07f8ffa23a26186d45a55613f
2022-07-14 18:31:21 +09:00
Treehugger Robot
af4e6561d7 Merge "init: log services requested restart" am: 0ddcf6d2f1 am: 4d3bf512b0
Original change: https://android-review.googlesource.com/c/platform/system/core/+/2099238

Change-Id: I8ef99df0a8ecb38f14d5fdf12374f240f0439f37
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-05-18 06:38:58 +00:00
Steven Moreland
61169c76dd init: log services requested restart
We have a case where a service is requested to be started and does
not appear to be running, but we see no indication that it is
actually starting. This log should be enough information to see
if init is in a bad state.

Bug: 232297944
Test: doesn't add too much spam
    ~/android/aosp/system/core/init :) adb logcat -d | grep "requested start" | wc -l
    42
Change-Id: Ic07f250c98b200b9e5b4432200c3668c6ca0ff35
2022-05-17 22:54:55 +00:00
Suren Baghdasaryan
d53a8ed83d Merge changes from topic "228160715_fix" am: 25f0c1c457 am: 42bab74623
Original change: https://android-review.googlesource.com/c/platform/system/core/+/2080619

Change-Id: Ifcb16ff2c2cf9889e6765c4a3abbf68354fe1e2b
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
2022-05-03 20:32:46 +00:00
Suren Baghdasaryan
af64077f83 init: Purge empty process groups on zygote restart
When system_server crashes or gets killed, it causes zygote to kill
itself, which in turn leads to killing all processes in the same
process group (all apps). This leaves empty process groups because
system_server is not there to remove them.
Purge empty process groups when init detects zygote death.

Bug: 228160715
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I0ce27eea28f8713e52033bbec2d5363a7b8ff5db
2022-04-29 17:17:51 +00:00
David Anderson
61b06cc84f Merge "init: Add more diagnostics for b/223076262." into tm-dev 2022-04-05 18:13:09 +00:00
David Anderson
d7f2bfba54 init: Add more diagnostics for b/223076262.
This adds three more diagnostics to stuck exec services:

1. /proc/pid/fds is dumped
2. /proc/pid/status is dumped
3. HandleSignalFd is called to see if a SIGCHLD got stuck somewhere

Bug: 223076262
Test: while (1) in linkerconfig
Ignore-AOSP-First: diagnostics
Change-Id: Ida601d86e18be9d49b143fb88b418cbc171ecac6
2022-04-05 07:16:27 +00:00
Suren Baghdasaryan
1bd1746447 init: Treat failure to create a process group as fatal
During process startup, system creates a process group and places the
new process in it. If process group creation fails for some reason, the
new child process will stay in its parent's group. This poses danger
when the child is being frozen because the whole group is affected and
its parent is being frozen as well.
Fix this by treating group creation failure as a fatal error which would
prevent the app from starting.

Bug: 227395690
Test: fake group creation failure and confirm service failure to start
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: I83261bef803751759c7fd709bf1ccd33ccad3a0b
2022-04-01 23:32:47 +00:00
David Anderson
fe62ca7165 Merge "init: Add more diagnostics for signalfd hangs." 2022-03-16 23:11:12 +00:00
David Anderson
14f9c15e05 init: Add more diagnostics for signalfd hangs.
This adds two new diagnostics. First, signalfd reads are now non-blocking. If the read takes more than 10 seconds, we log an error.

Second, init now wakes up from epoll() every 10 seconds. If it waits on an "exec" command for more than 10 seconds, it logs an error.

This change will be reverted as soon as we get feedback.

Bug: 223076262
Test: device boots
Change-Id: I7ee98d159599217a641b3de2564a92c2435f57ef
2022-03-16 05:06:17 +00:00
Bart Van Assche
bd73665e68 Introduce the RunService() method
The Service::Start() method is so long that its length negatively
affects readability of the code. Hence this patch that splits
Service::Start().

Test: Booted Android in Cuttlefish.
Change-Id: I5a6f587ecc5e6470137de6cceda7e685bce28ced
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-03-01 15:17:33 +00:00
Bart Van Assche
f2222aab6a Introduce the ConfigureMemcg() method
The Service::Start() method is so long that its length negatively
affects readability of the code. Hence this patch that splits
Service::Start().

Test: Booted Android in Cuttlefish.
Change-Id: I972f4e60844bb0d133b1cca1fd4e06bb89fc5f37
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-03-01 15:17:33 +00:00
Bart Van Assche
847b80a112 Introduce the Service::CheckConsole() method
The Service::Start() method is so long that its length negatively
affects readability of the code. Hence this patch that splits
Service::Start().

Test: Booted Android in Cuttlefish.
Change-Id: Ib8e1e87fbd335520cbe3aac2a88d250fcf3b4ff0
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-03-01 15:17:33 +00:00
Bart Van Assche
ee36ba39f9 Fix a race condition in Service::Start()
The SetTaskProfiles() call modifies cgroup attributes. Modifying cgroup
attributes can only succeed after the cgroups and cgroup attributes have
been created. Hence this patch that makes the child process wait until
the parent has finished creating cgroups and activating cgroup
controllers.

Bug: 213617178
Test: Without this patch the migration to the v2 hierarchy does not work reliably. With this patch applied, the migration to the v2 hierarchy works reliably.
Change-Id: I80a7c0a35453d8fd89ed798d077086aa8ba9ea17
Signed-off-by: Bart Van Assche <bvanassche@google.com>
2022-02-15 22:31:09 +00:00
Eric Biggers
dd41635cef init: remove the class_{start,reset}_post_data commands
Remove the class_start_post_data and class_reset_post_data commands,
since they aren't used anymore.  They were only used on devices that
used FDE (Full Disk Encryption), via actions in rootdir/init.rc.  These
actions have been removed, since support for FDE has been removed.
There is no use case for these commands in vendor init scripts either.

Keep the mark_post_data command, since DoUserspaceReboot() uses the
post-data service flag even on non-FDE devices.

Bug: 191796797
Change-Id: Ibcd97543daa724feb610546b5fc2a0dd7f1e62e7
2021-11-11 14:36:47 -08:00
David Anderson
0e5ad5a093 snapuserd: Allow connecting to the first-stage daemon.
Currently there is no socket for daemon instances launched during the
selinux phase of init. We don't create any sockets due to the complexity
of the required sepolicy.

This workaround will allow us to create the socket with very minimal
sepolicy changes. init will launch a one-off instance of snapuserd in
"proxy" mode, and then the following steps will occur:

1. The proxy daemon will be given two sockets, the "normal" socket that
snapuserd clients would connect to, and a "proxy" socket.
2. The proxy daemon will listen on the proxy socket.
3. The first-stage daemon will wake up and connect to the proxy daemon
as a client.
4. The proxy will send the normal socket via SCM_RIGHTS, then exit.
5. The first-stage daemon can now listen and accept on the normal
socket.

Ordering of these events is achieved through a snapuserd.proxy_ready
property.

Some special-casing was needed in init to make this work. The snapuserd
socket owned by snapuserd_proxy is placed into a "persist" mode so it
doesn't get deleted when snapuserd_proxy exits. There's also a special
case method to create a Service object around a previously existing pid.

Finally, first-stage init is technically on a different updateable
partition than snapuserd. Thus, we add a way to query snapuserd to see
if it supports socket handoff. If it does, we communicate this
information through an environment variable to second-stage init.

Bug: 193833730
Test: manual test
Change-Id: I1950b31028980f0138bc03578cd455eb60ea4a58
2021-07-27 19:35:29 -07:00
Eric Biggers
d14a178d01 Revert "init: make reboot_on_failure not apply to manually stopped services"
This reverts commit 1c51525f66 because it
accidentally made reboot_on_failure be a no-op for all services.  This
is because Reap() itself calls KillProcessGroup() on devices with a
vendor level >= R, which in turn sets SVC_STOPPING.  I had overlooked
this somehow, probably because I didn't consider that a service can
consist of multiple processes.

It turns out that real FDE devices don't actually need the above commit
because FDE devices aren't allowed to have updatable apexes enabled, and
without updatable apexes enabled, apexd exits automatically and
therefore doesn't have to be stopped.  This can be verified by using the
aosp_cf_x86_phone_noapex build target, rather than aosp_cf_x86_phone
which I had used for testing before.  So just revert it for now.

Bug: 194370048
Change-Id: I90eddf2a87397449b241e5acaaa8d4a4241d73a9
2021-07-22 13:06:41 -07:00
Eric Biggers
1c51525f66 init: make reboot_on_failure not apply to manually stopped services
Add a new service flag SVC_STOPPING which tracks whether a service is
being manually stopped by init, and make the "reboot_on_failure" service
setting not apply when SVC_STOPPING is set.

This is needed for devices that use FDE, because otherwise the device
reboots during the following init script fragment:

    on property:vold.decrypt=trigger_shutdown_framework
        class_reset late_start
        class_reset main
        class_reset_post_data core
        class_reset_post_data hal

... because that stops all services, including apexd which has been
marked with reboot_on_failure since
https://android-review.googlesource.com/c/platform/system/apex/+/1325212.
So init was killing apexd, then rebooting the device because apexd
"failed" due to having been killed.  Making reboot_on_failure not apply
when init stops a service itself fixes the problem.

This is one of a set of changes that is needed to get FDE working again
so that devices that launched with FDE can be upgraded to Android 12.

Bug: 186165644
Test: Tested FDE on Cuttlefish
Change-Id: I599f7ba107e6c126e8f31d0ae659f0ae672a25e4
2021-05-03 21:38:50 -07:00
Jooyung Han
e5232a71b2 init: apexd is started in the current mount namespace
init starts services in "bootstrap" mount namespace until the "default"
mount namespace is ready even when init's current mount namespace is
"default".

apexd and linkerconfig are those processes to set up the mount
namespaces: apexd activates apexes and linkerconfig generates linker
configs.

Previously apexd is allowed to be started in the "current" namespace by
checking its "service name"(it should be "apexd"). But there can be a
certain environment apexd is started in a different way. For example, in
microdroid, apexd is started using "exec -- /system/bin/apexd --vm"
because it wants to run in a different execution mode.

So, instead of checking the service name, its executable's path is
checked against to allow apexd to be started in the current mount
namespace.

Bug: 179342589
Test: MicrodroidTestCase (microdroid boots)
Test: cuttlefish boots
Change-Id: I7c2490e15d481c28ddf382d2d3fdf58a78e467ec
2021-04-20 22:50:12 +09:00
Kiyoung Kim
0cbee0de2a Check if service is executed before APEX is ready
Any service which is executed when Runtime apex is mounted, but
linkerconfig is not updated can fail to be executed due to missing
information in ld.config.txt. This change updates init to have a status
variable which contains if current mount namespace is default
and APEX is not ready from ld.config.txt, and use bootstrap namespace if
it is not ready.

Bug: 181348374
Test: cuttlefish boot succeeded
Change-Id: Ia574b1fad2110d4e68586680dacbe6137186546e
2021-03-05 16:42:20 +09:00
Jiyong Park
9c4ecdd84e Revert^2 "Remove ART APEX from the bootstrap apexes"
6d869dd6ab

Change-Id: I24906b7520ae01e586687ae26fcf6d8b63d9978d
2021-02-10 07:17:19 +00:00
chapin
6d869dd6ab Revert "Remove ART APEX from the bootstrap apexes"
Revert submission 1563392-remove_art_from_bootstrap

Reason for revert: Bug: 179002105
Reverted Changes:
I65e2a2089:Remove ART APEX from the bootstrap apexes
Ic20df80e2:Remove ART APEX from the bootstrap apexes

Change-Id: I474ab95805c5ca28e0bba91f3d226e8db5a7a9ea
2021-02-01 22:29:59 +00:00
Jiyong Park
b99c12ef10 Remove ART APEX from the bootstrap apexes
Test: forrest
Bug: 169779935
Change-Id: I65e2a2089fa12674f3abbbe2f154eeec984dd5df
2021-01-29 12:08:31 +09:00
Woody Lin
ef9d460ea8 Add init.svc_debug.no_fatal.<svc_name> to skip SVC_CRITICAL
For user who would like to retain the crash symptom and avoid device
from power cycle for live debugging, set
init.svc_debug.no_fatal.<svc_name> to "true" to skip FATAL reboot.

Bug: 177593855
Change-Id: I0bdb6191e5963c08e1ea301a60060acf916dd49b
2021-01-22 15:01:36 +08:00
Elliott Hughes
d92c6a12da Use freecon() with getcon()/getfilecon().
Bug: https://issuetracker.google.com/175090444
Test: treehugger
Change-Id: Ia2b8102f1c9a4fd56ec1ff026ba5b4f375102b9b
2020-12-08 22:30:17 -08:00
Daniel Norman
f597fa5d1d Returns a service parse error on overrides across the treble boundary.
Also includes new --out_<partition> flags for
  system,system_ext,product,vendor,odm
to allow host_init_verifier to work with a collection of init rc files.

Test: host_init_verifier --out_system=... --out_vendor=...
      where vendor contains an init rc file that overrides a service
      present in system. Observe parse failure and non-zero exit.
Bug: 163089173
Change-Id: I520fef613e0036df8a7d47a98d47405eaa969110
2020-11-19 10:02:56 -08:00
Steven Moreland
abc5f8830e init: log 'updatable process' clarification
-> process with updatable components

Fixes: 172605179
Test: N/A
Change-Id: I0f9353fe65cea623e1d2292f0163cc545bfc909d
2020-11-06 17:01:51 +00:00
Woody Lin
45215ae6e5 init/service_parser: Add arguments window' and target' for `critical'
The critical services can now using the interface `critical
[window=<fatal crash window mins>] [target=<fatal reboot target>]` to
setup the timing window that when there are more than 4 crashes in it,
the init will regard it as a fatal system error and reboot the system.

Config `window=${zygote.critical_window.minute:-off}' and
`target=zygote-fatal' for all system-server services, so platform that
configures ro.boot.zygote_critical_window can escape the system-server
crash-loop via init fatal handler.

Bug: 146818493
Change-Id: Ib2dc253616be6935ab9ab52184a1b6394665e813
2020-10-26 11:38:01 +08:00
Jooyung Han
4f23d5a236 init: start ueventd in the default mount namespace
Init starts ueventd in the default mount namespace to support loading
firmware from APEXes.

Bug: 155023652
Test: devices boots
      adb$ nsenter -t (pid of ueventd) -m ls /apex
      => shows all APEXes
Change-Id: Ibb8b33a07eb014752275e3bca4541b8b694dc64b
2020-06-11 15:10:40 +09:00