Using vector + unordered_map to retrieve the index in
a COW op vector consumes significant memory; this
is a problem especially when there are hundreds
of thousands of operations.
Instead, just store the index into the COW op vector
during pre-processing.
On Pixel, peak memory usage when all the partitions
are mapped:
Without patch:
RssAnon: 118804 kB
With patch:
RssAnon: 55772 kB
Additionally, after the post-OTA reboot, memory usage drops
further as the partition merge completes.
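The data-structure change can be illustrated with a minimal sketch. The names below (`CowOp`, `BlockRecord`, `PreProcess`, `OpFor`) are hypothetical and are not taken from the snapuserd sources; the sketch only shows the idea of recording each op's position once during pre-processing, so that no side unordered_map from op pointer to index is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a COW operation; not the real libsnapshot type.
struct CowOp {
    uint64_t new_block;
    uint64_t source;
};

// Hypothetical per-block record built during pre-processing. Instead of
// keeping a pointer to the op and looking up its position in a separate
// unordered_map, the record carries the index into the op vector itself.
struct BlockRecord {
    uint64_t new_block;
    uint32_t op_index;  // position of the op in the COW op vector
};

// Walks the op vector once and stores each op's index alongside its block.
std::vector<BlockRecord> PreProcess(const std::vector<CowOp>& ops) {
    std::vector<BlockRecord> records;
    records.reserve(ops.size());
    for (size_t i = 0; i < ops.size(); ++i) {
        records.push_back({ops[i].new_block, static_cast<uint32_t>(i)});
    }
    return records;
}

// Retrieves the op for a record by direct indexing, with no hash lookup.
const CowOp& OpFor(const std::vector<CowOp>& ops, const BlockRecord& rec) {
    assert(rec.op_index < ops.size());
    return ops[rec.op_index];
}
```

A 32-bit index is assumed to be wide enough here; with hundreds of thousands of operations it is, and it costs far less per entry than a hash-map node.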
Bug: 237490659
Test: OTA on Pixel
Ignore-AOSP-First: cherry-pick from aosp
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Icc68a9688ceb89572821cee2dac689779f5e7c11
Fsync failures are special because they may indicate a failure of an
operation before the current operation. Report these cases as a new,
distinct error.
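As a rough illustration of the distinction, the sketch below separates a failed write from a failed fsync. The enum `StorageResult` and the function `WriteAndSync` are hypothetical names; the real storage protocol defines its own error codes, which are not reproduced here.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical result codes; the real protocol defines its own values.
enum class StorageResult {
    kOk,
    kGenericError,  // the current operation itself failed
    kSyncFailure,   // fsync failed, so an earlier write may also be lost
};

// Writes `len` bytes to `fd` and then syncs. A write failure is attributed
// to the current operation; an fsync failure is reported separately because
// it may cover data written by previous operations.
StorageResult WriteAndSync(int fd, const void* buf, size_t len) {
    assert(buf != nullptr || len == 0);
    const char* p = static_cast<const char*>(buf);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, retry
            return StorageResult::kGenericError;
        }
        if (n == 0) return StorageResult::kGenericError;
        p += n;
        len -= static_cast<size_t>(n);
    }
    if (fsync(fd) != 0) {
        return StorageResult::kSyncFailure;
    }
    return StorageResult::kOk;
}
```

A caller that receives the sync-failure code cannot assume that only the latest write is affected, which is why it is kept apart from the generic error.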
Test: Cause fsync failure and check error response
Bug: 239105007
Change-Id: Ie9d4a1949586e90006256c975786e21ced655e66
Merged-In: Ie9d4a1949586e90006256c975786e21ced655e66
(cherry picked from commit 1c75d1e3a7)
Previously we did not support STORAGE_MSG_FLAG_POST_COMMIT for anything
but RPMB operations (in which case it was a no-op). We need to support
this flag in order to store a superblock in non-secure storage, as we
need that write to commit atomically wrt all other writes.
Test: com.android.storage-unittest.nsp
Bug: 228793975
Change-Id: Ia453c1916970e0b65a91e42f18b920ac4e1f01db
Merged-In: Ia453c1916970e0b65a91e42f18b920ac4e1f01db
(cherry picked from commit 57770a5318)
This matches the behavior of incremental OTAs, where the sync
is done every 2MB. This improves performance on devices
with low memory.
Merge times for full OTA may increase by a couple of seconds, but
that is acceptable given that it decreases the memory footprint.
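A periodic sync of this kind can be sketched as a small byte counter. The class `SyncThrottle` and its callback are hypothetical; only the 2MB interval comes from the text above, and the sync callback stands in for whatever flush the daemon actually performs.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Sync interval taken from the commit message: 2MB.
constexpr uint64_t kSyncThreshold = 2 * 1024 * 1024;

// Hypothetical helper: counts bytes merged since the last sync and invokes
// the sync callback once the threshold is reached.
class SyncThrottle {
  public:
    explicit SyncThrottle(std::function<bool()> sync_fn)
        : sync_fn_(std::move(sync_fn)) {
        assert(sync_fn_);
    }

    // Called after each chunk is merged. Returns false if a sync failed.
    bool OnBytesMerged(uint64_t bytes) {
        pending_ += bytes;
        if (pending_ < kSyncThreshold) return true;
        pending_ = 0;
        return sync_fn_();
    }

    // Syncs whatever is left at the end of the merge.
    bool Finish() {
        if (pending_ == 0) return true;
        pending_ = 0;
        return sync_fn_();
    }

  private:
    std::function<bool()> sync_fn_;
    uint64_t pending_ = 0;
};
```

Syncing more often bounds the amount of dirty data held in memory at any time, at the cost of a few more flushes over the whole merge.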
Bug: 237490659
Test: OTA
Ignore-AOSP-First: cherry-pick from aosp
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Ic2c8d2ffdbdb677e0c4d44e5de68ce8ccf86df34
Currently, when the daemon is spun up, it runs at the highest
priority with nice value set to -20. This can potentially
lead to a problem in a busy system especially after OTA
reboot when all the merge threads are running in parallel.
Now that we reduced the number of merge threads in-flight
to 2, we reduce the priority as well by setting the nice
value to -5. The other threads, which serve I/Os
from dm-user (from the root filesystem), still run at higher
priority. We need this because post OTA reboot, these
threads serve I/Os until the merge is completed.
Merge threads on the other hand can run at a relatively
lower priority. We need to make sure that there
is always forward progress even in a busy system
and hence set the priority to -5 as compared
to default value of 0.
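A minimal sketch of per-thread nice values, assuming Linux, where setpriority() with PRIO_PROCESS and a thread id affects only that thread. The enum `ThreadRole` and the helper names are hypothetical; the values -20 and -5 come from the text above.

```cpp
#include <cassert>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical roles for the two kinds of threads described above.
enum class ThreadRole { kWorker, kMerge };

constexpr int kWorkerNice = -20;  // threads serving I/O from dm-user
constexpr int kMergeNice = -5;    // merge threads

constexpr int NiceFor(ThreadRole role) {
    return role == ThreadRole::kMerge ? kMergeNice : kWorkerNice;
}

// Applies the nice value for `role` to the calling thread. Setting a
// negative nice value requires CAP_SYS_NICE, so this returns false when
// run without sufficient privileges.
bool SetCurrentThreadNice(ThreadRole role) {
    const int nice_value = NiceFor(role);
    assert(nice_value >= -20 && nice_value <= 19);
    const pid_t tid = static_cast<pid_t>(syscall(SYS_gettid));
    return setpriority(PRIO_PROCESS, static_cast<id_t>(tid), nice_value) == 0;
}
```

Each thread would call the setter once at startup with its own role, so merge threads and I/O threads end up with different priorities inside the same process.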
No boot time regressions observed.
Output of NICE value of merge and worker threads post OTA reboot:
1 S 0 427 451 1 0 39 -20 64 2314640 dev_r+ ? 00:00:00 8
1 S 0 427 486 1 4 39 -20 64 2314640 dev_r+ ? 00:00:02 8
1 S 0 427 487 1 4 39 -20 64 2314640 dev_r+ ? 00:00:02 8
1 S 0 427 488 1 3 39 -20 64 2314640 dev_r+ ? 00:00:02 8
5 R 0 427 634 1 1 24 -5 64 2314640 0 ? 00:00:00 8
5 R 0 427 935 1 5 24 -5 64 2314640 0 ? 00:00:02 8
Bug: 237490659
Test: Full and incremental OTA
Ignore-AOSP-First: cherry-pick from aosp
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: I6791dd72ccd8cd5bba6eff663bb3f9598bce7ed2
Currently, there is one thread per partition
for snapshot merge. When all these threads
run in parallel, they may stress the system,
as the merge threads are both CPU and I/O bound.
Allow only two merge threads to be in-flight
at any point in time. This ensures that the
snapshot merge keeps making forward progress
while using only two cores instead of
5-6 cores.
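One way to cap the number of in-flight merges is a counting gate built on a mutex and a condition variable. This is a sketch under that assumption, not the daemon's actual mechanism; `MergeGate` and `RunMerges` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical gate that lets at most `max_in_flight` merges run at once.
class MergeGate {
  public:
    explicit MergeGate(int max_in_flight) : max_(max_in_flight) {
        assert(max_in_flight > 0);
    }

    // Blocks until a merge slot is free, then takes it.
    void Acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return active_ < max_; });
        ++active_;
        peak_ = std::max(peak_, active_);
    }

    // Gives the slot back and wakes one waiting thread.
    void Release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            --active_;
        }
        cv_.notify_one();
    }

    // Highest number of merges that held a slot at the same time.
    int peak() {
        std::lock_guard<std::mutex> lock(mutex_);
        return peak_;
    }

  private:
    const int max_;
    int active_ = 0;
    int peak_ = 0;
    std::mutex mutex_;
    std::condition_variable cv_;
};

// Still starts one thread per partition, but each thread must pass the
// gate before merging. Returns the observed peak concurrency.
int RunMerges(int num_partitions, int max_in_flight,
              const std::function<void(int)>& merge_fn) {
    MergeGate gate(max_in_flight);
    std::vector<std::thread> threads;
    for (int i = 0; i < num_partitions; ++i) {
        threads.emplace_back([&gate, &merge_fn, i] {
            gate.Acquire();
            merge_fn(i);
            gate.Release();
        });
    }
    for (auto& t : threads) t.join();
    return gate.peak();
}
```

With six partitions and a limit of two, the peak reported by the gate never exceeds two, regardless of how the scheduler interleaves the threads.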
Additionally, system and product partitions are merged
first. This is primarily because /root is mounted
off the system partition, and the faster the merge completes
on the /system partition, the sooner we can switch the dm tables.
There is no change in the merge phase
from the libsnapshot perspective. This prioritization
is applied within each merge phase. If the system partition
merge is in the second phase, then it takes priority
in that phase.
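The per-phase prioritization can be sketched as a stable sort over the partitions of a single phase. The function names and the plain partition names are hypothetical, and ranking system ahead of product is an assumption; the text above only says that both are merged first.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical ranking: lower value merges earlier. Placing system ahead
// of product is an assumption made for this sketch.
int MergePriority(const std::string& partition) {
    if (partition == "system") return 0;
    if (partition == "product") return 1;
    return 2;
}

// Orders the partitions of ONE merge phase. The phases themselves are not
// reordered; the remaining partitions keep their original relative order.
std::vector<std::string> OrderForMerge(std::vector<std::string> phase) {
    auto by_priority = [](const std::string& a, const std::string& b) {
        return MergePriority(a) < MergePriority(b);
    };
    std::stable_sort(phase.begin(), phase.end(), by_priority);
    assert(std::is_sorted(phase.begin(), phase.end(), by_priority));
    return phase;
}
```

Because the sort is applied per phase, a system partition that only appears in the second phase is moved to the front of that phase and does not jump ahead of the first phase.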
As a side benefit, this should also
reduce the memory usage when merge is in-flight
given that we now limit the threads.
There is a slight delay in overall merge time as
we now throttle the merge.
No boot time regressions observed.
Full OTA:
Merge time (Without this patch): 42 seconds
Merge time (With this patch): 46 seconds
Incremental OTA:
Merge time (Without this patch): 52 seconds
Merge time (With this patch): 57 seconds
system partition merge completes in the first ~12-16 seconds.
App-launch (COLD) on Pixel:
==========================
Baseline (after snapshot-merge is completed, when there is no daemon):
Chrome: 250
youtube: 631
camera: 230
==========================
Without this patch when snapshot-merge is in-progress (in ms):
Full OTA
Chrome: 1729
youtube: 3126
camera: 1525
==========================
With this patch when snapshot-merge is in-progress (in ms):
Full OTA
Chrome: 1061
youtube: 820
camera: 1378
Incremental OTA (350M)
Chrome: 495
youtube: 1442
camera: 641
==========================
Bug: 237490659
Ignore-AOSP-First: cherry-pick from aosp
Test: Full and incremental OTA
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: I887d5073dba88e9a8a85ac10c771e4ccee7c84ff
This reverts commit da94c7f650.
Reason for revert: It appears this change slows down boot on normal devices.
Technically, the reverted change is not necessary: it only prevented the
secondary zygote from starting and throwing an error in the 64-bit-only
zygote config. It is easier to let the secondary throw that error than
to slow down boot.
Bug: 238971179
Test: Verified that on a 64 with 32 config, the secondary zygote
Test: starts but exits.
Change-Id: I7ab0496a402db83e70168d52e5d5911b82a3b06a
Merged-In: I7ab0496a402db83e70168d52e5d5911b82a3b06a
(cherry picked from commit 3fa3f861d4)
This is part of the changes that will allow creating a single
system image in which a different set of properties determines
whether the secondary zygote starts.
Bug: 227482437
Test: Verified that secondary doesn't start with same system image
Test: with ro.zygote set to zygote64 and abilists set appropriately.
Test: Verified that secondary does not start when restarting netd.
Test: Verified that secondary does start with same system image
Test: with ro.zygote set to zygote64_32 and abilists set appropriately.
Test: Verified that secondary does start when restarting netd.
Test: Verified that a 64 bit device only starts the primary.
Test: Verified that a 32 bit device only starts the primary.
Change-Id: Id37a223c73f9a61868b2e26450ef4b6964f7b496
Merged-In: Id37a223c73f9a61868b2e26450ef4b6964f7b496
EROFS is not mandatory for Android T and below,
so skip the test for those releases.
Bug: 237765186
Test: vts_fs_test fs#ErofsSupported
Change-Id: Iceea46f8f2d443636de504962b718a2461605591
Ignore-AOSP-First: already present in aosp/master
Skip checking for userspace snapshots enabled property
for API level < T as this feature is not applicable for
GRF targets.
Bug: 236450435
Test: vts_ota_config_test
Change-Id: Ib5083f6237cdf4962aae06f166811d67cf6c385e
Ignore-AOSP-First: already present in aosp/master
If the vendor partition is on S and the system partition is on T,
certain tests in vts_libsnapshot_test used to fail. This is primarily
because of an inconsistent check between the daemon and the VTS test.
The VTS test checks the userspace.snapshots.enabled property, which is
true on T, but never checks whether the underlying vendor partition is
on S. Hence, the VTS test enables userspace snapshots. The daemon,
however, checks the vendor partition and disables userspace snapshots,
which leads to the inconsistency.
This is only a problem in the VTS tests. The underlying OTA on devices
works fine as we have the vendor partition check.
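The mismatch can be written down as two predicates. All names below are hypothetical, and having the test reuse the daemon's predicate is an assumption about how such an inconsistency is typically removed, not a description of the actual patch.

```cpp
// What the VTS test effectively checked before: the property only.
constexpr bool OldTestExpectsUserspaceSnapshots(bool property_enabled) {
    return property_enabled;
}

// What the daemon checks: the property AND that vendor is not on S.
constexpr bool DaemonUsesUserspaceSnapshots(bool property_enabled,
                                            bool vendor_on_s) {
    return property_enabled && !vendor_on_s;
}

// A consistent test-side check simply applies the same rule as the daemon.
constexpr bool TestExpectsUserspaceSnapshots(bool property_enabled,
                                             bool vendor_on_s) {
    return DaemonUsesUserspaceSnapshots(property_enabled, vendor_on_s);
}
```

With the property enabled and the vendor partition on S, the old test-side check returns true while the daemon-side check returns false, which is exactly the failing combination described above.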
Bug: 236311008
Test: vts_libsnapshot_test on S vendor and T system
vts_libsnapshot_test on T vendor and T system
Ignore-AOSP-First: cherry-pick from aosp
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Iad4f299bd2e07c9c01f5fbee6a20e2f01bf1778a
When a process is started as a native service,
oom_score_adj is set to -1000 so that the process
cannot be killed by lmkd.
During boot, the snapuserd process is not started as a service;
hence, we need to set oom_score_adj explicitly. Otherwise, in
a low-memory situation, lmkd can kill the
process, and the device will then never boot.
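Setting the score explicitly amounts to writing the value to the process's oom_score_adj file under /proc. The helper below is a hypothetical sketch rather than the code used by snapuserd; the path is a parameter only so that the function can be exercised against an ordinary file.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Value from the commit message: makes the process unkillable by lmkd.
constexpr int kUnkillableOomScoreAdj = -1000;

// Writes `value` to the given oom_score_adj file. On a real system the
// default path applies to the calling process, and lowering the score
// requires CAP_SYS_RESOURCE, so this can return false when unprivileged.
bool SetOomScoreAdj(int value,
                    const std::string& path = "/proc/self/oom_score_adj") {
    assert(value >= -1000 && value <= 1000);
    std::ofstream out(path);
    if (!out) return false;
    out << value;
    out.flush();
    return out.good();
}
```

A process that is not launched by init as a service would call this once, early in startup, before memory pressure can build up.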
Bug: 234691483
Test: th and OTA on Pixel
Ignore-AOSP-First: cherry-pick from AOSP
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Ic2c85aa470522b4bc847a16b4f5cebfc528ed3cf