path: root/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
author: Andrew Geissler <geissonator@yahoo.com> 2018-09-27 14:26:45 -0500
committer: Brad Bishop <bradleyb@fuzziesquirrel.com> 2018-09-28 14:15:20 +0000
commit: 2074b620ef97e48fbc4aeb27a39db21500ed36cb
tree: b60be134406a97cf26e91cee81f131d6aa0ba3e3 /meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
parent: 80d844b8c68bd73caa22e991a5fd90ae759d9164
Revert "Increase StartLimitIntervalSec to 240s"

This reverts commit 342c041b108a30d2bcaee3553971a7f3ca5798e0.

These settings apply to all service starts (not just the error ones)
so this will not work because we have multiple oneshot services that
can be started multiple times if someone is powering on and off a
system quickly. It does not appear to apply to services that are
stopped by conflict but it does affect oneshot services.

Will need to spend some more time investigating this. Could give our
oneshot services some override settings but would like to see if
something more universal can be done.

Resolves openbmc/openbmc#3393

(From meta-phosphor rev: 3537e1c6eeaef3f4f96201697b8b59f69824168b)

Change-Id: Ia8ca83dd210fc82261e3296c270c18187ba5309a
Signed-off-by: Andrew Geissler <geissonator@yahoo.com>
Signed-off-by: Brad Bishop <bradleyb@fuzziesquirrel.com>

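
The effect described in the commit message can be illustrated with a small simulation. This is only a sketch: `StartLimiter` and `count_refused` are hypothetical names, the fixed-window logic is a simplified model rather than systemd's actual code, and the 60-second spacing between power cycles is an assumed scenario, not a measurement.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StartLimiter:
    """Simplified fixed-window model of a start rate limit (not systemd's code)."""

    burst: int
    interval_sec: float
    window_start: Optional[float] = None
    count: int = 0

    def try_start(self, now: float) -> bool:
        # Open a new window on the first start, or once the old window expired.
        if self.window_start is None or now - self.window_start >= self.interval_sec:
            self.window_start = now
            self.count = 0
        # Every start counts against the burst, whether or not the last run failed.
        if self.count < self.burst:
            self.count += 1
            return True
        return False


def count_refused(start_times: Iterable[float], burst: int, interval_sec: float) -> int:
    """Return how many of the requested starts the model refuses."""
    limiter = StartLimiter(burst=burst, interval_sec=interval_sec)
    return sum(not limiter.try_start(t) for t in start_times)


# Assumed scenario: a oneshot service started once per power-on,
# with the system power-cycled every 60 seconds.
power_on_times = [0, 60, 120, 180]

print(count_refused(power_on_times, burst=2, interval_sec=240))
print(count_refused(power_on_times, burst=2, interval_sec=30))
```

Under these assumptions the 240s interval refuses the third and fourth starts even though nothing failed, while the 30s interval refuses none, which matches the reason given for the revert.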
Diffstat (limited to 'meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf')
-rw-r--r-- meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf | 14
1 files changed, 5 insertions, 9 deletions
diff --git a/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf b/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
index 17c9e6bea..54516c2d4 100644
--- a/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
+++ b/meta-phosphor/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
@@ -13,23 +13,19 @@
# restarting once does the job or restarting all 5 times does not help
# and we just end up hitting the 5 limit anyway.
#
-# - Change the StartLimitIntervalSec to 240s
+# - Change the StartLimitIntervalSec to 30s
# The BMC CPU performance is already challenged. When a service is
# failing and a core dump is being generated and collected into a dump,
# it's even more challenged. Recent failures have shown situations where
# the service does not fail again until 15-20 seconds after the initial
# failure which means the default of 10s for this results in the service
-# being restarted indefinitely.
-# Another issue that has cropped up recently is that the DefaultTimeoutStartSec
-# is 90s. If a service is hitting this timeout repeatedly then there
-# is a similar issue as noted above. Because of this, the StartLimitIntervalSec
-# needs to be StartLimitBurst*DefaultTimeoutStartSec +
-# StartLimitBurst* worst case processing time (30s)
-# which currently would be 2x90 + 2x30
+# being restarted indefinitely. Change this to 30s to only allow a service
+# to be restarted StartLimitBurst times within a 30s interval before
+# being put in a permanent fail state.
#
# See systemd-system.conf(5) for details on the conf files
[Manager]
DefaultRestartSec=1s
DefaultStartLimitBurst=2
-DefaultStartLimitIntervalSec=240s
+DefaultStartLimitIntervalSec=30s
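
For reference, the 240s value removed by this revert came from the formula in the deleted comment lines: StartLimitBurst times DefaultTimeoutStartSec, plus StartLimitBurst times the worst-case processing time. A minimal sketch of that arithmetic follows; the function name `start_limit_interval_sec` is hypothetical and the inputs are the values quoted in the comment.

```python
def start_limit_interval_sec(
    burst: int, timeout_start_sec: int, worst_case_processing_sec: int
) -> int:
    """Interval long enough to cover `burst` start timeouts plus processing time."""
    return burst * timeout_start_sec + burst * worst_case_processing_sec


# Values from the removed comment: burst of 2, 90s start timeout, 30s processing.
print(start_limit_interval_sec(2, 90, 30))  # 2x90 + 2x30 = 240
```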