author | OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com> | 2018-01-10 17:58:44 -0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2018-01-10 17:58:44 -0800 |
commit | 693769209936849a6f83c4ef85bda39dabfb8800 (patch) | |
tree | b3e17a9559c7dea02accb8e75865aad7eee3f764 /inventory | |
parent | e45ef801051202f9d79a0dc814d4a3e056b257d2 (diff) | |
parent | 0841917f05cfad2701164edbb271167c277d3300 (diff) | |
Merge pull request #5080 from sdodson/drain-timeouts
Automatic merge from submit-queue.
Add the ability to specify a timeout for node drain operations
A timeout to wait for nodes to drain pods can be specified to ensure that the upgrade continues even if nodes fail to drain pods in the allowed time. The default value of 0 waits indefinitely, which allows the admin to investigate the root cause and ensures that disruption budgets are respected. In practice the `oc adm drain` command eventually errors out; at least that is what we have seen in our large online clusters. When that happens, a second attempt is made to drain the node. If that attempt also fails, the upgrade is aborted for that node or for the entire cluster, depending on your defined `openshift_upgrade_nodes_max_fail_percentage`.
`openshift_upgrade_nodes_drain_timeout=0` is the default and waits until all pods have been drained successfully.
`openshift_upgrade_nodes_drain_timeout=600` would wait for 600s before moving on to the tasks that forcefully stop pods, such as stopping docker, node, and openvswitch.
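As a rough sketch of how the new variable might be set alongside the existing failure threshold, assuming both live under the `[OSEv3:vars]` group as the other settings in `inventory/hosts.example` do. The values 600 and 50 are illustrative only, not recommendations:

```ini
[OSEv3:vars]
# Illustrative value: give each node up to 600 seconds to drain its pods
# before the upgrade moves on to stopping docker, node, and openvswitch.
# Leaving this at 0 (the default) waits indefinitely instead.
openshift_upgrade_nodes_drain_timeout=600

# Illustrative value, taken from the existing example in hosts.example:
# the threshold that decides whether a failure aborts the upgrade for
# one node or for the entire cluster.
openshift_upgrade_nodes_max_fail_percentage=50
```

A non-zero timeout trades disruption-budget guarantees for forward progress, so it probably suits clusters where a stalled upgrade costs more than a forced pod stop.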
Diffstat (limited to 'inventory')
-rw-r--r-- | inventory/hosts.example | 8 |
1 files changed, 8 insertions, 0 deletions
diff --git a/inventory/hosts.example b/inventory/hosts.example
index d786146fc..9064dc683 100644
--- a/inventory/hosts.example
+++ b/inventory/hosts.example
@@ -1005,6 +1005,14 @@ openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true',
 # where as this would not
 # openshift_upgrade_nodes_serial=4 openshift_upgrade_nodes_max_fail_percentage=50
 #
+# A timeout to wait for nodes to drain pods can be specified to ensure that the
+# upgrade continues even if nodes fail to drain pods in the allowed time. The
+# default value of 0 will wait indefinitely allowing the admin to investigate
+# the root cause and ensuring that disruption budgets are respected. If the
+# a timeout of 0 is used there will also be one attempt to re-try draining the
+# node. If a non zero timeout is specified there will be no attempt to retry.
+#openshift_upgrade_nodes_drain_timeout=0
+#
 # Multiple data migrations take place and if they fail they will fail the upgrade
 # You may wish to disable these or make them non fatal
 #