From ba144fab071258a97cf3c42a0defeb0aae41a353 Mon Sep 17 00:00:00 2001
From: "Suren A. Chilingaryan"
Date: Sun, 6 Oct 2019 05:00:55 +0200
Subject: Document latest problems with docker images and resource reclamation, add docker performance checks in the monitoring scripts, helpers to filter the logs

---
 docs/problems.txt | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

(limited to 'docs/problems.txt')

diff --git a/docs/problems.txt b/docs/problems.txt
index 1d729cd..e616fe4 100644
--- a/docs/problems.txt
+++ b/docs/problems.txt
@@ -7,7 +7,9 @@ Actions Required
 Rogue network interfaces on OpenVSwitch bridge
 ==============================================
- Sometimes OpenShift fails to clean-up after terminated pod properly. The actual reason is unclear.
+ Sometimes OpenShift fails to clean up properly after a terminated pod. The actual reason is unclear, but
+ the problem becomes more severe if an extreme number of images is present in the local Docker storage.
+ Several thousand images definitely intensify this problem.
     * The issue is discussed here: https://bugzilla.redhat.com/show_bug.cgi?id=1518684
     * And can be determined by looking into:
@@ -23,7 +25,8 @@ Rogue network interfaces on OpenVSwitch bridge
     * Even if not failed, it takes several minutes to schedule the pod on the affected nodes.
  Cause:
-    * Unclear, but it seems periodic ADEI cron jobs causes the issue.
+    * Unclear, but it seems periodic ADEI cron jobs cause the issue if many images are present
+    in Docker.
     * Could be related to 'container kill failed' problem explained in the section bellow.
         Cannot kill container ###: rpc error: code = 2 desc = no such process
@@ -35,6 +38,8 @@ Rogue network interfaces on OpenVSwitch bridge
     * The simplest work-around is to just remove rogue interface. They will be re-created, but performance problems only starts after hundreds accumulate.
         ovs-vsctl del-port br0
+    * It seems helpful to purge unused Docker images to reduce the rate at which rogue interfaces appear.
+
  Status:
     * Cron job is installed which cleans rogue interfaces as they number hits 25.
-- 
cgit v1.2.3
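
The clean-up described in the Status note could be sketched roughly as follows. This is only an illustration and not the cron job that is actually installed: it assumes that rogue interfaces show up in the output of `ovs-vsctl show` with an error of the form `could not open network device <name> (No such device)`, and that the port name on the bridge equals the interface name. The function names are hypothetical; the threshold of 25 and the bridge name br0 are taken from the text above.

```python
import re
import subprocess

THRESHOLD = 25   # clean up once this many rogue interfaces accumulate (from the Status note)
BRIDGE = "br0"   # bridge name used in the work-around above

# Assumption: a rogue interface is reported by "ovs-vsctl show" with this error text.
ERROR_RE = re.compile(r"could not open network device (\S+) \(No such device\)")


def find_rogue_interfaces(show_output):
    """Return the sorted, de-duplicated names of interfaces reported as missing."""
    return sorted(set(ERROR_RE.findall(show_output)))


def ports_to_remove(show_output, threshold=THRESHOLD):
    """Return the rogue interfaces only if their number has reached the threshold."""
    rogue = find_rogue_interfaces(show_output)
    return rogue if len(rogue) >= threshold else []


def clean_rogue_interfaces(threshold=THRESHOLD, bridge=BRIDGE):
    """Query Open vSwitch and delete rogue ports from the bridge (requires root)."""
    result = subprocess.run(["ovs-vsctl", "show"], check=True,
                            capture_output=True, text=True)
    for port in ports_to_remove(result.stdout, threshold):
        # Assumption: the port name is identical to the interface name.
        subprocess.run(["ovs-vsctl", "del-port", bridge, port], check=True)
```

Calling `clean_rogue_interfaces()` periodically (for example from cron) would leave the bridge untouched until the threshold is reached. Keeping the parsing in `find_rogue_interfaces` separate from the command invocation allows checking it against saved `ovs-vsctl show` output before letting it delete anything.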