diff options
32 files changed, 942 insertions, 164 deletions
diff --git a/.dockerignore b/.dockerignore index 968811df5..0a70c5bfa 100644 --- a/.dockerignore +++ b/.dockerignore @@ -1,8 +1,12 @@ .* bin docs +hack +inventory test utils **/*.md *.spec +*.ini +*.txt setup* @@ -33,35 +33,6 @@ To build a container image of `openshift-ansible` using standalone **Docker**: cd openshift-ansible docker build -f images/installer/Dockerfile -t openshift-ansible . -### Building on OpenShift - -To build an openshift-ansible image using an **OpenShift** [build and image stream](https://docs.openshift.org/latest/architecture/core_concepts/builds_and_image_streams.html) the straightforward command would be: - - oc new-build registry.centos.org/openshift/playbook2image~https://github.com/openshift/openshift-ansible - -However: because the `Dockerfile` for this repository is not in the top level directory, and because we can't change the build context to the `images/installer` path as it would cause the build to fail, the `oc new-app` command above will create a build configuration using the *source to image* strategy, which is the default approach of the [playbook2image](https://github.com/openshift/playbook2image) base image. This does build an image successfully, but unfortunately the resulting image will be missing some customizations that are handled by the [Dockerfile](images/installer/Dockerfile) in this repo. - -At the time of this writing there is no straightforward option to [set the dockerfilePath](https://docs.openshift.org/latest/dev_guide/builds/build_strategies.html#dockerfile-path) of a `docker` build strategy with `oc new-build`. The alternatives to achieve this are: - -- Use the simple `oc new-build` command above to generate the BuildConfig and ImageStream objects, and then manually edit the generated build configuration to change its strategy to `dockerStrategy` and set `dockerfilePath` to `images/installer/Dockerfile`. - -- Download and pass the `Dockerfile` to `oc new-build` with the `-D` option: - -``` -curl -s https://raw.githubusercontent.com/openshift/openshift-ansible/master/images/installer/Dockerfile | - oc new-build -D - \ - --docker-image=registry.centos.org/openshift/playbook2image \ - https://github.com/openshift/openshift-ansible -``` - -Once a build is started, the progress of the build can be monitored with: - - oc logs -f bc/openshift-ansible - -Once built, the image will be visible in the Image Stream created by `oc new-app`: - - oc describe imagestream openshift-ansible - ## Build the Atomic System Container A system container runs using runC instead of Docker and it is managed diff --git a/README_CONTAINER_IMAGE.md b/README_CONTAINER_IMAGE.md index cf3b432df..a2151352d 100644 --- a/README_CONTAINER_IMAGE.md +++ b/README_CONTAINER_IMAGE.md @@ -1,6 +1,6 @@ # Containerized openshift-ansible to run playbooks -The [Dockerfile](images/installer/Dockerfile) in this repository uses the [playbook2image](https://github.com/openshift/playbook2image) source-to-image base image to containerize `openshift-ansible`. The resulting image can run any of the provided playbooks. See [BUILD.md](BUILD.md) for image build instructions. +The [Dockerfile](images/installer/Dockerfile) in this repository can be used to build a containerized `openshift-ansible`. The resulting image can run any of the provided playbooks. See [BUILD.md](BUILD.md) for image build instructions. The image is designed to **run as a non-root user**. The container's UID is mapped to the username `default` at runtime. Therefore, the container's environment reflects that user's settings, and the configuration should match that. For example `$HOME` is `/opt/app-root/src`, so ssh keys are expected to be under `/opt/app-root/src/.ssh`. If you ran a container as `root` you would have to adjust the container's configuration accordingly, e.g. by placing ssh keys under `/root/.ssh` instead. Nevertheless, the expectation is that containers will be run as non-root; for example, this container image can be run inside OpenShift under the default `restricted` [security context constraint](https://docs.openshift.org/latest/architecture/additional_concepts/authorization.html#security-context-constraints). @@ -14,8 +14,6 @@ This provides consistency with other images used by the platform and it's also a ## Usage -The `playbook2image` base image provides several options to control the behaviour of the containers. For more details on these options see the [playbook2image](https://github.com/openshift/playbook2image) documentation. - At the very least, when running a container you must specify: 1. An **inventory**. This can be a location inside the container (possibly mounted as a volume) with a path referenced via the `INVENTORY_FILE` environment variable. Alternatively you can serve the inventory file from a web server and use the `INVENTORY_URL` environment variable to fetch it, or `DYNAMIC_SCRIPT_URL` to download a script that provides a dynamic inventory. @@ -52,8 +50,6 @@ Here is a detailed explanation of the options used in the command above: Further usage examples are available in the [examples directory](examples/) with samples of how to use the image from within OpenShift. -Additional usage information for images built from `playbook2image` like this one can be found in the [playbook2image examples](https://github.com/openshift/playbook2image/tree/master/examples). - ## Running openshift-ansible as a System Container Building the System Container: See the [BUILD.md](BUILD.md). diff --git a/images/installer/Dockerfile b/images/installer/Dockerfile index 915dfe377..d03f33a1d 100644 --- a/images/installer/Dockerfile +++ b/images/installer/Dockerfile @@ -1,10 +1,18 @@ -# Using playbook2image as a base -# See https://github.com/openshift/playbook2image for details on the image -# including documentation for the settings/env vars referenced below -FROM registry.centos.org/openshift/playbook2image:latest +FROM centos:7 MAINTAINER OpenShift Team <dev@lists.openshift.redhat.com> +USER root + +# install ansible and deps +RUN INSTALL_PKGS="python-lxml pyOpenSSL python2-cryptography openssl java-1.8.0-openjdk-headless httpd-tools openssh-clients" \ + && yum install -y --setopt=tsflags=nodocs $INSTALL_PKGS \ + && EPEL_PKGS="ansible python-passlib python2-boto" \ + && yum install -y epel-release \ + && yum install -y --setopt=tsflags=nodocs $EPEL_PKGS \ + && rpm -q $INSTALL_PKGS $EPEL_PKGS \ + && yum clean all + LABEL name="openshift/origin-ansible" \ summary="OpenShift's installation and configuration tool" \ description="A containerized openshift-ansible image to let you run playbooks to install, upgrade, maintain and check an OpenShift cluster" \ @@ -12,40 +20,24 @@ LABEL name="openshift/origin-ansible" \ io.k8s.display-name="openshift-ansible" \ io.k8s.description="A containerized openshift-ansible image to let you run playbooks to install, upgrade, maintain and check an OpenShift cluster" \ io.openshift.expose-services="" \ - io.openshift.tags="openshift,install,upgrade,ansible" + io.openshift.tags="openshift,install,upgrade,ansible" \ + atomic.run="once" -USER root +ENV USER_UID=1001 \ + HOME=/opt/app-root/src \ + WORK_DIR=/usr/share/ansible/openshift-ansible \ + OPTS="-v" -# Create a symlink to /opt/app-root/src so that files under /usr/share/ansible are accessible. -# This is required since the system-container uses by default the playbook under -# /usr/share/ansible/openshift-ansible. With this change we won't need to keep two different -# configurations for the two images. -RUN mkdir -p /usr/share/ansible/ && ln -s /opt/app-root/src /usr/share/ansible/openshift-ansible +# Add image scripts and files for running as a system container +COPY images/installer/root / +# Include playbooks, roles, plugins, etc. from this repo +COPY . ${WORK_DIR} -RUN INSTALL_PKGS="skopeo openssl java-1.8.0-openjdk-headless httpd-tools" && \ - yum install -y --setopt=tsflags=nodocs $INSTALL_PKGS && \ - rpm -V $INSTALL_PKGS && \ - yum clean all +RUN /usr/local/bin/user_setup \ + && rm /usr/local/bin/usage.ocp USER ${USER_UID} -# The playbook to be run is specified via the PLAYBOOK_FILE env var. -# This sets a default of openshift_facts.yml as it's an informative playbook -# that can help test that everything is set properly (inventory, sshkeys) -ENV PLAYBOOK_FILE=playbooks/byo/openshift_facts.yml \ - OPTS="-v" \ - INSTALL_OC=true - -# playbook2image's assemble script expects the source to be available in -# /tmp/src (as per the source-to-image specs) so we import it there -ADD . /tmp/src - -# Running the 'assemble' script provided by playbook2image will install -# dependencies specified in requirements.txt and install the 'oc' client -# as per the INSTALL_OC environment setting above -RUN /usr/libexec/s2i/assemble - -# Add files for running as a system container -COPY images/installer/system-container/root / - -CMD [ "/usr/libexec/s2i/run" ] +WORKDIR ${WORK_DIR} +ENTRYPOINT [ "/usr/local/bin/entrypoint" ] +CMD [ "/usr/local/bin/run" ] diff --git a/images/installer/Dockerfile.rhel7 b/images/installer/Dockerfile.rhel7 index f861d4bcf..3110f409c 100644 --- a/images/installer/Dockerfile.rhel7 +++ b/images/installer/Dockerfile.rhel7 @@ -1,55 +1,46 @@ -FROM openshift3/playbook2image +FROM rhel7.3:7.3-released MAINTAINER OpenShift Team <dev@lists.openshift.redhat.com> -# override env vars from base image -ENV SUMMARY="OpenShift's installation and configuration tool" \ - DESCRIPTION="A containerized openshift-ansible image to let you run playbooks to install, upgrade, maintain and check an OpenShift cluster" +USER root + +# Playbooks, roles, and their dependencies are installed from packages. +RUN INSTALL_PKGS="atomic-openshift-utils atomic-openshift-clients python-boto openssl java-1.8.0-openjdk-headless httpd-tools" \ + && yum repolist > /dev/null \ + && yum-config-manager --enable rhel-7-server-ose-3.6-rpms \ + && yum-config-manager --enable rhel-7-server-rh-common-rpms \ + && yum install -y --setopt=tsflags=nodocs $INSTALL_PKGS \ + && rpm -q $INSTALL_PKGS \ + && yum clean all LABEL name="openshift3/ose-ansible" \ - summary="$SUMMARY" \ - description="$DESCRIPTION" \ + summary="OpenShift's installation and configuration tool" \ + description="A containerized openshift-ansible image to let you run playbooks to install, upgrade, maintain and check an OpenShift cluster" \ url="https://github.com/openshift/openshift-ansible" \ io.k8s.display-name="openshift-ansible" \ - io.k8s.description="$DESCRIPTION" \ + io.k8s.description="A containerized openshift-ansible image to let you run playbooks to install, upgrade, maintain and check an OpenShift cluster" \ io.openshift.expose-services="" \ io.openshift.tags="openshift,install,upgrade,ansible" \ com.redhat.component="aos3-installation-docker" \ version="v3.6.0" \ release="1" \ - architecture="x86_64" - -# Playbooks, roles and their dependencies are installed from packages. -# Unlike in Dockerfile, we don't invoke the 'assemble' script here -# because all content and dependencies (like 'oc') is already -# installed via yum. -USER root -RUN INSTALL_PKGS="atomic-openshift-utils atomic-openshift-clients python-boto skopeo openssl java-1.8.0-openjdk-headless httpd-tools" && \ - yum repolist > /dev/null && \ - yum-config-manager --enable rhel-7-server-ose-3.6-rpms && \ - yum-config-manager --enable rhel-7-server-rh-common-rpms && \ - yum install -y $INSTALL_PKGS && \ - yum clean all - -# The symlinks below are a (hopefully temporary) hack to work around the fact that this -# image is based on python s2i which uses the python27 SCL instead of system python, -# and so the system python modules we need would otherwise not be in the path. -RUN ln -s /usr/lib/python2.7/site-packages/{boto,passlib} /opt/app-root/lib64/python2.7/ - -USER ${USER_UID} + architecture="x86_64" \ + atomic.run="once" -# The playbook to be run is specified via the PLAYBOOK_FILE env var. -# This sets a default of openshift_facts.yml as it's an informative playbook -# that can help test that everything is set properly (inventory, sshkeys). -# As the playbooks are installed via packages instead of being copied to -# $APP_HOME by the 'assemble' script, we set the WORK_DIR env var to the -# location of openshift-ansible. -ENV PLAYBOOK_FILE=playbooks/byo/openshift_facts.yml \ - ANSIBLE_CONFIG=/usr/share/atomic-openshift-utils/ansible.cfg \ +ENV USER_UID=1001 \ + HOME=/opt/app-root/src \ WORK_DIR=/usr/share/ansible/openshift-ansible \ + ANSIBLE_CONFIG=/usr/share/atomic-openshift-utils/ansible.cfg \ OPTS="-v" -# Add files for running as a system container -COPY system-container/root / +# Add image scripts and files for running as a system container +COPY root / + +RUN /usr/local/bin/user_setup \ + && mv /usr/local/bin/usage{.ocp,} + +USER ${USER_UID} -CMD [ "/usr/libexec/s2i/run" ] +WORKDIR ${WORK_DIR} +ENTRYPOINT [ "/usr/local/bin/entrypoint" ] +CMD [ "/usr/local/bin/run" ] diff --git a/images/installer/system-container/README.md b/images/installer/README_CONTAINER_IMAGE.md index fbcd47c4a..bc1ebb4a8 100644 --- a/images/installer/system-container/README.md +++ b/images/installer/README_CONTAINER_IMAGE.md @@ -1,17 +1,34 @@ -# System container installer +ORIGIN-ANSIBLE IMAGE INSTALLER +=============================== + +Contains Dockerfile information for building an openshift/origin-ansible image +based on `centos:7` or `rhel7.3:7.3-released`. + +Read additional setup information for this image at: https://hub.docker.com/r/openshift/origin-ansible/ + +Read additional information about the `openshift/origin-ansible` at: https://github.com/openshift/openshift-ansible/blob/master/README_CONTAINER_IMAGE.md + +Also contains necessary components for running the installer using an Atomic System Container. + + +System container installer +========================== These files are needed to run the installer using an [Atomic System container](http://www.projectatomic.io/blog/2016/09/intro-to-system-containers/). +These files can be found under `root/exports`: * config.json.template - Template of the configuration file used for running containers. -* manifest.json - Used to define various settings for the system container, such as the default values to use for the installation. - -* run-system-container.sh - Entrypoint to the container. +* manifest.json - Used to define various settings for the system container, such as the default values to use for the installation. * service.template - Template file for the systemd service. * tmpfiles.template - Template file for systemd-tmpfiles. +These files can be found under `root/usr/local/bin`: + +* run-system-container.sh - Entrypoint to the container. + ## Options These options may be set via the ``atomic`` ``--set`` flag. For defaults see ``root/exports/manifest.json`` @@ -28,4 +45,4 @@ These options may be set via the ``atomic`` ``--set`` flag. For defaults see ``r * ANSIBLE_CONFIG - Full path for the ansible configuration file to use inside the container -* INVENTORY_FILE - Full path for the inventory to use from the host +* INVENTORY_FILE - Full path for the inventory to use from the host
\ No newline at end of file diff --git a/images/installer/system-container/root/exports/config.json.template b/images/installer/root/exports/config.json.template index 739c0080f..739c0080f 100644 --- a/images/installer/system-container/root/exports/config.json.template +++ b/images/installer/root/exports/config.json.template diff --git a/images/installer/system-container/root/exports/manifest.json b/images/installer/root/exports/manifest.json index 8b984d7a3..8b984d7a3 100644 --- a/images/installer/system-container/root/exports/manifest.json +++ b/images/installer/root/exports/manifest.json diff --git a/images/installer/system-container/root/exports/service.template b/images/installer/root/exports/service.template index bf5316af6..bf5316af6 100644 --- a/images/installer/system-container/root/exports/service.template +++ b/images/installer/root/exports/service.template diff --git a/images/installer/system-container/root/exports/tmpfiles.template b/images/installer/root/exports/tmpfiles.template index b1f6caf47..b1f6caf47 100644 --- a/images/installer/system-container/root/exports/tmpfiles.template +++ b/images/installer/root/exports/tmpfiles.template diff --git a/images/installer/root/usr/local/bin/entrypoint b/images/installer/root/usr/local/bin/entrypoint new file mode 100755 index 000000000..777bf3f11 --- /dev/null +++ b/images/installer/root/usr/local/bin/entrypoint @@ -0,0 +1,17 @@ +#!/bin/bash -e +# +# This file serves as the main entrypoint to the openshift-ansible image. +# +# For more information see the documentation: +# https://github.com/openshift/openshift-ansible/blob/master/README_CONTAINER_IMAGE.md + + +# Patch /etc/passwd file with the current user info. +# The current user's entry must be correctly defined in this file in order for +# the `ssh` command to work within the created container. + +if ! whoami &>/dev/null; then + echo "${USER:-default}:x:$(id -u):$(id -g):Default User:$HOME:/sbin/nologin" >> /etc/passwd +fi + +exec "$@" diff --git a/images/installer/root/usr/local/bin/run b/images/installer/root/usr/local/bin/run new file mode 100755 index 000000000..9401ea118 --- /dev/null +++ b/images/installer/root/usr/local/bin/run @@ -0,0 +1,46 @@ +#!/bin/bash -e +# +# This file serves as the default command to the openshift-ansible image. +# Runs a playbook with inventory as specified by environment variables. +# +# For more information see the documentation: +# https://github.com/openshift/openshift-ansible/blob/master/README_CONTAINER_IMAGE.md + +# SOURCE and HOME DIRECTORY: /opt/app-root/src + +if [[ -z "${PLAYBOOK_FILE}" ]]; then + echo + echo "PLAYBOOK_FILE must be provided." + exec /usr/local/bin/usage +fi + +INVENTORY="$(mktemp)" +if [[ -v INVENTORY_FILE ]]; then + # Make a copy so that ALLOW_ANSIBLE_CONNECTION_LOCAL below + # does not attempt to modify the original + cp -a ${INVENTORY_FILE} ${INVENTORY} +elif [[ -v INVENTORY_URL ]]; then + curl -o ${INVENTORY} ${INVENTORY_URL} +elif [[ -v DYNAMIC_SCRIPT_URL ]]; then + curl -o ${INVENTORY} ${DYNAMIC_SCRIPT_URL} + chmod 755 ${INVENTORY} +else + echo + echo "One of INVENTORY_FILE, INVENTORY_URL or DYNAMIC_SCRIPT_URL must be provided." + exec /usr/local/bin/usage +fi +INVENTORY_ARG="-i ${INVENTORY}" + +if [[ "$ALLOW_ANSIBLE_CONNECTION_LOCAL" = false ]]; then + sed -i s/ansible_connection=local// ${INVENTORY} +fi + +if [[ -v VAULT_PASS ]]; then + VAULT_PASS_FILE=.vaultpass + echo ${VAULT_PASS} > ${VAULT_PASS_FILE} + VAULT_PASS_ARG="--vault-password-file ${VAULT_PASS_FILE}" +fi + +cd ${WORK_DIR} + +exec ansible-playbook ${INVENTORY_ARG} ${VAULT_PASS_ARG} ${OPTS} ${PLAYBOOK_FILE} diff --git a/images/installer/system-container/root/usr/local/bin/run-system-container.sh b/images/installer/root/usr/local/bin/run-system-container.sh index 9ce7c7328..9ce7c7328 100755 --- a/images/installer/system-container/root/usr/local/bin/run-system-container.sh +++ b/images/installer/root/usr/local/bin/run-system-container.sh diff --git a/images/installer/root/usr/local/bin/usage b/images/installer/root/usr/local/bin/usage new file mode 100755 index 000000000..3518d7f19 --- /dev/null +++ b/images/installer/root/usr/local/bin/usage @@ -0,0 +1,33 @@ +#!/bin/bash -e +cat <<"EOF" + +The origin-ansible image provides several options to control the behaviour of the containers. +For more details on these options see the documentation: + + https://github.com/openshift/openshift-ansible/blob/master/README_CONTAINER_IMAGE.md + +At a minimum, when running a container using this image you must provide: + +* ssh keys so that Ansible can reach your hosts. These should be mounted as a volume under + /opt/app-root/src/.ssh +* An inventory file. This can be mounted inside the container as a volume and specified with the + INVENTORY_FILE environment variable. Alternatively you can serve the inventory file from a web + server and use the INVENTORY_URL environment variable to fetch it. +* The playbook to run. This is set using the PLAYBOOK_FILE environment variable. + +Here is an example of how to run a containerized origin-ansible with +the openshift_facts playbook, which collects and displays facts about your +OpenShift environment. The inventory and ssh keys are mounted as volumes +(the latter requires setting the uid in the container and SELinux label +in the key file via :Z so they can be accessed) and the PLAYBOOK_FILE +environment variable is set to point to the playbook within the image: + +docker run -tu `id -u` \ + -v $HOME/.ssh/id_rsa:/opt/app-root/src/.ssh/id_rsa:Z,ro \ + -v /etc/ansible/hosts:/tmp/inventory:Z,ro \ + -e INVENTORY_FILE=/tmp/inventory \ + -e OPTS="-v" \ + -e PLAYBOOK_FILE=playbooks/byo/openshift_facts.yml \ + openshift/origin-ansible + +EOF diff --git a/images/installer/root/usr/local/bin/usage.ocp b/images/installer/root/usr/local/bin/usage.ocp new file mode 100755 index 000000000..50593af6e --- /dev/null +++ b/images/installer/root/usr/local/bin/usage.ocp @@ -0,0 +1,33 @@ +#!/bin/bash -e +cat <<"EOF" + +The ose-ansible image provides several options to control the behaviour of the containers. +For more details on these options see the documentation: + + https://github.com/openshift/openshift-ansible/blob/master/README_CONTAINER_IMAGE.md + +At a minimum, when running a container using this image you must provide: + +* ssh keys so that Ansible can reach your hosts. These should be mounted as a volume under + /opt/app-root/src/.ssh +* An inventory file. This can be mounted inside the container as a volume and specified with the + INVENTORY_FILE environment variable. Alternatively you can serve the inventory file from a web + server and use the INVENTORY_URL environment variable to fetch it. +* The playbook to run. This is set using the PLAYBOOK_FILE environment variable. + +Here is an example of how to run a containerized ose-ansible with +the openshift_facts playbook, which collects and displays facts about your +OpenShift environment. The inventory and ssh keys are mounted as volumes +(the latter requires setting the uid in the container and SELinux label +in the key file via :Z so they can be accessed) and the PLAYBOOK_FILE +environment variable is set to point to the playbook within the image: + +docker run -tu `id -u` \ + -v $HOME/.ssh/id_rsa:/opt/app-root/src/.ssh/id_rsa:Z,ro \ + -v /etc/ansible/hosts:/tmp/inventory:Z,ro \ + -e INVENTORY_FILE=/tmp/inventory \ + -e OPTS="-v" \ + -e PLAYBOOK_FILE=playbooks/byo/openshift_facts.yml \ + openshift3/ose-ansible + +EOF diff --git a/images/installer/root/usr/local/bin/user_setup b/images/installer/root/usr/local/bin/user_setup new file mode 100755 index 000000000..b76e60a4d --- /dev/null +++ b/images/installer/root/usr/local/bin/user_setup @@ -0,0 +1,17 @@ +#!/bin/sh +set -x + +# ensure $HOME exists and is accessible by group 0 (we don't know what the runtime UID will be) +mkdir -p ${HOME} +chown ${USER_UID}:0 ${HOME} +chmod ug+rwx ${HOME} + +# runtime user will need to be able to self-insert in /etc/passwd +chmod g+rw /etc/passwd + +# ensure that the ansible content is accessible +chmod -R g+r ${WORK_DIR} +find ${WORK_DIR} -type d -exec chmod g+x {} + + +# no need for this script to remain in the image after running +rm $0 diff --git a/inventory/byo/hosts.origin.example b/inventory/byo/hosts.origin.example index 93e821ba7..6e53b4fd9 100644 --- a/inventory/byo/hosts.origin.example +++ b/inventory/byo/hosts.origin.example @@ -113,6 +113,11 @@ openshift_release=v3.6 # Downgrades are not supported and will error out. Be careful when upgrading docker from < 1.10 to > 1.10. # docker_version="1.12.1" +# Specify whether to run Docker daemon with SELinux enabled in containers. Default is True. +# Uncomment below to disable; for example if your kernel does not support the +# Docker overlay/overlay2 storage drivers with SELinux enabled. +#openshift_docker_selinux_enabled=False + # Skip upgrading Docker during an OpenShift upgrade, leaves the current Docker version alone. # docker_upgrade=False diff --git a/inventory/byo/hosts.ose.example b/inventory/byo/hosts.ose.example index 9a0dce01b..e3e9220fc 100644 --- a/inventory/byo/hosts.ose.example +++ b/inventory/byo/hosts.ose.example @@ -109,6 +109,11 @@ openshift_release=v3.6 # Default value: "--log-driver=journald" #openshift_docker_options="-l warn --ipv6=false" +# Specify whether to run Docker daemon with SELinux enabled in containers. Default is True. +# Uncomment below to disable; for example if your kernel does not support the +# Docker overlay/overlay2 storage drivers with SELinux enabled. +#openshift_docker_selinux_enabled=False + # Specify exact version of Docker to configure or upgrade to. # Downgrades are not supported and will error out. Be careful when upgrading docker from < 1.10 to > 1.10. # docker_version="1.12.1" diff --git a/playbooks/byo/openshift-checks/README.md b/playbooks/byo/openshift-checks/README.md index 4b2ff1f94..f0f14b268 100644 --- a/playbooks/byo/openshift-checks/README.md +++ b/playbooks/byo/openshift-checks/README.md @@ -39,7 +39,9 @@ against your inventory file. Here is the step-by-step: $ cd openshift-ansible ``` -2. Run the appropriate playbook: +2. Install the [dependencies](../../../README.md#setup) + +3. Run the appropriate playbook: ```console $ ansible-playbook -i <inventory file> playbooks/byo/openshift-checks/pre-install.yml @@ -57,9 +59,8 @@ against your inventory file. Here is the step-by-step: $ ansible-playbook -i <inventory file> playbooks/byo/openshift-checks/certificate_expiry/default.yaml -v ``` -## Running via Docker image +## Running in a container This repository is built into a Docker image including Ansible so that it can -be run anywhere Docker is available. Instructions for doing so may be found -[in the README](../../README_CONTAINER_IMAGE.md). - +be run anywhere Docker is available, without the need to manually install dependencies. +Instructions for doing so may be found [in the README](../../../README_CONTAINER_IMAGE.md). diff --git a/playbooks/common/openshift-cluster/openshift_hosted.yml b/playbooks/common/openshift-cluster/openshift_hosted.yml index 8d94b6509..ce7f981ab 100644 --- a/playbooks/common/openshift-cluster/openshift_hosted.yml +++ b/playbooks/common/openshift-cluster/openshift_hosted.yml @@ -26,6 +26,8 @@ logging_elasticsearch_cluster_size: "{{ openshift_hosted_logging_elasticsearch_cluster_size | default(1) }}" logging_elasticsearch_ops_cluster_size: "{{ openshift_hosted_logging_elasticsearch_ops_cluster_size | default(1) }}" roles: + - role: openshift_default_storage_class + when: openshift_cloudprovider_kind is defined and (openshift_cloudprovider_kind == 'aws' or openshift_cloudprovider_kind == 'gce') - role: openshift_hosted - role: openshift_metrics when: openshift_hosted_metrics_deploy | default(false) | bool @@ -45,8 +47,6 @@ - role: cockpit-ui when: ( openshift.common.version_gte_3_3_or_1_3 | bool ) and ( openshift_hosted_manage_registry | default(true) | bool ) and not (openshift.docker.hosted_registry_insecure | default(false) | bool) - - role: openshift_default_storage_class - when: openshift_cloudprovider_kind is defined and (openshift_cloudprovider_kind == 'aws' or openshift_cloudprovider_kind == 'gce') - name: Update master-config for publicLoggingURL hosts: oo_masters_to_config:!oo_first_master diff --git a/roles/ansible_service_broker/defaults/main.yml b/roles/ansible_service_broker/defaults/main.yml index aa1c5022b..12929b354 100644 --- a/roles/ansible_service_broker/defaults/main.yml +++ b/roles/ansible_service_broker/defaults/main.yml @@ -4,6 +4,7 @@ ansible_service_broker_remove: false ansible_service_broker_log_level: info ansible_service_broker_output_request: false ansible_service_broker_recovery: true +ansible_service_broker_bootstrap_on_startup: true # Recommended you do not enable this for now ansible_service_broker_dev_broker: false ansible_service_broker_launch_apb_on_bind: false diff --git a/roles/ansible_service_broker/tasks/install.yml b/roles/ansible_service_broker/tasks/install.yml index 65dffc89b..b3797ef96 100644 --- a/roles/ansible_service_broker/tasks/install.yml +++ b/roles/ansible_service_broker/tasks/install.yml @@ -42,6 +42,14 @@ namespace: openshift-ansible-service-broker state: present +- name: Set SA cluster-role + oc_adm_policy_user: + state: present + namespace: "openshift-ansible-service-broker" + resource_kind: cluster-role + resource_name: admin + user: "system:serviceaccount:openshift-ansible-service-broker:asb" + - name: create ansible-service-broker service oc_service: name: asb @@ -254,6 +262,7 @@ launch_apb_on_bind: {{ ansible_service_broker_launch_apb_on_bind | bool | lower }} recovery: {{ ansible_service_broker_recovery | bool | lower }} output_request: {{ ansible_service_broker_output_request | bool | lower }} + bootstrap_on_startup: {{ ansible_service_broker_bootstrap_on_startup | bool | lower }} - name: Create the Broker resource in the catalog oc_obj: diff --git a/roles/docker/tasks/package_docker.yml b/roles/docker/tasks/package_docker.yml index 5f536005d..bc52ab60c 100644 --- a/roles/docker/tasks/package_docker.yml +++ b/roles/docker/tasks/package_docker.yml @@ -93,7 +93,7 @@ dest: /etc/sysconfig/docker regexp: '^OPTIONS=.*$' line: "OPTIONS='\ - {% if ansible_selinux.status | default(None) == '''enabled''' and docker_selinux_enabled | default(true) %} --selinux-enabled {% endif %}\ + {% if ansible_selinux.status | default(None) == 'enabled' and docker_selinux_enabled | default(true) | bool %} --selinux-enabled {% endif %}\ {% if docker_log_driver is defined %} --log-driver {{ docker_log_driver }}{% endif %}\ {% if docker_log_options is defined %} {{ docker_log_options | oo_split() | oo_prepend_strings_in_list('--log-opt ') | join(' ')}}{% endif %}\ {% if docker_options is defined %} {{ docker_options }}{% endif %}\ diff --git a/roles/openshift_default_storage_class/defaults/main.yml b/roles/openshift_default_storage_class/defaults/main.yml index bda83c933..4f371fd89 100644 --- a/roles/openshift_default_storage_class/defaults/main.yml +++ b/roles/openshift_default_storage_class/defaults/main.yml @@ -12,6 +12,7 @@ openshift_storageclass_defaults: provisioner: kubernetes.io/gce-pd type: pd-standard +openshift_storageclass_default: "true" openshift_storageclass_name: "{{ openshift_storageclass_defaults[openshift_cloudprovider_kind]['name'] }}" openshift_storageclass_provisioner: "{{ openshift_storageclass_defaults[openshift_cloudprovider_kind]['provisioner'] }}" openshift_storageclass_parameters: "{{ openshift_storageclass_defaults[openshift_cloudprovider_kind]['parameters'] }}" diff --git a/roles/openshift_default_storage_class/tasks/main.yml b/roles/openshift_default_storage_class/tasks/main.yml index fd5e4fabe..82cab6746 100644 --- a/roles/openshift_default_storage_class/tasks/main.yml +++ b/roles/openshift_default_storage_class/tasks/main.yml @@ -2,9 +2,8 @@ # Install default storage classes in GCE & AWS - name: Ensure storageclass object oc_storageclass: - kind: storageclass name: "{{ openshift_storageclass_name }}" - default_storage_class: "true" + default_storage_class: "{{ openshift_storageclass_default | default('true') | string}}" parameters: type: "{{ openshift_storageclass_parameters.type | default('gp2') }}" encrypted: "{{ openshift_storageclass_parameters.encrypted | default('false') | string }}" diff --git a/roles/openshift_health_checker/library/search_journalctl.py b/roles/openshift_health_checker/library/search_journalctl.py new file mode 100644 index 000000000..3631f71c8 --- /dev/null +++ b/roles/openshift_health_checker/library/search_journalctl.py @@ -0,0 +1,150 @@ +#!/usr/bin/python +"""Interface to journalctl.""" + +from time import time +import json +import re +import subprocess + +from ansible.module_utils.basic import AnsibleModule + + +class InvalidMatcherRegexp(Exception): + """Exception class for invalid matcher regexp.""" + pass + + +class InvalidLogEntry(Exception): + """Exception class for invalid / non-json log entries.""" + pass + + +class LogInputSubprocessError(Exception): + """Exception class for errors that occur while executing a subprocess.""" + pass + + +def main(): + """Scan a given list of "log_matchers" for journalctl messages containing given patterns. + "log_matchers" is a list of dicts consisting of three keys that help fine-tune log searching: + 'start_regexp', 'regexp', and 'unit'. + + Sample "log_matchers" list: + + [ + { + 'start_regexp': r'Beginning of systemd unit', + 'regexp': r'the specific log message to find', + 'unit': 'etcd', + } + ] + """ + module = AnsibleModule( + argument_spec=dict( + log_count_limit=dict(type="int", default=500), + log_matchers=dict(type="list", required=True), + ), + ) + + timestamp_limit_seconds = time() - 60 * 60 # 1 hour + + log_count_limit = module.params["log_count_limit"] + log_matchers = module.params["log_matchers"] + + matched_regexp, errors = get_log_matches(log_matchers, log_count_limit, timestamp_limit_seconds) + + module.exit_json( + changed=False, + failed=bool(errors), + errors=errors, + matched=matched_regexp, + ) + + +def get_log_matches(matchers, log_count_limit, timestamp_limit_seconds): + """Return a list of up to log_count_limit matches for each matcher. + + Log entries are only considered if newer than timestamp_limit_seconds. + """ + matched_regexp = [] + errors = [] + + for matcher in matchers: + try: + log_output = get_log_output(matcher) + except LogInputSubprocessError as err: + errors.append(str(err)) + continue + + try: + matched = find_matches(log_output, matcher, log_count_limit, timestamp_limit_seconds) + if matched: + matched_regexp.append(matcher.get("regexp", "")) + except InvalidMatcherRegexp as err: + errors.append(str(err)) + except InvalidLogEntry as err: + errors.append(str(err)) + + return matched_regexp, errors + + +def get_log_output(matcher): + """Return an iterator on the logs of a given matcher.""" + try: + cmd_output = subprocess.Popen(list([ + '/bin/journalctl', + '-ru', matcher.get("unit", ""), + '--output', 'json', + ]), stdout=subprocess.PIPE) + + return iter(cmd_output.stdout.readline, '') + + except subprocess.CalledProcessError as exc: + msg = "Could not obtain journalctl logs for the specified systemd unit: {}: {}" + raise LogInputSubprocessError(msg.format(matcher.get("unit", "<missing>"), str(exc))) + except OSError as exc: + raise LogInputSubprocessError(str(exc)) + + +def find_matches(log_output, matcher, log_count_limit, timestamp_limit_seconds): + """Return log messages matched in iterable log_output by a given matcher. + + Ignore any log_output items older than timestamp_limit_seconds. + """ + try: + regexp = re.compile(matcher.get("regexp", "")) + start_regexp = re.compile(matcher.get("start_regexp", "")) + except re.error as err: + msg = "A log matcher object was provided with an invalid regular expression: {}" + raise InvalidMatcherRegexp(msg.format(str(err))) + + matched = None + + for log_count, line in enumerate(log_output): + if log_count >= log_count_limit: + break + + try: + obj = json.loads(line) + + # don't need to look past the most recent service restart + if start_regexp.match(obj["MESSAGE"]): + break + + log_timestamp_seconds = float(obj["__REALTIME_TIMESTAMP"]) / 1000000 + if log_timestamp_seconds < timestamp_limit_seconds: + break + + if regexp.match(obj["MESSAGE"]): + matched = line + break + + except ValueError: + msg = "Log entry for systemd unit {} contained invalid json syntax: {}" + raise InvalidLogEntry(msg.format(matcher.get("unit"), line)) + + return matched + + +if __name__ == '__main__': + main() diff --git a/roles/openshift_health_checker/openshift_checks/docker_storage.py b/roles/openshift_health_checker/openshift_checks/docker_storage.py index e80691ef3..d2227d244 100644 --- a/roles/openshift_health_checker/openshift_checks/docker_storage.py +++ b/roles/openshift_health_checker/openshift_checks/docker_storage.py @@ -1,5 +1,6 @@ """Check Docker storage driver and usage.""" import json +import os.path import re from openshift_checks import OpenShiftCheck, OpenShiftCheckException, get_var from openshift_checks.mixins import DockerHostMixin @@ -20,10 +21,27 @@ class DockerStorage(DockerHostMixin, OpenShiftCheck): storage_drivers = ["devicemapper", "overlay", "overlay2"] max_thinpool_data_usage_percent = 90.0 max_thinpool_meta_usage_percent = 90.0 + max_overlay_usage_percent = 90.0 + + # TODO(lmeyer): mention these in the output when check fails + configuration_variables = [ + ( + "max_thinpool_data_usage_percent", + "For 'devicemapper' storage driver, usage threshold percentage for data. " + "Format: float. Default: {:.1f}".format(max_thinpool_data_usage_percent), + ), + ( + "max_thinpool_meta_usage_percent", + "For 'devicemapper' storage driver, usage threshold percentage for metadata. " + "Format: float. Default: {:.1f}".format(max_thinpool_meta_usage_percent), + ), + ( + "max_overlay_usage_percent", + "For 'overlay' or 'overlay2' storage driver, usage threshold percentage. " + "Format: float. Default: {:.1f}".format(max_overlay_usage_percent), + ), + ] - # pylint: disable=too-many-return-statements - # Reason: permanent stylistic exception; - # it is clearer to return on failures and there are just many ways to fail here. def run(self, tmp, task_vars): msg, failed, changed = self.ensure_dependencies(task_vars) if failed: @@ -34,17 +52,17 @@ class DockerStorage(DockerHostMixin, OpenShiftCheck): } # attempt to get the docker info hash from the API - info = self.execute_module("docker_info", {}, task_vars=task_vars) - if info.get("failed"): + docker_info = self.execute_module("docker_info", {}, task_vars=task_vars) + if docker_info.get("failed"): return {"failed": True, "changed": changed, "msg": "Failed to query Docker API. Is docker running on this host?"} - if not info.get("info"): # this would be very strange + if not docker_info.get("info"): # this would be very strange return {"failed": True, "changed": changed, - "msg": "Docker API query missing info:\n{}".format(json.dumps(info))} - info = info["info"] + "msg": "Docker API query missing info:\n{}".format(json.dumps(docker_info))} + docker_info = docker_info["info"] # check if the storage driver we saw is valid - driver = info.get("Driver", "[NONE]") + driver = docker_info.get("Driver", "[NONE]") if driver not in self.storage_drivers: msg = ( "Detected unsupported Docker storage driver '{driver}'.\n" @@ -53,26 +71,34 @@ class DockerStorage(DockerHostMixin, OpenShiftCheck): return {"failed": True, "changed": changed, "msg": msg} # driver status info is a list of tuples; convert to dict and validate based on driver - driver_status = {item[0]: item[1] for item in info.get("DriverStatus", [])} + driver_status = {item[0]: item[1] for item in docker_info.get("DriverStatus", [])} + + result = {} + if driver == "devicemapper": - if driver_status.get("Data loop file"): - msg = ( - "Use of loopback devices with the Docker devicemapper storage driver\n" - "(the default storage configuration) is unsupported in production.\n" - "Please use docker-storage-setup to configure a backing storage volume.\n" - "See http://red.ht/2rNperO for further information." - ) - return {"failed": True, "changed": changed, "msg": msg} - result = self._check_dm_usage(driver_status, task_vars) - result['changed'] = result.get('changed', False) or changed - return result + result = self.check_devicemapper_support(driver_status, task_vars) - # TODO(lmeyer): determine how to check usage for overlay2 + if driver in ['overlay', 'overlay2']: + result = self.check_overlay_support(docker_info, driver_status, task_vars) - return {"changed": changed} + result['changed'] = result.get('changed', False) or changed + return result - def _check_dm_usage(self, driver_status, task_vars): - """ + def check_devicemapper_support(self, driver_status, task_vars): + """Check if dm storage driver is supported as configured. Return: result dict.""" + if driver_status.get("Data loop file"): + msg = ( + "Use of loopback devices with the Docker devicemapper storage driver\n" + "(the default storage configuration) is unsupported in production.\n" + "Please use docker-storage-setup to configure a backing storage volume.\n" + "See http://red.ht/2rNperO for further information." + ) + return {"failed": True, "msg": msg} + result = self.check_dm_usage(driver_status, task_vars) + return result + + def check_dm_usage(self, driver_status, task_vars): + """Check usage thresholds for Docker dm storage driver. Return: result dict. Backing assumptions: We expect devicemapper to be backed by an auto-expanding thin pool implemented as an LV in an LVM2 VG. This is how docker-storage-setup currently configures devicemapper storage. The LV is "thin" because it does not use all available storage @@ -83,7 +109,7 @@ class DockerStorage(DockerHostMixin, OpenShiftCheck): could run out of space first; so we check both. """ vals = dict( - vg_free=self._get_vg_free(driver_status.get("Pool Name"), task_vars), + vg_free=self.get_vg_free(driver_status.get("Pool Name"), task_vars), data_used=driver_status.get("Data Space Used"), data_total=driver_status.get("Data Space Total"), metadata_used=driver_status.get("Metadata Space Used"), @@ -93,7 +119,7 @@ class DockerStorage(DockerHostMixin, OpenShiftCheck): # convert all human-readable strings to bytes for key, value in vals.copy().items(): try: - vals[key + "_bytes"] = self._convert_to_bytes(value) + vals[key + "_bytes"] = self.convert_to_bytes(value) except ValueError as err: # unlikely to hit this from API info, but just to be safe return { "failed": True, @@ -131,10 +157,12 @@ class DockerStorage(DockerHostMixin, OpenShiftCheck): vals["msg"] = "\n".join(messages or ["Thinpool usage is within thresholds."]) return vals - def _get_vg_free(self, pool, task_vars): - # Determine which VG to examine according to the pool name, the only indicator currently - # available from the Docker API driver info. We assume a name that looks like - # "vg--name-docker--pool"; vg and lv names with inner hyphens doubled, joined by a hyphen. + def get_vg_free(self, pool, task_vars): + """Determine which VG to examine according to the pool name. Return: size vgs reports. + Pool name is the only indicator currently available from the Docker API driver info. + We assume a name that looks like "vg--name-docker--pool"; + vg and lv names with inner hyphens doubled, joined by a hyphen. + """ match = re.match(r'((?:[^-]|--)+)-(?!-)', pool) # matches up to the first single hyphen if not match: # unlikely, but... be clear if we assumed wrong raise OpenShiftCheckException( @@ -163,7 +191,8 @@ class DockerStorage(DockerHostMixin, OpenShiftCheck): return size @staticmethod - def _convert_to_bytes(string): + def convert_to_bytes(string): + """Convert string like "10.3 G" to bytes (binary units assumed). Return: float bytes.""" units = dict( b=1, k=1024, @@ -183,3 +212,87 @@ class DockerStorage(DockerHostMixin, OpenShiftCheck): raise ValueError("Cannot convert to a byte size: " + string) return float(number) * multiplier + + def check_overlay_support(self, docker_info, driver_status, task_vars): + """Check if overlay storage driver is supported for this host. Return: result dict.""" + # check for xfs as backing store + backing_fs = driver_status.get("Backing Filesystem", "[NONE]") + if backing_fs != "xfs": + msg = ( + "Docker storage drivers 'overlay' and 'overlay2' are only supported with\n" + "'xfs' as the backing storage, but this host's storage is type '{fs}'." + ).format(fs=backing_fs) + return {"failed": True, "msg": msg} + + # check support for OS and kernel version + o_s = docker_info.get("OperatingSystem", "[NONE]") + if "Red Hat Enterprise Linux" in o_s or "CentOS" in o_s: + # keep it simple, only check enterprise kernel versions; assume everyone else is good + kernel = docker_info.get("KernelVersion", "[NONE]") + kernel_arr = [int(num) for num in re.findall(r'\d+', kernel)] + if kernel_arr < [3, 10, 0, 514]: # rhel < 7.3 + msg = ( + "Docker storage drivers 'overlay' and 'overlay2' are only supported beginning with\n" + "kernel version 3.10.0-514; but Docker reports kernel version {version}." + ).format(version=kernel) + return {"failed": True, "msg": msg} + # NOTE: we could check for --selinux-enabled here but docker won't even start with + # that option until it's supported in the kernel so we don't need to. + + return self.check_overlay_usage(docker_info, task_vars) + + def check_overlay_usage(self, docker_info, task_vars): + """Check disk usage on OverlayFS backing store volume. Return: result dict.""" + path = docker_info.get("DockerRootDir", "/var/lib/docker") + "/" + docker_info["Driver"] + + threshold = get_var(task_vars, "max_overlay_usage_percent", default=self.max_overlay_usage_percent) + try: + threshold = float(threshold) + except ValueError: + return { + "failed": True, + "msg": "Specified 'max_overlay_usage_percent' is not a percentage: {}".format(threshold), + } + + mount = self.find_ansible_mount(path, get_var(task_vars, "ansible_mounts")) + try: + free_bytes = mount['size_available'] + total_bytes = mount['size_total'] + usage = 100.0 * (total_bytes - free_bytes) / total_bytes + except (KeyError, ZeroDivisionError): + return { + "failed": True, + "msg": "The ansible_mount found for path {} is invalid.\n" + "This is likely to be an Ansible bug. The record was:\n" + "{}".format(path, json.dumps(mount, indent=2)), + } + + if usage > threshold: + return { + "failed": True, + "msg": ( + "For Docker OverlayFS mount point {path},\n" + "usage percentage {pct:.1f} is higher than threshold {thresh:.1f}." + ).format(path=mount["mount"], pct=usage, thresh=threshold) + } + + return {} + + # TODO(lmeyer): migrate to base class + @staticmethod + def find_ansible_mount(path, ansible_mounts): + """Return the mount point for path from ansible_mounts.""" + + mount_for_path = {mount['mount']: mount for mount in ansible_mounts} + mount_point = path + while mount_point not in mount_for_path: + if mount_point in ["/", ""]: # "/" not in ansible_mounts??? + break + mount_point = os.path.dirname(mount_point) + + try: + return mount_for_path[mount_point] + except KeyError: + known_mounts = ', '.join('"{}"'.format(mount) for mount in sorted(mount_for_path)) or 'none' + msg = 'Unable to determine mount point for path "{}". Known mount points: {}.' + raise OpenShiftCheckException(msg.format(path, known_mounts)) diff --git a/roles/openshift_health_checker/openshift_checks/etcd_traffic.py b/roles/openshift_health_checker/openshift_checks/etcd_traffic.py new file mode 100644 index 000000000..40c87873d --- /dev/null +++ b/roles/openshift_health_checker/openshift_checks/etcd_traffic.py @@ -0,0 +1,47 @@ +"""Check that scans journalctl for messages caused as a symptom of increased etcd traffic.""" + +from openshift_checks import OpenShiftCheck, get_var + + +class EtcdTraffic(OpenShiftCheck): + """Check if host is being affected by an increase in etcd traffic.""" + + name = "etcd_traffic" + tags = ["health", "etcd"] + + @classmethod + def is_active(cls, task_vars): + """Skip hosts that do not have etcd in their group names.""" + group_names = get_var(task_vars, "group_names", default=[]) + valid_group_names = "etcd" in group_names + + version = get_var(task_vars, "openshift", "common", "short_version") + valid_version = version in ("3.4", "3.5", "1.4", "1.5") + + return super(EtcdTraffic, cls).is_active(task_vars) and valid_group_names and valid_version + + def run(self, tmp, task_vars): + is_containerized = get_var(task_vars, "openshift", "common", "is_containerized") + unit = "etcd_container" if is_containerized else "etcd" + + log_matchers = [{ + "start_regexp": r"Starting Etcd Server", + "regexp": r"etcd: sync duration of [^,]+, expected less than 1s", + "unit": unit + }] + + match = self.execute_module("search_journalctl", { + "log_matchers": log_matchers, + }, task_vars) + + if match.get("matched"): + msg = ("Higher than normal etcd traffic detected.\n" + "OpenShift 3.4 introduced an increase in etcd traffic.\n" + "Upgrading to OpenShift 3.6 is recommended in order to fix this issue.\n" + "Please refer to https://access.redhat.com/solutions/2916381 for more information.") + return {"failed": True, "msg": msg} + + if match.get("failed"): + return {"failed": True, "msg": "\n".join(match.get("errors"))} + + return {} diff --git a/roles/openshift_health_checker/test/docker_storage_test.py b/roles/openshift_health_checker/test/docker_storage_test.py index bb25e3f66..99c529054 100644 --- a/roles/openshift_health_checker/test/docker_storage_test.py +++ b/roles/openshift_health_checker/test/docker_storage_test.py @@ -23,7 +23,8 @@ def test_is_active(is_containerized, group_names, is_active): assert DockerStorage.is_active(task_vars=task_vars) == is_active -non_atomic_task_vars = {"openshift": {"common": {"is_atomic": False}}} +def non_atomic_task_vars(): + return {"openshift": {"common": {"is_atomic": False}}} @pytest.mark.parametrize('docker_info, failed, expect_msg', [ @@ -56,7 +57,7 @@ non_atomic_task_vars = {"openshift": {"common": {"is_atomic": False}}} ( dict(info={ "Driver": "overlay2", - "DriverStatus": [] + "DriverStatus": [("Backing Filesystem", "xfs")], }), False, [], @@ -64,6 +65,27 @@ non_atomic_task_vars = {"openshift": {"common": {"is_atomic": False}}} ( dict(info={ "Driver": "overlay", + "DriverStatus": [("Backing Filesystem", "btrfs")], + }), + True, + ["storage is type 'btrfs'", "only supported with\n'xfs'"], + ), + ( + dict(info={ + "Driver": "overlay2", + "DriverStatus": [("Backing Filesystem", "xfs")], + "OperatingSystem": "Red Hat Enterprise Linux Server release 7.2 (Maipo)", + "KernelVersion": "3.10.0-327.22.2.el7.x86_64", + }), + True, + ["Docker reports kernel version 3.10.0-327"], + ), + ( + dict(info={ + "Driver": "overlay", + "DriverStatus": [("Backing Filesystem", "xfs")], + "OperatingSystem": "CentOS", + "KernelVersion": "3.10.0-514", }), False, [], @@ -85,8 +107,9 @@ def test_check_storage_driver(docker_info, failed, expect_msg): return docker_info check = dummy_check(execute_module=execute_module) - check._check_dm_usage = lambda status, task_vars: dict() # stub out for this test - result = check.run(tmp=None, task_vars=non_atomic_task_vars) + check.check_dm_usage = lambda status, task_vars: dict() # stub out for this test + check.check_overlay_usage = lambda info, task_vars: dict() # stub out for this test + result = check.run(tmp=None, task_vars=non_atomic_task_vars()) if failed: assert result["failed"] @@ -146,8 +169,8 @@ not_enough_space = { ]) def test_dm_usage(task_vars, driver_status, vg_free, success, expect_msg): check = dummy_check() - check._get_vg_free = lambda pool, task_vars: vg_free - result = check._check_dm_usage(driver_status, task_vars) + check.get_vg_free = lambda pool, task_vars: vg_free + result = check.check_dm_usage(driver_status, task_vars) result_success = not result.get("failed") assert result_success is success @@ -195,10 +218,10 @@ def test_vg_free(pool, command_returns, raises, returns): check = dummy_check(execute_module=execute_module) if raises: with pytest.raises(OpenShiftCheckException) as err: - check._get_vg_free(pool, {}) + check.get_vg_free(pool, {}) assert raises in str(err.value) else: - ret = check._get_vg_free(pool, {}) + ret = check.get_vg_free(pool, {}) assert ret == returns @@ -209,7 +232,7 @@ def test_vg_free(pool, command_returns, raises, returns): ("12g", 12.0 * 1024**3), ]) def test_convert_to_bytes(string, expect_bytes): - got = DockerStorage._convert_to_bytes(string) + got = DockerStorage.convert_to_bytes(string) assert got == expect_bytes @@ -219,6 +242,70 @@ def test_convert_to_bytes(string, expect_bytes): ]) def test_convert_to_bytes_error(string): with pytest.raises(ValueError) as err: - DockerStorage._convert_to_bytes(string) + DockerStorage.convert_to_bytes(string) assert "Cannot convert" in str(err.value) assert string in str(err.value) + + +ansible_mounts_enough = [{ + 'mount': '/var/lib/docker', + 'size_available': 50 * 10**9, + 'size_total': 50 * 10**9, +}] +ansible_mounts_not_enough = [{ + 'mount': '/var/lib/docker', + 'size_available': 0, + 'size_total': 50 * 10**9, +}] +ansible_mounts_missing_fields = [dict(mount='/var/lib/docker')] +ansible_mounts_zero_size = [{ + 'mount': '/var/lib/docker', + 'size_available': 0, + 'size_total': 0, +}] + + +@pytest.mark.parametrize('ansible_mounts, threshold, expect_fail, expect_msg', [ + ( + ansible_mounts_enough, + None, + False, + [], + ), + ( + ansible_mounts_not_enough, + None, + True, + ["usage percentage", "higher than threshold"], + ), + ( + ansible_mounts_not_enough, + "bogus percent", + True, + ["is not a percentage"], + ), + ( + ansible_mounts_missing_fields, + None, + True, + ["Ansible bug"], + ), + ( + ansible_mounts_zero_size, + None, + True, + ["Ansible bug"], + ), +]) +def test_overlay_usage(ansible_mounts, threshold, expect_fail, expect_msg): + check = dummy_check() + task_vars = non_atomic_task_vars() + task_vars["ansible_mounts"] = ansible_mounts + if threshold is not None: + task_vars["max_overlay_usage_percent"] = threshold + docker_info = dict(DockerRootDir="/var/lib/docker", Driver="overlay") + result = check.check_overlay_usage(docker_info, task_vars) + + assert expect_fail == bool(result.get("failed")) + for msg in expect_msg: + assert msg in result["msg"] diff --git a/roles/openshift_health_checker/test/etcd_traffic_test.py b/roles/openshift_health_checker/test/etcd_traffic_test.py new file mode 100644 index 000000000..287175e29 --- /dev/null +++ b/roles/openshift_health_checker/test/etcd_traffic_test.py @@ -0,0 +1,80 @@ +import pytest + +from openshift_checks.etcd_traffic import EtcdTraffic + + +@pytest.mark.parametrize('group_names,version,is_active', [ + (['masters'], "3.5", False), + (['masters'], "3.6", False), + (['nodes'], "3.4", False), + (['etcd'], "3.4", True), + (['etcd'], "3.5", True), + (['etcd'], "3.1", False), + (['masters', 'nodes'], "3.5", False), + (['masters', 'etcd'], "3.5", True), + ([], "3.4", False), +]) +def test_is_active(group_names, version, is_active): + task_vars = dict( + group_names=group_names, + openshift=dict( + common=dict(short_version=version), + ), + ) + assert EtcdTraffic.is_active(task_vars=task_vars) == is_active + + +@pytest.mark.parametrize('group_names,matched,failed,extra_words', [ + (["masters"], True, True, ["Higher than normal", "traffic"]), + (["masters", "etcd"], False, False, []), + (["etcd"], False, False, []), +]) +def test_log_matches_high_traffic_msg(group_names, matched, failed, extra_words): + def execute_module(module_name, args, task_vars): + return { + "matched": matched, + "failed": failed, + } + + task_vars = dict( + group_names=group_names, + openshift=dict( + common=dict(service_type="origin", is_containerized=False), + ) + ) + + check = EtcdTraffic(execute_module=execute_module) + result = check.run(tmp=None, task_vars=task_vars) + + for word in extra_words: + assert word in result.get("msg", "") + + assert result.get("failed", False) == failed + + +@pytest.mark.parametrize('is_containerized,expected_unit_value', [ + (False, "etcd"), + (True, "etcd_container"), +]) +def test_systemd_unit_matches_deployment_type(is_containerized, expected_unit_value): + task_vars = dict( + openshift=dict( + common=dict(is_containerized=is_containerized), + ) + ) + + def execute_module(module_name, args, task_vars): + assert module_name == "search_journalctl" + matchers = args["log_matchers"] + + for matcher in matchers: + assert matcher["unit"] == expected_unit_value + + return {"failed": False} + + check = EtcdTraffic(execute_module=execute_module) + check.run(tmp=None, task_vars=task_vars) + + +def fake_execute_module(*args): + raise AssertionError('this function should not be called') diff --git a/roles/openshift_health_checker/test/search_journalctl_test.py b/roles/openshift_health_checker/test/search_journalctl_test.py new file mode 100644 index 000000000..724928aa1 --- /dev/null +++ b/roles/openshift_health_checker/test/search_journalctl_test.py @@ -0,0 +1,157 @@ +import pytest +import search_journalctl + + +def canned_search_journalctl(get_log_output=None): + """Create a search_journalctl object with canned get_log_output method""" + module = search_journalctl + if get_log_output: + module.get_log_output = get_log_output + return module + + +DEFAULT_TIMESTAMP = 1496341364 + + +def get_timestamp(modifier=0): + return DEFAULT_TIMESTAMP + modifier + + +def get_timestamp_microseconds(modifier=0): + return get_timestamp(modifier) * 1000000 + + +def create_test_log_object(stamp, msg): + return '{{"__REALTIME_TIMESTAMP": "{}", "MESSAGE": "{}"}}'.format(stamp, msg) + + +@pytest.mark.parametrize('name,matchers,log_input,expected_matches,expected_errors', [ + ( + 'test with valid params', + [ + { + "start_regexp": r"Sample Logs Beginning", + "regexp": r"test log message", + "unit": "test", + }, + ], + [ + create_test_log_object(get_timestamp_microseconds(), "test log message"), + create_test_log_object(get_timestamp_microseconds(), "Sample Logs Beginning"), + ], + ["test log message"], + [], + ), + ( + 'test with invalid json in log input', + [ + { + "start_regexp": r"Sample Logs Beginning", + "regexp": r"test log message", + "unit": "test-unit", + }, + ], + [ + '{__REALTIME_TIMESTAMP: ' + str(get_timestamp_microseconds()) + ', "MESSAGE": "test log message"}', + ], + [], + [ + ["invalid json", "test-unit", "test log message"], + ], + ), + ( + 'test with invalid regexp', + [ + { + "start_regexp": r"Sample Logs Beginning", + "regexp": r"test [ log message", + "unit": "test", + }, + ], + [ + create_test_log_object(get_timestamp_microseconds(), "test log message"), + create_test_log_object(get_timestamp_microseconds(), "sample log message"), + create_test_log_object(get_timestamp_microseconds(), "fake log message"), + create_test_log_object(get_timestamp_microseconds(), "dummy log message"), + create_test_log_object(get_timestamp_microseconds(), "Sample Logs Beginning"), + ], + [], + [ + ["invalid regular expression"], + ], + ), +], ids=lambda argval: argval[0]) +def test_get_log_matches(name, matchers, log_input, expected_matches, expected_errors): + def get_log_output(matcher): + return log_input + + module = canned_search_journalctl(get_log_output) + matched_regexp, errors = module.get_log_matches(matchers, 500, 60 * 60) + + assert set(matched_regexp) == set(expected_matches) + assert len(expected_errors) == len(errors) + + for idx, partial_err_set in enumerate(expected_errors): + for partial_err_msg in partial_err_set: + assert partial_err_msg in errors[idx] + + +@pytest.mark.parametrize('name,matcher,log_count_lim,stamp_lim_seconds,log_input,expected_match', [ + ( + 'test with matching log message, but out of bounds of log_count_lim', + { + "start_regexp": r"Sample Logs Beginning", + "regexp": r"dummy log message", + "unit": "test", + }, + 3, + get_timestamp(-100 * 60 * 60), + [ + create_test_log_object(get_timestamp_microseconds(), "test log message"), + create_test_log_object(get_timestamp_microseconds(), "sample log message"), + create_test_log_object(get_timestamp_microseconds(), "fake log message"), + create_test_log_object(get_timestamp_microseconds(), "dummy log message"), + create_test_log_object(get_timestamp_microseconds(), "Sample Logs Beginning"), + ], + None, + ), + ( + 'test with matching log message, but with timestamp too old', + { + "start_regexp": r"Sample Logs Beginning", + "regexp": r"dummy log message", + "unit": "test", + }, + 100, + get_timestamp(-10), + [ + create_test_log_object(get_timestamp_microseconds(), "test log message"), + create_test_log_object(get_timestamp_microseconds(), "sample log message"), + create_test_log_object(get_timestamp_microseconds(), "fake log message"), + create_test_log_object(get_timestamp_microseconds(-1000), "dummy log message"), + create_test_log_object(get_timestamp_microseconds(-1000), "Sample Logs Beginning"), + ], + None, + ), + ( + 'test with matching log message, and timestamp within time limit', + { + "start_regexp": r"Sample Logs Beginning", + "regexp": r"dummy log message", + "unit": "test", + }, + 100, + get_timestamp(-1010), + [ + create_test_log_object(get_timestamp_microseconds(), "test log message"), + create_test_log_object(get_timestamp_microseconds(), "sample log message"), + create_test_log_object(get_timestamp_microseconds(), "fake log message"), + create_test_log_object(get_timestamp_microseconds(-1000), "dummy log message"), + create_test_log_object(get_timestamp_microseconds(-1000), "Sample Logs Beginning"), + ], + create_test_log_object(get_timestamp_microseconds(-1000), "dummy log message"), + ), +], ids=lambda argval: argval[0]) +def test_find_matches_skips_logs(name, matcher, log_count_lim, stamp_lim_seconds, log_input, expected_match): + match = search_journalctl.find_matches(log_input, matcher, log_count_lim, stamp_lim_seconds) + assert match == expected_match diff --git a/roles/openshift_service_catalog/files/kubeservicecatalog_roles_bindings.yml b/roles/openshift_service_catalog/files/kubeservicecatalog_roles_bindings.yml index bcc7fb590..c30c15778 100644 --- a/roles/openshift_service_catalog/files/kubeservicecatalog_roles_bindings.yml +++ b/roles/openshift_service_catalog/files/kubeservicecatalog_roles_bindings.yml @@ -137,6 +137,12 @@ objects: - serviceclasses verbs: - create + - delete + - update + - patch + - get + - list + - watch - apiGroups: - settings.k8s.io resources: |