OpenShift Prometheus
====================

OpenShift Prometheus Installation

Requirements
------------


Role Variables
--------------

For default values, see [`defaults/main.yaml`](defaults/main.yaml).

- `openshift_prometheus_state`: `present` installs or updates the components; `absent` uninstalls them.

- `openshift_prometheus_namespace`: project (i.e. namespace) where the components will be
  deployed.

- `openshift_prometheus_node_selector`: Selector for the nodes prometheus will be deployed on.

- `openshift_prometheus_<COMPONENT>_image_prefix`: image prefix for the component.

- `openshift_prometheus_<COMPONENT>_image_version`: image version for the component.

- `openshift_prometheus_args`: modify or add arguments passed to the prometheus application.

- `openshift_prometheus_hostname`: hostname for the route to prometheus, for example `prometheus-{{openshift_prometheus_namespace}}.{{openshift_master_default_subdomain}}`.

- `openshift_prometheus_alerts_hostname`: hostname for the route to prometheus-alerts, for example `prometheus_alerts-{{openshift_prometheus_namespace}}.{{openshift_master_default_subdomain}}`.

For example, to set `openshift_prometheus_args`:
```
openshift_prometheus_args=['--storage.tsdb.retention=6h', '--storage.tsdb.min-block-duration=5s', '--storage.tsdb.max-block-duration=6m']
```
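
As a rough sketch, the variables above could also be set in YAML form, for instance in a group variables file. Every value below is an illustrative assumption rather than one of the role's defaults, and `alertmanager` is used in place of `<COMPONENT>` purely to show the naming pattern:

```
# Hypothetical values for illustration only; see defaults/main.yaml for the real defaults.
openshift_prometheus_state: present
openshift_prometheus_namespace: openshift-metrics
openshift_prometheus_node_selector:
  region: infra

# <COMPONENT> replaced with alertmanager, following the pattern described above.
# The registry prefix and version tag are placeholders.
openshift_prometheus_alertmanager_image_prefix: "registry.example.com/openshift/"
openshift_prometheus_alertmanager_image_version: "v0.0.0"

openshift_prometheus_args:
  - '--storage.tsdb.retention=6h'
```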

## PVC related variables
Each prometheus component (prometheus, alertmanager, alertbuffer) can request a persistent volume claim (PVC) by setting the corresponding role variables:
```
openshift_prometheus_<COMPONENT>_storage_type: <VALUE> (pvc, emptydir)
openshift_prometheus_<COMPONENT>_storage_class: <VALUE>
openshift_prometheus_<COMPONENT>_pvc_(name|size|access_modes|pv_selector): <VALUE>
```
e.g.
```
openshift_prometheus_storage_type: pvc
openshift_prometheus_storage_class: glusterfs-storage
openshift_prometheus_alertmanager_pvc_name: alertmanager
openshift_prometheus_alertbuffer_pvc_size: 10G
openshift_prometheus_pvc_access_modes: [ReadWriteOnce]
```

## NFS PV Storage variables
Each prometheus component (prometheus, alertmanager, alertbuffer) can use an NFS persistent volume (PV) by setting the corresponding variables:
```
openshift_prometheus_<COMPONENT>_storage_kind=<VALUE>
openshift_prometheus_<COMPONENT>_storage_(access_modes|host|labels)=<VALUE>
openshift_prometheus_<COMPONENT>_storage_volume_(name|size)=<VALUE>
openshift_prometheus_<COMPONENT>_storage_nfs_(directory|options)=<VALUE>
```
e.g.
```
openshift_prometheus_storage_kind=nfs
openshift_prometheus_storage_access_modes=['ReadWriteOnce']
openshift_prometheus_storage_host=nfs.example.com #for external host
openshift_prometheus_storage_nfs_directory=/exports
openshift_prometheus_alertmanager_storage_nfs_options='*(rw,root_squash)'
openshift_prometheus_storage_volume_name=prometheus
openshift_prometheus_alertbuffer_storage_volume_size=10Gi
openshift_prometheus_storage_labels={'storage': 'prometheus'}
```

NOTE: Setting `openshift_prometheus_<COMPONENT>_storage_labels` overrides `openshift_prometheus_<COMPONENT>_pvc_pv_selector`


## Additional Alert Rules file variable
An external file with alert rules can be added by setting the following variable to the path of that file:
```
openshift_prometheus_additional_rules_file: <PATH> 
```

The file content should be in the prometheus alert rules format.
The following example defines a rule that fires an alert when one of the cluster nodes is down:

```
groups:
- name: example-rules
  interval: 30s # defaults to global interval
  rules:
  - alert: Node Down
    expr: up{job="kubernetes-nodes"} == 0
    annotations:
      miqTarget: "ContainerNode"
      severity: "HIGH"
      message: "{{ '{{' }}{{ '$labels.instance' }}{{ '}}' }} is down"
```
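
A minimal sketch of passing such a rules file to the role from a playbook might look like the following. The path is a placeholder, and setting the variable through play-level `vars` is just one option; an inventory or group variables file would work equally well:

```
- name: Configure openshift-prometheus with additional alert rules
  hosts: oo_first_master
  vars:
    # Placeholder path; point this at your own rules file.
    openshift_prometheus_additional_rules_file: /path/to/additional-rules.yml
  roles:
  - role: openshift_prometheus
```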


## Additional variables to control resource limits
Each prometheus component (prometheus, alertmanager, alertbuffer, oauth-proxy) can specify CPU and memory limits and requests by setting
the corresponding role variable:
```
openshift_prometheus_<COMPONENT>_(limits|requests)_(memory|cpu): <VALUE>
```
e.g.
```
openshift_prometheus_alertmanager_limits_memory: 1Gi
openshift_prometheus_oauth_proxy_requests_cpu: 100
```

Dependencies
------------

openshift_facts


Example Playbook
----------------

```
- name: Configure openshift-prometheus
  hosts: oo_first_master
  roles:
  - role: openshift_prometheus
```
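
Since `openshift_prometheus_state` selects between installing and uninstalling, removal could be sketched as the same play with the state set to `absent`. The play name here is illustrative:

```
- name: Uninstall openshift-prometheus
  hosts: oo_first_master
  vars:
    # absent - uninstall (see Role Variables)
    openshift_prometheus_state: absent
  roles:
  - role: openshift_prometheus
```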

License
-------

Apache License, Version 2.0