The content of this post is part of my master's thesis (to be completed in June 2019), in which I research the security of OpenShift and how it can be extended. The main focus is on threats to traffic flows and interconnections, and on how a built-in encryption mechanism could prevent malicious influence on the operation of the platform and its data.
What's the point?
Based on the underlying Kubernetes, each node in an OpenShift cluster is deployed with a control unit called "kubelet", which locally manages the node's components. Due to its importance, this key component has particular requirements regarding availability and security. Malicious modification of the kubelet would allow attackers not only to modify the state of the node (and possibly other nodes), but also to request resources from the master (for example secrets) in the kubelet's name.
Now, two different attack vectors must be differentiated. First, there's the takeover of the kubelet on the node. An exploit could allow users of the platform to break out of their contained environment and to escalate their privileges to access the kubelet. Second, traffic flows between nodes and masters could be intercepted and modified. In this article, we will focus on the second scenario.
Currently, security in OpenShift is mainly handled by TLS and authentication/authorization mechanisms. The deployment of IPsec between nodes is proposed for further hardening. On the nodes, the Linux kernel provides multiple isolation techniques (namespaces, cgroups, SELinux) used by Docker, and the SDN can be configured to further isolate network traffic between pods. For example, using the redhat/openshift-ovs-multitenant Open vSwitch SDN plugin restricts traffic between pods to pods of their own project (a namespace in Kubernetes).
The aim of the implementation presented in this article is best described with the following picture, illustrating the resulting topology design with one master and two compute nodes.
WireGuard is a relatively new VPN implementation. Some call it 'hyped', but its strengths justify the enthusiasm for its deployment, even though it is still considered experimental. WireGuard is not only considerably easier to deploy than other VPNs; benchmarks also show that it performs very well. And thanks to its small code base, reviews and security audits are feasible. Lastly, it is foreseeable that the WireGuard kernel module will be included in a future Linux release.
Deploying the mesh with Ansible
Let's dive straight into the deployment. As mentioned, Ansible is used in this example setup for the orchestrated configuration of all nodes. We need a playbook, an inventory, SSH access to the nodes, and WireGuard installed on the nodes. In my setup, CentOS served as the OS, with the EPEL and WireGuard repositories enabled.
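On CentOS 7, the installation might look like the following sketch. The Copr repository URL and package names follow the WireGuard project's CentOS instructions at the time of writing; verify them against the current upstream documentation before use.

```shell
# Enable EPEL (for dependencies such as dkms) and the WireGuard Copr repo,
# then install the kernel module (via DKMS) and the userspace tools.
yum install -y epel-release
curl -Lo /etc/yum.repos.d/wireguard.repo \
    https://copr.fedorainfracloud.org/coprs/jdoss/wireguard/repo/epel-7/jdoss-wireguard-epel-7.repo
yum install -y wireguard-dkms wireguard-tools
```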
As an example, the host_vars for the master would define a fixed overlay IP address for the wg0 interface. In this case, 192.168.66.1 is chosen because it is the master node. You can choose any IP from a private range you wish, as long as the CIDR suffix is correct (/32). One could even try IPv6 ULA ranges. The address must be adapted for every other node.
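Such a host_vars snippet might look like this; the variable name wireguard_ip is my assumption here, any name works as long as the playbook and the template below use it consistently.

```yaml
# host_vars/master.cluster.yml
# Fixed overlay address for the wg0 interface (/32, one address per node)
wireguard_ip: 192.168.66.1/32
```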
Now we need a way to configure each node to connect its WireGuard interface with those of all other nodes in the cluster, creating a full mesh. In the first two tasks, all nodes are set up to resolve each peer's hostname to its WireGuard overlay address. Where OpenShift would resolve master.cluster to the public IP (e.g. 184.108.40.206), we need it to resolve to the overlay address 192.168.66.1 instead.
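This hostname mapping could be sketched as an Ansible task that pins each peer's hostname to its overlay address in /etc/hosts. The wireguard_ip variable is the assumption introduced above; the ipaddr filter (which strips the /32 suffix) requires the netaddr Python library on the Ansible host.

```yaml
# Make every peer's hostname resolve to its WireGuard overlay address,
# so OpenShift components talk to each other over the encrypted mesh.
- name: Map peer hostnames to WireGuard overlay IPs
  lineinfile:
    path: /etc/hosts
    regexp: '\s{{ item }}$'
    line: "{{ hostvars[item].wireguard_ip | ipaddr('address') }} {{ item }}"
  loop: "{{ groups['all'] }}"
```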
After that, WireGuard is executed on each host to generate a key pair. The keys are stored in the Ansible fact cache during the run and are recreated on every new execution of the playbook. Finally, the WireGuard configuration is rendered on every node, containing a list of all peers.
The recreation of key pairs is not a problem during runtime, since OpenShift does not depend on the state of the wg0 interface, but solely uses it for traffic towards other nodes. The two parts (the WireGuard mesh and the deployment of OpenShift) are independent of each other! In fact, you can test the rotation of keys by pinging other nodes, running the playbook, and observing that there are no interruptions. As a side note, calling wg for each node could also be done locally on the Ansible master, but executing it on the node has the advantage of detecting errors, e.g. when WireGuard is not installed.
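A condensed sketch of the key-generation and configuration tasks could look like the following. Task layout and the registered variable names (wg_private_key, wg_public_key) are assumptions; wg genkey and wg pubkey are the standard wireguard-tools commands.

```yaml
# Generate a key pair on each node and render the mesh configuration.
# Registered results live in the fact cache for this run and are
# regenerated on the next playbook execution.
- name: Generate WireGuard private key
  command: wg genkey
  register: wg_private_key

- name: Derive WireGuard public key
  shell: echo '{{ wg_private_key.stdout }}' | wg pubkey
  register: wg_public_key

- name: Render wg0 configuration with all peers
  template:
    src: wg0.j2
    dest: /etc/wireguard/wg0.conf
    mode: "0600"
  notify: restart wg-quick@wg0
```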
The template wg0.j2 iterates over all nodes known to Ansible (the "all" group), skipping the node on which the template is rendered.
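A sketch of such a template, under the assumptions above (the wireguard_ip variable, the registered key facts, and ansible_host holding each peer's public IP):

```jinja2
{# wg0.j2 -- rendered to /etc/wireguard/wg0.conf on every node #}
[Interface]
PrivateKey = {{ wg_private_key.stdout }}
Address = {{ wireguard_ip }}
ListenPort = 51820

{% for node in groups['all'] if node != inventory_hostname %}
[Peer]
PublicKey = {{ hostvars[node].wg_public_key.stdout }}
AllowedIPs = {{ hostvars[node].wireguard_ip }}
Endpoint = {{ hostvars[node].ansible_host }}:51820
{% endfor %}
```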
After the template is rendered, each node has one [Interface] section with its own private key and the address from the host_vars, and multiple (here: two) [Peer] sections with the public key, the routed overlay IP address for the peer (called AllowedIPs in WireGuard), and the endpoint with the public IP address of the peer node.
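On the master, the rendered result would look roughly like this. Keys and public IPs are placeholders, and the overlay addresses of the two compute nodes are assumptions following the pattern of the master's 192.168.66.1.

```ini
; /etc/wireguard/wg0.conf as rendered on the master
[Interface]
PrivateKey = <master-private-key>
Address = 192.168.66.1/32
ListenPort = 51820

[Peer]
PublicKey = <node1-public-key>
AllowedIPs = 192.168.66.2/32
Endpoint = <node1-public-ip>:51820

[Peer]
PublicKey = <node2-public-key>
AllowedIPs = 192.168.66.3/32
Endpoint = <node2-public-ip>:51820
```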
Configuring OpenShift with openshift-ansible
Now that we have configured WireGuard channels between all nodes in the cluster, we can deploy OpenShift and configure it to use the encrypted channels as its default connection paths. The Ansible inventory for openshift-ansible contains multiple important settings: opening the WireGuard port in the firewall, lowering the MTU of the overlay network interface, and setting the node IP addresses to the ones designated for the internal WireGuard network.
For reference, the lower MTU value is a workaround for a known problem with failing builds: Builds on a Virtual Network are Failing. It might be possible to choose a higher value than 1300: eth0 has an MTU of 1500 and wg0 one of 1420, so after subtracting the 50-byte VXLAN overhead, values up to 1370 could presumably work. Please note that the inventory defines the list of node groups. If you wish to use other groups like node-config-infra as well, adapt it accordingly.
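A sketch of such an inventory, reduced to the parts relevant here. openshift_node_open_ports, openshift_ip, openshift_node_groups, and the networkConfig.mtu edit follow the openshift-ansible 3.x variable conventions, but treat the exact names and structure as assumptions to be checked against your release.

```ini
[OSEv3:children]
masters
etcd
nodes

[OSEv3:vars]
ansible_user=root
openshift_deployment_type=origin
; Open the WireGuard port on every node
openshift_node_open_ports=[{"service":"wireguard","port":"51820/udp"}]
; Lower the overlay MTU so VXLAN frames fit into the WireGuard tunnel
openshift_node_groups=[{"name":"node-config-master-infra","labels":["node-role.kubernetes.io/master=true","node-role.kubernetes.io/infra=true"],"edits":[{"key":"networkConfig.mtu","value":1300}]},{"name":"node-config-compute","labels":["node-role.kubernetes.io/compute=true"],"edits":[{"key":"networkConfig.mtu","value":1300}]}]

[masters]
master.cluster openshift_ip=192.168.66.1

[etcd]
master.cluster openshift_ip=192.168.66.1

[nodes]
master.cluster openshift_ip=192.168.66.1 openshift_node_group_name='node-config-master-infra'
node1.cluster  openshift_ip=192.168.66.2 openshift_node_group_name='node-config-compute'
node2.cluster  openshift_ip=192.168.66.3 openshift_node_group_name='node-config-compute'
```

Setting openshift_ip to the overlay address is what makes OpenShift bind its inter-node communication to the WireGuard mesh instead of the public interfaces.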
Running the playbooks
playbooks/deploy_cluster.yml then deploys OpenShift. In this case, the result is a cluster with three nodes: one master with infra components and two compute nodes. The VXLAN overlay with an MTU of 1300 is routed over the WireGuard mesh, visible as the wg0 interface on each node.
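A quick way to sanity-check the result on any node (as root; the overlay address is the master's from the example above):

```shell
# Show peers, last handshakes and transfer counters of the mesh interface
wg show wg0
# Confirm that traffic towards another node is routed via wg0
ip route get 192.168.66.1
# On eth0, inter-node traffic should now appear only as UDP on port 51820
tcpdump -ni eth0 udp port 51820
```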