PyCon Day 0: Docker, Kubernetes, OpenShift
Sponsored Workshop: Docker, Kubernetes, OpenShift
IaaS vs. PaaS vs. SaaS
Most services are implemented as containers under the hood. This is not something new.
PaaS: you provide the source code, and the provider provides the environment in which to run it.
What has changed is Docker. It’s still just containers, but Docker has made it much more popular and practical for users to build containers themselves.
- Made it simple for you to run containers on your own laptop.
- De facto standard for what an image looks like
- “Container as a Service” (CaaS)
Red Hat has essentially rewritten OpenShift from scratch to use Docker.
CaaS is not the same as PaaS: with Docker, the provider no longer supplies the runtime environment, so you as a user have to provide it yourself.
- The ugly reality: containers are not enough. You still need proper infrastructure.
- Cache files and images accumulate and you run out of disk space
- If you run Docker as a daemon and you lose a host, all of your containers are gone. There is nothing that lets you migrate running containers from one host to another
- Design decisions are baked in: it’s open source, but in practice you can’t change them. You have to run things as root in the container, and lots of images on Docker Hub only work as root. There is no isolation: if you have access to the Docker service, you can see everything. It’s not “multi-tenant”.
- What you still need on top: orchestration, scheduling, isolation.
Kubernetes by Google
- Based on experience of deploying ~7000 containers / second inside Google
- “The next iteration” which has been made Open Source
- BUT… people still want the easy user experience; Kubernetes only provides the lower admin-level layer that gives you scheduling and isolation
Enter OpenShift
The stack:
- Container Host (optimized Linux image)
- Container (Docker)
- Orchestration (Kubernetes)
- Containerized Services/User Experience (OpenShift)
Relationship between OpenShift and Kubernetes?
Like Docker, if you have access to the Kubernetes cluster, you can see everything in that cluster. In OpenShift, to provide a public, multi-tenant service, we need isolation. Kubernetes provides a namespace; OpenShift wraps the namespace up into a project and puts authentication and users/groups on top of that. It also adds a web console and admin/user CLIs.
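A rough sketch of how that scoping looks from code, using the kubernetes Python client (not part of the workshop; the `myproject` name is hypothetical): every list or create call is scoped to a namespace, and OpenShift’s project plus auth layer controls which namespaces you can touch.

```python
from kubernetes import client, config

# Load credentials from ~/.kube/config (an `oc login` token ends up there too).
config.load_kube_config()
v1 = client.CoreV1Api()

# API access is scoped to a namespace; OpenShift presents the namespace as a
# "project" with authentication and user/group authorization layered on top,
# so a normal user only sees the projects they have been granted.
for pod in v1.list_namespaced_pod(namespace="myproject").items:
    print(pod.metadata.name, pod.status.phase)
```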
Kubernetes Namespace
- A Kubernetes “pod” is a collection of containers. The pod provides a shared namespace for networking, ports, etc., so you can’t run multiple things on port 80 in the same pod, but you can split them up across pods.
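A minimal sketch of that pod model with the kubernetes Python client (the image names and the `myproject` namespace are made up): both containers live in one pod and share its network namespace, so they must bind different ports.

```python
from kubernetes import client, config

config.load_kube_config()

# Two containers in one pod share the pod's network namespace,
# so they have to listen on different ports.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="web-pod"),
    spec=client.V1PodSpec(containers=[
        client.V1Container(
            name="frontend", image="example/frontend",
            ports=[client.V1ContainerPort(container_port=80)]),
        client.V1Container(
            name="metrics", image="example/metrics-sidecar",
            ports=[client.V1ContainerPort(container_port=9100)]),
    ]),
)
client.CoreV1Api().create_namespaced_pod(namespace="myproject", body=pod)
```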
- Persistent volumes
- Heroku and Google App Engine don’t have persistent external storage: the filesystem is only available for the lifetime of the request, or the container.
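For contrast, a hedged sketch of claiming persistent storage through the kubernetes Python client (the claim name and size are invented): the volume outlives any single container that mounts it.

```python
from kubernetes import client, config

config.load_kube_config()

# Ask the cluster for 1Gi of persistent storage; the claim (and the volume
# bound to it) survives container restarts, unlike ephemeral PaaS filesystems.
claim = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="app-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        resources=client.V1ResourceRequirements(requests={"storage": "1Gi"}),
    ),
)
client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="myproject", body=claim)
```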
- Networking
- OpenShift provides Routes and a Software Defined Network (SDN)
- Kubernetes is private by default
- In OpenShift, it’s your decision to make your service available externally
- OpenShift sets up HAProxy for you, whereas with plain Docker you’re going to have to do that yourself
- The latest versions of Kubernetes are picking up features upstreamed from OpenShift
- Deployments:
- Kubernetes provides a Replication Controller (i.e. if something dies, it will restart it for you); see the replication controller sketch below
- You tell it how many replicas to run and Kubernetes decides where they should be run
- It automatically reschedules your application elsewhere if a pod or host dies
- OpenShift has an integrated Docker registry
- An “Image Stream” is the equivalent of the different image versions (tags) of a repository on Docker Hub
- “Deployment Configuration”: uses a rolling deployment strategy automatically if you create a new image
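A sketch of the replication controller mentioned above, again with the kubernetes Python client (labels, image, and namespace are placeholders): you declare how many replicas you want, and Kubernetes keeps that many running, rescheduling them when a pod or host dies.

```python
from kubernetes import client, config

config.load_kube_config()

# Declare the desired state: three replicas of pods matching this label.
# Kubernetes decides where they run and replaces them on failure.
rc = client.V1ReplicationController(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1ReplicationControllerSpec(
        replicas=3,
        selector={"app": "web"},
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(
                    name="web", image="example/web-app",
                    ports=[client.V1ContainerPort(container_port=8080)]),
            ]),
        ),
    ),
)
client.CoreV1Api().create_namespaced_replication_controller(
    namespace="myproject", body=rc)
```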
- Builds (added by OpenShift)
- Source to Image
- Build Configuration
- Can set up GitHub web hooks to build/deploy on git push
- Doesn’t prevent you from using existing CI… you would just push your full image at the end of the build
- Automatic Builds and Deployments
- Image builds can be triggered by: a base image trigger, a source code trigger (Dockerfile), or a configuration trigger (e.g. changing environment variables to tune Apache mod_wsgi)
- The resulting image is pushed to the OpenShift deployer
- The application runs
- Source to Image (S2I)
- Analogous to Heroku Build Packs
- Build Packs don’t let you change the base image
- S2I is like Build Packs, but modernized for Docker
- Injects your source code into an application container
- Runs docker commit to create an image from the stopped container
- That image is something we can run
- Allows us to constrain the amount of resources that the container consumes
- Bundled Python Builders
- RHEL/CentOS (Python 2.7, 3.3, 3.4)
- You can bring your own S2I Builder
- e.g. warpdrive (http://www.getwarped.org/)
- CentOS (2.7, 3.4)
- Debian (2.7, 3.5)
- Auto Detection
- Django framework
- WSGI (any WSGI framework; see the minimal sketch below)
- mod_wsgi-express (pip install mod_wsgi)
- gunicorn
- uwsgi
- waitress
- Python application code file (Tornado, Twisted)
- Application shell script (Jupyter notebooks)
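The “any WSGI framework” item above just means a callable following the WSGI convention. Here is a minimal sketch; the wsgi.py filename and the `application` callable name are common conventions, not something these notes specify for warpdrive.

```python
# wsgi.py -- a minimal WSGI application; any WSGI server can serve it.
def application(environ, start_response):
    body = b"Hello from a WSGI app\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Any of the servers listed above can run it, for example `gunicorn wsgi:application` or `mod_wsgi-express start-server wsgi.py`.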
- Build-time features
- requirements.txt, setup.py
- static file assets (Django collectstatic)
- Action hooks (pre-build, build)
- Deploy-time features
- static file assets (Apache, uWSGI, whitenoise)
- Action hooks (deploy-env, deploy-cfg, deploy)
- “Other”
- Lifecycle hooks (setup, migrate)
- Command execution
- Interactive shell access (get the same environment as when your app is run, not something different)
- Package wheelhouse
- Incremental builds
- Run S2I Standalone?
- Yes, you can use it standalone just to build Docker images
- If you have your own CI/CD pipeline, you can use it there
Applications and Templates
- You can run PostgreSQL, MySQL, Redis, etc., which you can’t do on Heroku, because there you don’t have persistent storage
- Obviously you can’t run PostgreSQL and just say, “I need to scale, so I’ll run more than one copy”
- You can run Celery on OpenShift (see the sketch after this list)
- Templates
- Can create a template with multiple apps that interact with each other.
- Deploy them all in one go
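As a worked example of the Celery point above, a minimal task module that talks to a Redis broker. The `redis` hostname assumes a Redis service of that name running in the same project; that is an assumption for illustration, not something from the workshop.

```python
# tasks.py -- minimal Celery app; the broker/backend URLs assume a service
# named "redis" is reachable in the same project (service DNS name "redis").
from celery import Celery

app = Celery(
    "tasks",
    broker="redis://redis:6379/0",
    backend="redis://redis:6379/1",
)

@app.task
def add(x, y):
    return x + y
```

You would then run the web pods and one or more worker pods (`celery -A tasks worker`) from the same image, typically wired together in a single template.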
CPU and RAM usage
- In a traditional PaaS, applications are separated into their own containers and RAM is wasted
- In OpenShift, you can run many applications per project and they can share the RAM
- This is a big problem with Python because of the GIL: you can only increase your concurrency by adding more processes, which scales up the RAM usage by a lot (see the gunicorn sketch below)
- So you can make much better use of resources with OpenShift than with other platforms
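To illustrate the process-versus-RAM trade-off, here is a sketch of a gunicorn configuration file (gunicorn’s config file is itself Python); the numbers are only illustrative starting points, not recommendations from the workshop.

```python
# gunicorn.conf.py -- gunicorn reads this file as Python code.
import multiprocessing

bind = "0.0.0.0:8080"

# Because of the GIL, CPU-bound concurrency comes from processes, and each
# worker process duplicates the interpreter plus your application's memory.
workers = multiprocessing.cpu_count() * 2 + 1

# Threads help with I/O-bound work without the per-process memory cost.
threads = 2
```

Run it with, for example, `gunicorn -c gunicorn.conf.py wsgi:application`.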
OpenShift Resources
- OpenShift Origin (https://www.openshift.org)
- Community supported, using issue forums to get support
- OpenShift Enterprise/Dedicated (https://www.openshift.com)
- OpenShift Commons (http://commons.openshift.org)
- TestDrive Lab on AWS (https://www.openshift.com/dedicated/test-drive.html)
- All-in-one Vagrant VM box (https://www.openshift.org/vm/)
- Based on latest Origin, so it has all the things
- Free Red Hat Container Development Kit (CDK) (http://developers.redhat.com/products/cdk/overview)
- Based on the latest Enterprise release
- training.runcloudrun.com/roadshow
- openshift.pycon.openshift3roadshow.com
user26:pycon2016
“We provide better long term support for Docker than Docker does”