PyCon Day 0: Docker, Kubernetes, OpenShift
Sponsored Workshop: Docker, Kubernetes, OpenShift
IaaS vs. PaaS vs. SaaS
Most services are implemented as containers under the hood. This is not something new.
PaaS: you provide the source code, and the provider provides the environment in which to run it.
What has changed is Docker. It’s still just containers, but Docker has made it much more popular and practical for users to build containers themselves.
- Made it simple for you to run containers on your own laptop.
- De facto standard for what an image looks like
- “Container as a Service” (CaaS)
Red Hat has essentially rewritten OpenShift from scratch to use Docker.
CaaS is not the same as PaaS: with Docker, the provider no longer supplies the runtime environment, so you as a user have to provide it yourself.
- The ugly reality: containers are not enough. You still need proper infrastructure.
- Cache files and images accumulate and you run out of disk space
- If you run Docker as a daemon and you lose a host, all of your containers are gone. There is nothing that lets you migrate running containers from one host to another
- Design decisions are baked in: it’s open source, but in practice you can’t change them. You have to run things as root in the container, and lots of images on Docker Hub only work as root. There is no isolation: if you have access to the Docker service, you can see everything. It’s not “multi-tenant”.
- What you still need on top: orchestration, scheduling, isolation.
Kubernetes by Google
- Based on experience of deploying ~7000 containers / second inside Google
- “The next iteration” which has been made Open Source
- BUT… people still want the easy user experience; Kubernetes only provides the lower admin-level layer that gives you scheduling and isolation
Enter OpenShift
The stack:
- Container Host (optimized Linux image)
- Container (Docker)
- Orchestration (Kubernetes)
- Containerized Services/User Experience (OpenShift)
Relationship between OpenShift and Kubernetes?
Like Docker, if you have access to the Kubernetes cluster, you can see everything in that cluster. In OpenShift, to provide a public, multi-tenant service, we need isolation. Kubernetes provides a namespace; OpenShift wraps the namespace up into a project and puts authentication and users/groups on top of that. It also adds a web console and admin/user CLIs.
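A rough sketch of how that scoping looks from code, using the kubernetes Python client (not part of the workshop; the `myproject` name is hypothetical): every list or create call is scoped to a namespace, and OpenShift’s project plus auth layer controls which namespaces you can touch.

```python
from kubernetes import client, config

# Load credentials from ~/.kube/config (an `oc login` token ends up there too).
config.load_kube_config()
v1 = client.CoreV1Api()

# API access is scoped to a namespace; OpenShift presents the namespace as a
# "project" with authentication and user/group authorization layered on top,
# so a normal user only sees the projects they have been granted.
for pod in v1.list_namespaced_pod(namespace="myproject").items:
    print(pod.metadata.name, pod.status.phase)
```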
Kubernetes Namespace
- A Kubernetes “pod” is a collection of containers. The pod provides a shared namespace for networking, ports, etc., so you can’t run multiple things on port 80 in the same pod, but you can split them up across pods.
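A minimal sketch of that pod model with the kubernetes Python client (the image names and the `myproject` namespace are made up): both containers live in one pod and share its network namespace, so they must bind different ports.

```python
from kubernetes import client, config

config.load_kube_config()

# Two containers in one pod share the pod's network namespace,
# so they have to listen on different ports.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="web-pod"),
    spec=client.V1PodSpec(containers=[
        client.V1Container(
            name="frontend", image="example/frontend",
            ports=[client.V1ContainerPort(container_port=80)]),
        client.V1Container(
            name="metrics", image="example/metrics-sidecar",
            ports=[client.V1ContainerPort(container_port=9100)]),
    ]),
)
client.CoreV1Api().create_namespaced_pod(namespace="myproject", body=pod)
```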
- Persistent volumes
- Heroku and Google App Engine don’t have persistent external storage: the filesystem is only available for the lifetime of the request, or the container.
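For contrast, a hedged sketch of claiming persistent storage through the kubernetes Python client (the claim name and size are invented): the volume outlives any single container that mounts it.

```python
from kubernetes import client, config

config.load_kube_config()

# Ask the cluster for 1Gi of persistent storage; the claim (and the volume
# bound to it) survives container restarts, unlike ephemeral PaaS filesystems.
claim = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="app-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        resources=client.V1ResourceRequirements(requests={"storage": "1Gi"}),
    ),
)
client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="myproject", body=claim)
```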
- Networking
- OpenShift provides Routes and a Software Defined Network (SDN)
- Kubernetes is private by default
- In OpenShift, it’s your decision to make your service available externally
- OpenShift sets up HAProxy for you, whereas with plain Docker you’re going to have to do that yourself
- The latest versions of Kubernetes are picking up features upstreamed from OpenShift
- Deployments:
- Kubernetes provides a Replication Controller (i.e. if something dies, it will restart it for you); see the replication controller sketch below
- You tell it how many replicas to run and Kubernetes decides where they should be run
- It automatically reschedules your application elsewhere if a pod or host dies
- OpenShift has an integrated Docker registry
- An “Image Stream” is the equivalent of the different image versions (tags) of a repository on Docker Hub
- “Deployment Configuration”: uses a rolling deployment strategy automatically if you create a new image
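A sketch of the replication controller mentioned above, again with the kubernetes Python client (labels, image, and namespace are placeholders): you declare how many replicas you want, and Kubernetes keeps that many running, rescheduling them when a pod or host dies.

```python
from kubernetes import client, config

config.load_kube_config()

# Declare the desired state: three replicas of pods matching this label.
# Kubernetes decides where they run and replaces them on failure.
rc = client.V1ReplicationController(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1ReplicationControllerSpec(
        replicas=3,
        selector={"app": "web"},
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(
                    name="web", image="example/web-app",
                    ports=[client.V1ContainerPort(container_port=8080)]),
            ]),
        ),
    ),
)
client.CoreV1Api().create_namespaced_replication_controller(
    namespace="myproject", body=rc)
```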
- Builds (added by OpenShift)
- Source to Image
- Build Configuration
- Can set up GitHub web hooks to build/deploy on git push
- Doesn’t prevent you from using existing CI… you would just push your full image at the end of the build
- Automatic Builds and Deployments
- Image builds can be triggered by: a base image trigger, a source code trigger (Dockerfile), or a configuration trigger (e.g. changing environment variables to tune Apache mod_wsgi)
- The resulting image is pushed to the OpenShift deployer
- The application runs
- Source to Image (S2I)
- Analogous to Heroku Build Packs
- Build Packs don’t let you change the base image
- S2I is like Build Packs, but modernized for Docker
- Injects your source code into an application container
- Runs docker commit to create an image from the stopped container
- That image is something we can run
- Allows us to constrain the amount of resources that the container consumes
- Bundled Python Builders
- RHEL/CentOS (Python 2.7, 3.3, 3.4)
- You can bring your own S2I Builder
- e.g. warpdrive (http://www.getwarped.org/)
- CentOS (2.7, 3.4)
- Debian (2.7, 3.5)
- Auto Detection
- Django framework
- WSGI (any WSGI framework; see the minimal sketch below)
- mod_wsgi-express (pip install mod_wsgi)
- gunicorn
- uwsgi
- waitress
- Python application code file (Tornado, Twisted)
- Application shell script (Jupyter notebooks)
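The “any WSGI framework” item above just means a callable following the WSGI convention. Here is a minimal sketch; the wsgi.py filename and the `application` callable name are common conventions, not something these notes specify for warpdrive.

```python
# wsgi.py -- a minimal WSGI application; any WSGI server can serve it.
def application(environ, start_response):
    body = b"Hello from a WSGI app\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Any of the servers listed above can run it, for example `gunicorn wsgi:application` or `mod_wsgi-express start-server wsgi.py`.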
- Build-time features
- requirements.txt, setup.py
- static file assets (Django collectstatic)
- Action hooks (pre-build, build)
- Deploy-time features
- static file assets (Apache, uWSGI, whitenoise)
- Action hooks (deploy-env, deploy-cfg, deploy)
- “Other”
- Lifecycle hooks (setup, migrate)
- Command execution
- Interactive shell access (get the same environment as when your app is run, not something different)
- Package wheelhouse
- Incremental builds
- Run S2I Standalone?
- Yes, you can use it standalone just to build Docker images
- If you have your own CI/CD pipeline, you can use it there
Applications and Templates
- You can run PostgreSQL, MySQL, Redis, etc., which you can’t do on Heroku, because there you don’t have persistent storage
- Obviously you can’t run PostgreSQL and just say, “I need to scale, so I’ll run more than one copy”
- You can run Celery on OpenShift (see the sketch after this list)
- Templates
- Can create a template with multiple apps that interact with each other.
- Deploy them all in one go
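As a worked example of the Celery point above, a minimal task module that talks to a Redis broker. The `redis` hostname assumes a Redis service of that name running in the same project; that is an assumption for illustration, not something from the workshop.

```python
# tasks.py -- minimal Celery app; the broker/backend URLs assume a service
# named "redis" is reachable in the same project (service DNS name "redis").
from celery import Celery

app = Celery(
    "tasks",
    broker="redis://redis:6379/0",
    backend="redis://redis:6379/1",
)

@app.task
def add(x, y):
    return x + y
```

You would then run the web pods and one or more worker pods (`celery -A tasks worker`) from the same image, typically wired together in a single template.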
CPU and RAM usage
- In a traditional PaaS, applications are separated into their own containers and RAM is wasted
- In OpenShift, you can run many applications per project and they can share the RAM
- This is a big problem with Python because of the GIL: you can only increase your concurrency by adding more processes, which scales up the RAM usage by a lot (see the gunicorn sketch below)
- So you can make much better use of resources with OpenShift than with other platforms
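To illustrate the process-versus-RAM trade-off, here is a sketch of a gunicorn configuration file (gunicorn’s config file is itself Python); the numbers are only illustrative starting points, not recommendations from the workshop.

```python
# gunicorn.conf.py -- gunicorn reads this file as Python code.
import multiprocessing

bind = "0.0.0.0:8080"

# Because of the GIL, CPU-bound concurrency comes from processes, and each
# worker process duplicates the interpreter plus your application's memory.
workers = multiprocessing.cpu_count() * 2 + 1

# Threads help with I/O-bound work without the per-process memory cost.
threads = 2
```

Run it with, for example, `gunicorn -c gunicorn.conf.py wsgi:application`.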
OpenShift Resources
- OpenShift Origin (https://www.openshift.org)
- Community supported, using issue forums to get support
- OpenShift Enterprise/Dedicated (https://www.openshift.com)
- OpenShift Commons (http://commons.openshift.org)
- TestDrive Lab on AWS (https://www.openshift.com/dedicated/test-drive.html)
- All-in-one Vagrant VM box (https://www.openshift.org/vm/)
- Based on latest Origin, so it has all the things
- Free Red Hat Container Development Kit (CDK) (http://developers.redhat.com/products/cdk/overview)
- Based on the latest Enterprise release
- training.runcloudrun.com/roadshow
- openshift.pycon.openshift3roadshow.com
user26:pycon2016
“We provide better long term support for Docker than Docker does”