#GlueCon 2014 Notes: Managing Containers at Scale on Google Compute Engine – Joe Beda, Google
- Everything at Google runs in a container (KVM-based)
- Google starts over 2 billion containers per week (3k per sec) not counting long-running containers – value obtained via Dremel
- Let Me Contain That For You (LMCTFY): https://github.com/google/lmctfy – replacement for LXC, integrating with Docker as an execdriver, offers an API and a CLI
- Managed OS – Node Container Manager – Scheduled Containers – Cluster Scheduler (the computer for developers)
- The goal is to drive up utilization of their clusters
- Declarative (“Run 100 copies such that <= 2 tasks down at any time”) vs. Imperative (“Start this container on that server”)
- Pros: repeatable, “set it and forget it”, eventually consistent, easily updateable
- Con: Tracing failures can be difficult because a failure is fuzzy in a declarative system (requires good monitoring)
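
To make the declarative idea concrete, here is a minimal sketch of one reconciliation step for a spec like "run 100 copies such that <= 2 tasks down at any time". It is an illustration only, not Google's scheduler: `DesiredState`, `plan_actions`, and the `healthy`/`outdated` inputs are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesiredState:
    """Declarative spec: what should be true, not how to get there."""
    replicas: int          # e.g. "run 100 copies"
    max_unavailable: int   # e.g. "<= 2 tasks down at any time"


def plan_actions(desired: DesiredState, healthy: int, outdated: int) -> dict:
    """Compute one reconciliation step (hypothetical helper).

    healthy:  tasks currently up, including ones on an old version.
    outdated: healthy tasks that still need to be restarted for an update.
    """
    # Tasks that are missing must be started, and they already count
    # against the "tasks down" budget.
    down = max(desired.replicas - healthy, 0)
    budget = max(desired.max_unavailable - down, 0)
    # Only take down as many outdated tasks as the remaining budget allows.
    to_restart = min(outdated, budget)
    return {"start": down, "restart": to_restart}


if __name__ == "__main__":
    spec = DesiredState(replicas=100, max_unavailable=2)
    # One task is already down, so only one more may be restarted right now.
    print(plan_actions(spec, healthy=99, outdated=50))
```

The caller only states the target; running the step repeatedly against observed state is what makes the system eventually consistent, and it is also why a "failure" is just a temporary gap between desired and observed state.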
- Packaging Containers
- Focused on internal Google needs, rather than the more public-friendly approach that Docker uses
- Google: host bind mounts, binary and deps built together, interfaces to container manager (std locations for logs, API)
- Docker image and env: more hermetic, entire chroot is explicitly included, less guaranteed file structures, leverages OS distro and pkg mgrs
- Containers on the Google Cloud Platform (developer preview, will change)
- New container manifest (manifest.yaml) that defines a “scheduling unit”: a group of containers that share data and resources and go down together
- Example within Google: Data Loader (grabs data and loads it locally) + Data Server can be built separately but defined together with the manifest
- https://github.com/GoogleCloudPlatform/container-agent
- Reference Node Container Manager
- Reads the manifest and makes it happen on top of Docker
- Currently starts containers and keeps them running; soon will allow for updated manifests and expose metrics, logs, etc.
- Debian + Docker + Node Container Manager = Container VMs on Google Compute Engine
- This moves from today's VM-centric view (running containers on VMs) to a container-based view that happens to run on VMs
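
As a rough sketch of what "reads the manifest and makes it happen on top of Docker" could look like, the snippet below models the Data Loader + Data Server example as one scheduling unit and derives the `docker run` commands for whichever containers are not yet running. Everything here is assumed for illustration: the manifest field names, image names, and helper functions are hypothetical and are not the container-agent schema or code. Only the Docker flags (`-d`, `--name`, `-v`) are real.

```python
import shlex

# Hypothetical manifest: field names are illustrative only and are NOT
# the container-agent schema.
MANIFEST = {
    "volumes": [{"name": "data", "host_path": "/mnt/data"}],
    "containers": [
        {"name": "data-loader", "image": "example/data-loader",
         "mounts": [{"volume": "data", "path": "/data"}]},
        {"name": "data-server", "image": "example/data-server",
         "mounts": [{"volume": "data", "path": "/data"}]},
    ],
}


def docker_run_argv(manifest: dict, container: dict) -> list:
    """Build a `docker run` command line for one container in the unit."""
    host_paths = {v["name"]: v["host_path"] for v in manifest["volumes"]}
    argv = ["docker", "run", "-d", "--name", container["name"]]
    for mount in container.get("mounts", []):
        # Both containers bind-mount the same host directory to share data.
        argv += ["-v", f"{host_paths[mount['volume']]}:{mount['path']}"]
    argv.append(container["image"])
    return argv


def missing_containers(manifest: dict, running: set) -> list:
    """Return commands for manifest containers that are not running."""
    return [docker_run_argv(manifest, c)
            for c in manifest["containers"] if c["name"] not in running]


if __name__ == "__main__":
    # Pretend the server is already up; only the loader needs starting.
    for argv in missing_containers(MANIFEST, running={"data-server"}):
        print(shlex.join(argv))
```

The two containers are built separately but declared together, so a node-level manager can treat them as one unit: start whatever is missing and keep the shared data directory mounted in both.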