
Managing Containers at Scale on Google Compute Engines – Joe Beda, Google

  • Everything at Google runs in a container – even Compute Engine VMs are KVM processes that run inside containers
  • Google starts over 2 billion containers per week (3k per sec) not counting long-running containers – value obtained via Dremel
  • Let Me Contain That For You (LMCTFY): https://github.com/google/lmctfy – a replacement for LXC with an API/CLI, being integrated with Docker as an execdriver
  • Stack: Managed OS → Node Container Manager → Scheduled Containers → Cluster Scheduler (the cluster becomes “the computer” for developers)
  • The goal is to drive up utilization of their clusters
  • Declarative (“Run 100 copies such that <= 2 tasks down at any time”) vs. Imperative (“Start this container on that server”)
    • Pros: repeatable, “set it and forget it”, eventually consistent, easily updateable
    • Con: tracing failures can be difficult, since failure is fuzzy in a declarative system (requires good monitoring)
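The declarative model above is usually implemented as a reconciliation loop: repeatedly compare desired state against observed state and emit whatever actions converge them. A minimal sketch in Python (the `reconcile` function and task names are illustrative, not Google's actual API):

```python
# Sketch of a declarative reconciliation loop:
# "run N copies" is a desired state, not a sequence of commands.

def reconcile(desired_count, running):
    """Return the start/stop actions needed to converge the
    observed set of running tasks toward the desired count."""
    actions = []
    # Too few copies (e.g. after a task failure): start replacements.
    for i in range(desired_count - len(running)):
        actions.append(("start", f"task-{len(running) + i}"))
    # Too many copies: stop the surplus.
    for task in running[desired_count:]:
        actions.append(("stop", task))
    return actions

# A failure just makes observed state diverge; the next loop
# iteration converges it again -- "eventually consistent".
print(reconcile(3, ["task-0"]))                    # starts two more
print(reconcile(3, ["task-0", "task-1", "task-2", "task-3"]))  # stops one
```

This is also why failures are "fuzzy": nothing imperative failed, the system is simply not yet converged, which is only visible through monitoring.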
  • Packaging Containers
    • Focused on internal Google needs, rather than a more public-friendly approach that Docker uses
    • Google: host bind mounts, binary and deps built together, interfaces to container manager (std locations for logs, API)
    • Docker image and env: more hermetic, the entire chroot is explicitly included, fewer guaranteed file structures, leverages OS distros and package managers
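The Docker approach can be seen in any Dockerfile: the image carries its whole root filesystem, built up from an OS distro via its package manager (this example is illustrative; `server.py` is a hypothetical app):

```dockerfile
# The entire chroot is explicit: start from a distro base image,
# install dependencies with the distro's package manager,
# then add the application on top.
FROM debian:wheezy
RUN apt-get update && apt-get install -y python
COPY server.py /srv/server.py
CMD ["python", "/srv/server.py"]
```

Contrast with Google's internal style, where the binary and its deps are built together and the host provides agreed-on bind mounts and standard locations instead of a self-contained image.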
  • Containers on the Google Cloud Platform (developer preview, will change)
    • New container manifest.yaml defines a “scheduling unit”: containers that share data and resources and go down together
      • Example within Google: a Data Loader (grabs data and loads it locally) and a Data Server can be built separately but defined together in the manifest
      • https://github.com/GoogleCloudPlatform/container-agent
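A manifest for the Data Loader + Data Server example might look roughly like this (field names follow the container-agent's v1beta1 manifest format; the image names, mount paths, and port are illustrative assumptions):

```yaml
# Hypothetical scheduling unit: loader and server are built
# separately but share a volume and go down together.
version: v1beta1
containers:
  - name: data-loader
    image: example/data-loader      # illustrative image name
    volumeMounts:
      - name: data
        mountPath: /data
  - name: data-server
    image: example/data-server      # illustrative image name
    ports:
      - name: http
        hostPort: 8080
        containerPort: 8080
    volumeMounts:
      - name: data
        mountPath: /data
        readOnly: true
volumes:
  - name: data
```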
    • Reference Node Container Manager
      • Reads the manifest and makes it happen on top of Docker
      • Currently starts and keeps containers running; soon it will allow updated manifests and expose metrics, logs, etc.
    • Debian + Docker + Node Container Manager = Container VMs on Google Compute Engine
    • This moves from today’s VM-centric view (containers running on VMs) to a container-centric view that merely happens to run on VMs