*Disclaimer: I'm a container infrastructure consultant at Red Hat, so take all o...

rossdavidh · on Aug 14, 2020

Another way of putting this, which largely amounts to the same thing, is that containerization was developed by and for very large organizations. I have seen it used at much smaller companies, most of whom had zero need for it, and in fact it put them into a situation where they were unable to control their own infrastructure, because they had increased the complexity past the point where they could maintain it. Containerization makes deploying your first server harder, but your nth server becomes easier, for values of large n, and this totally makes sense when your organization is large enough to have a large number of servers.

tln · on Aug 14, 2020

I think containers are great even for really small companies. You boiled it down to `n' servers but its `n' servers times `m' services times `k` software updates. That's easier as soon as n * m * k > 2!

First of all, containers can be used with Cloud Run and many other ways to run containers without managing servers at all! (tho if you can use services like Netlify and Heroku to handle all your needs cost-effectively, you probably should).

Setting up a server with docker swarm is pretty easy, because there's basically one piece of software to install. From there on all the software to update and install is in containers.

If your software gets more complex, your server setup stays simple. Even if it doesn't get complex, being able to install software updates for the app independently of the host is great. Ie, I can go from Python 3.7 to Python 3.8 with absolutely zero fuss.

Deploying servers doesn't get more complicated with a few more containers. At some point that's not true but if you want to run, say, grafana as well, the install/maintenance of the server stays constant.

Imagine what you would do without containers... editing an ansible script and having to set up a new server just to test the setup, or horribly likely ssh'ing in and running commands one-off and having no testing, staging or reproducibility.

I vastly prefer Dockerfiles and docker-compose.yml and swarm to ansible and vagrant. There are more pre-built containers than there are ansible recipes as well. So your install/configure time for any off-the-shelf stuff can go down too.

Setting up developer laptops is also improved with Docker, though experiences vary... Run your ruby or python or node service locally if you prefer, set up a testing DB and cache in docker, and run any extra stuff in containers.

Lastly, I think CI is also incredibly worthwhile even for the smallest of companies and containers help keep the effort constant here too. The recipe is always the same.

rossdavidh · on Aug 15, 2020

Having used Docker and Kubernetes, and also spun up new VM's, I can say that Docker and Kubernetes are _not_ easier, if you're new at it. Spinning up a new VM on Linode or the like is easier, by far.

Now, this may sound incredible to you, because if you're accustomed to it, Docker and Kubernetes can be way easier. But, and here's the main point, there are tons of organizations for whom spinning up a new server is a once every year or two activity. That is not often enough to ever become adept at any of these tools. Plus, you probably don't even want to reproduce what you did last time, because you're replacing a server that was spun up several years ago, and things have changed.

For a typical devops, this state of affairs is hard to imagine, but it is what most of the internet is like. This isn't to say, by any means, that FAANG and anybody else who spins up servers on a regular basis shouldn't be doing this with the best tools for your needs. I'm just saying, how you experience the relative difficulty of these tasks, is not at all representative of what it's like for most organizations.

But, since these organizations are unlikely to ever hire a full-time sysadmin, you may not ever see them.

labawi · on Aug 14, 2020

Some of us have notes, that we can mostly copy-paste to setup a server and it works well without magic and n·m·k images.

Last time I checked, docker swarm was accepting connections from anywhere (publish really publishes) and messing with the firewall making a least-privilege setup a PITA; docker was building, possibly even running containers as root; and most importantly - the developers thought docker was magically secure, nothing to handle.

How do you handle your security?

tln · on Aug 15, 2020

An nginx container handles redirects to HTTPS and SSL termination and talks to the other services using unpublished ports. Only 22 (sshd running on server) and 80 and 443 (published ports) are open to the world. Swarm ports open between the swarm servers. That's between AWS security groups.

I don't build on my servers. A service (in a container) makes an outgoing connection to listen to an event bus (Google PubSub) to deploy new containers from CI (Google Cloud Builder).

Config changes (ie, adding a service) are committed then I SSH in, git pull and run a script that does necessary docker stack stuff. I don't mount anything writable to the containers.

giobox · on Aug 14, 2020

I cannot agree that "Containerization universally makes first server deployment harder". Even at single person scale, tools like Docker-Compose etc make my dev life massively simpler.

In 2020, I'd argue the opposite for most people, most of the time!

Also, if your container runtime is preinstalled in your OS as is often the case, the first run experience can be as little as a single command.

spanhandler · on Aug 14, 2020

One of my favorite things is how it forces config and artifact locations to be explicit and consistent. No more "where the hell does this distro's package for this daemon store its data?" Don't care, it puts it wherever I mapped the output dir in the container, which is prominently documented because it pretty much has to be for the container to be usable.

Hell it makes managing my lame, personal basement-housed server easier, let alone anything serious. What would have been a bunch of config files plus several shell scripts and/or ansible is instead just a one-command shell script per service I run, plus the presence of any directories mapped therein (didn't bother to script creation of those, I only have like 4 or 5 services running, though some include other services as deps that I'd have had to manage manually without docker).

Example: Dockerized Samba is the only way I will configure a Samba server now, period. Dozens of lines of voodoo magic horsecrap replaced with a single arcane-but-compact-and-highly-copy-pastable argument per share. And it actually works when you try it, the first time. It's so much better.

derefr · on Aug 14, 2020

> you could still pretty easily run stuff in a container and set it up/treat it like a "pet" rather than "cattle"

Keep in mind, though, if you've got a pet stateful "container" you can SSH into, it's not really a container any more; it's a VPS.

(Well, yes, it is technically a container. But it's not what people mean to talk about when they talk about containers.)

When $1/mo fly-by-night compute-hosting providers are selling you a "VPS", that's precisely what they're selling you: a pet stateful container you can SSH into.

And it's important to make this distinction, I think, because "a VPS" is a lot more like a virtual machine than it is like a container, in how it needs to be managed by ops. If you're doing "pet stateful containers you can SSH into", you aren't really 'doing containers' any more; the guides for managing containerized deployments and so forth won't help you any more. You're doing VMs—just VMs that happen to be optimized for CPU and memory sharing. (And if that optimization isn't something you need, you may as well throw it away and just do VMs, because then you'll gain access to all the tooling and guidance targeted explicitly at people with exactly your use-case.)

freedomben · on Aug 14, 2020

A VPS (which is usually a virtual machine) will be running on top of a hypervisor, and each VM on the host will have their own kernel. Containers on the other hand are different because the kernel is shared among every container running on the host. The separation/isolation of resources is done via kernel features rather than by a hypervisor like a VM. Adding SSH and a stateful filesystem to your container to make it long lived doesn't make it any less of a container. To me that seems like saying "my car is no longer a car because I live in it. Now it's a house (that happens to have all the same features as a car, but I don't use it that way so it's no longer a car)"

If you're defining "container" not by the technology but rather by "how it needs to be managed by ops" then we're working with completely different definitions from the start. We would first need to agree on how we define "container" before we can discuss whether you can treat one like a pet rather than cattle.

fragmede · on Aug 14, 2020

Where does an RV fit into your taxonomy?

If you have stateful containers where changes persist across restarts of the container, then I think you can't really call them containers anymore. Just like if you have VMs with read-only filesystem images generated by the CI/CD pipeline, it's not unreasonable to describe them as container-like. Once you throw in containers with a stateful filesystem or a VM with a read-only filesystem into the mix, then 'container' is no longer a good description of what's going on, and more precise terms need to be used, especially as you get into more esoteric technologies like OpenVZ/Virtuozzo, which uses kernel features, and not virtualization, to provide isolation, but it's containers are not the same as Docker's.

We could come to an agreement of the definition of container, but that wouldn't even use useful outside this thead, so maybe it's more useful to enumerate where the specific technology is and isn't important. The ops team cares about how the thing needs to be managed, and less so how it goes about achieving isolation. However, the exactly technology in use is of critical importance to the security team. (Those may be the same people.) Developers, on the third hand, ideally don't even know that containers are in use, the system is abstracted away from them so they can worry about business logic and UX matters, and not need to worry about how to upgrade the fleet to have the latest version of the openssl libraries/whatever.

fomine3 · on Aug 20, 2020

Container is a thing before Docker invented. LXC/OpenVZ/Solaris Zones should be a container. We need a different term about immutable container style like Docker.

felbane · on Aug 14, 2020

OpenVZ "VPS" offerings are, in fact, just containers with a shared kernel.

derefr · on Aug 14, 2020

> A VPS (which is usually a virtual machine)

This is where I disagree. Like I said in my sibling post, the term "VPS" was invented to obscure the difference between VM-backed and container-backed workload virtualization, so that a provider could sell the same "thing" at different price-points, where actually the "thing" they're selling is a VM at higher price-points and a container at lower price-points. "VPS" is like "spam" (the food): it's a way to avoid telling you that you're getting a mixture of whatever stuff is cheapest.

Sure, there's probably some high-end providers who use "VPS" to refer solely to VMs, because they're trying to capture market segments who were previously using down-market providers and are now moving up-market, and so are used to the term "VPS."

But basing your understanding of the term "VPS" on those up-market providers, is like basing your understanding of the safety of tap water on only first-world tap water, and then being confused why people in many places in the world would choose to boil it.

(And note that I referred specifically to down-market VPS providers in my GP post, not VPS providers generally. The ones who sell you $1/mo VPS instances are not selling you VMs.)

> If you're defining "container" not by the technology but rather by "how it needs to be managed by ops" then we're working with completely different definitions from the start.

It seems that you're arguing from some sort of top-down prescriptivist definition of what the word "container" should mean. I was arguing about how it is used: what people call containers, vs. what they don't. (Or rather, what people will first reach for the word "container" to describe; vs. what they'll first reach for some other word to describe.)

Think about this:

• Linux containers running on Windows are still "containers", despite each running isolated in their own VM.

• Amazon Elastic Beanstalk is a "container hosting solution", despite running each container on its own VM.

• Software running under Google's gVisor is said to be running "in a container", despite the containerization happening entirely in userland.

• CloudFlare markets its Edge Workers as running in separate "containers" — these are Node.js execution-context sandboxes. But, insofar as Node.js is an abstract machine with a kernel (native, un-sandboxed code) and system-call ops to interface with that kernel, then those sandboxes are the same thing to Node that containers are to the Linux kernel.

• Are unikernels (e.g. MirageOS) not running as VMs when you build them to their userland-process debugging target, rather than deploying them to a hypervisor?

> To me that seems like saying "my car is no longer a car because I live in it. Now it's a house (that happens to have all the same features as a car, but I don't use it that way so it's no longer a car)"

A closer analogy: I put wheels on my boat, and rigged the motor to it. I'm driving my boat down the highway. My boat now needs to be maintained the way a car does; and the debris from the road is blowing holes in the bottom that mean my boat is no longer sea-worthy. My boat is now effectively a car. It may be built on the infrastructure of a boat—but I'm utilizing it as a car, and I'd be far better served with an actual car than a boat.

kentonv · on Aug 14, 2020

> CloudFlare markets its Edge Workers as running in separate "containers" — these are Node.js execution-context sandboxes.

This is inaccurate:

- Cloudflare Workers does not use Node.js at all. It is a new custom runtime build on V8.

- Cloudflare absolutely does not market Workers as using "containers", in fact we market them explicitly as not "containers": https://blog.cloudflare.com/cloud-computing-without-containe...

(Disclosure: I am the lead engineer for Workers.)

----

In the industry today, the term "container" refers to a hosting environment where:

- The guest is intended to be a single application, not a full operating system.

- The guest can run arbitrary native-code (usually, Linux) binaries, using the OS's standard ABI. That is, existing, off-the-shelf programs are likely to be able to run.

- The guest runs in a private "namespace" where it cannot see anything belonging to other containers. It gets its own private filesystem, private PID numbers, private network interfaces, etc.

The first point distinguishes containers from classic VMs. The latter two points distinguish them from things like isolates as used by Cloudflare Workers.

Usually, containers are implemented using Linux namespaces+cgroups+seccomp. Yes, sometimes, a lightweight virtual machine layer is wrapped around the container as well, for added security. However, these lightweight VMs are specialized for running a single Linux application (not an arbitrary OS), and generally virtualize at a higher level than a classic hardware VM.

lukevp · on Aug 14, 2020

Hmm, is this really true? Typically people mean lxd or docker when they say containers, but VPSes run on KVM or OpenVZ and are a different level of abstraction than a container. I could be misunderstanding VPSes but I believe they are true VMs?

derefr · on Aug 14, 2020

OpenVZ is fundamentally a container system, almost exactly equivalent to LXC. (In fact, Linux namespaces and cgroups were effectively created through a refactoring and gradual absorption of OpenVZ-descended code.)

"Virtual Private Server" (VPS) is a generic marketing term used by compute providers to allow them to obscure whether they're backing your node with a true VM or with a container. Down-market providers of the kind I referred to always use it to mean containers.

Yes, these VPS provider containers are wrapped in a VM-like encapsulating abstraction by the compute engine (usually libvirt), but this is a management-layer abstraction, not a fundamental difference in the isolation level. VMs that use OpenVZ or Linux containers as their "hypervisor backend" leave the workloads they run just as vulnerable to cross-tenant security vulnerabilities and resource hogging as they would if said workloads were run on Docker.

-----

But all that's beside my original point. My point was that, when you run a "pet stateful container that you can SSH into", you're Greenspunning a VPS node, without getting any of the benefits of doing so, using tooling (Docker) that only makes your use-case harder.

If you acknowledge what you're really trying to do—to run your workload under VPS-like operational semantics; or maybe even under VM-like operational semantics specifically—then you can switch to using the tooling meant for that, and your life becomes a lot easier. (Also, you'll make the right hires. Don't hire "people who know Docker" to maintain your pseudo-VPS; they'll just fight you about it. Hire VPS/VM people!)

freedomben · on Aug 14, 2020

Just to be clear, I don't think anybody is arguing that you should use containers like you would a VPS, merely that you can. I would bet everyone here would agree that just because you can doesn't mean you should :-D

derefr · on Aug 14, 2020

Yeah, I see what you mean (when taking the word 'container' in its technical meaning.) I'm not arguing with that; in fact, that was the same point I was making!

But I think that people don't tend to use the word "container" to describe "a container used as a VPS."

Which points at a deeper issue: we really don't have a term for "the software-artifact product of the Twelve-Factor App methdology." We refer to these things as containers, but they're really an overlapping idea. They're signed/content-hashed tarballs of immutable-infrastructure software that can be automatically deployed, shot in the head, horizontally shared-nothing scaled, etc. These properties all make something very amenable to container-based virtualization; but they aren't the same thing as container-based virtualization. But in practice, people conflate the two, such that we don't even have a word for the type of software itself other than "Docker/OCI image." (But a Google App Engine workload is such a thing too, despite not being an OCI image! Heck, Heroku popularized many of the factors of the Twelve-Factor methodology [and named the thing, too], but their deploy-slugs aren't OCI images either.)

My claim was intended to mean that, if your software meets none of the properties of a [twelve-factor app OCI image workload thing], then you're not "doing [twelve-factor app OCI image workload thing]", and so you shouldn't rely on the basically-synonymous infrastructure that supports [twelve-factor app OCI image workload thing], i.e. containers. :)

freedomben · on Aug 14, 2020

Ah ok, cool yeah I think we're in total agreement then. No doubt you are absolutely right, the word container is used commonly to mean all sorts of things that aren't technically related to the technology we call containers :-)

I do think a lot of enterprise marketing and startup product pitching has made this problem so much worse. I see this a lot with Red Hat customers (and Red Hat employees too for that matter). "Containers" are painted as this great solution and the new correct way of doing things, even though much of what is being sold isn't tied to the technical implementation of containers. There indeed isn't a good marketing-worthy buzzword to describe immutable infrastructure/12-factor app and all that at a high level.

mohaine · on Aug 14, 2020

No, it isn't true. The OP basically says:

If you use containers like VPSes then you have basically have a VPS but in a container.

joana035 · on Aug 14, 2020

No, this is dogma ;)

Everything is a host and can be used for anything.

btilly · on Aug 14, 2020

Before adopting containers it wasn't unusual to SSH in and change a line of code on a broken server and restart. In fact that works fine while the company/team is really small. Unfortunately it becomes a disaster and huge liability when the team grows.

Writing a script to ssh into a bunch of machines and run a common command is the next step. That works far longer than most people acknowledge.

I pine for the old days - I really do. Things are insanely complex now and I don't like it. Unfortunately there are good reasons for the complexity.

Meh.

Containers provide solutions to the problems that someone else had. If you don't have those problems, then containers just create complexity and problems for you.

What problems do they solve? They solve, "My codebase is too big to be loaded on one machine." They solve, "I need my code to run in parallel across lots of machines." They solve, "I need to satisfy some set of regulations."

If you do not have any of those kinds of problems, DON'T USE CONTAINERS. They will complicate your life, and bring no benefit that you care about.

dionian · on Aug 14, 2020

Counterpoint, in many ways its much simpler than 20 years ago: Docker, k8s, etc is miles beyond the type of automation I used to have to deal with from the operations type people.

tluyben2 · on Aug 15, 2020

We have used chroots + a bunch of perl scripts for 20 years. Besides APIs for adding/deleting nodes or autoscaling nodes, nothing much changed for us. And, as I have remarked here before (as it is one of my businesses); that extra freedom, esp autoscaling, is almost never needed and, for most companies, far more expensive than just properly setting up a few baremetal machines. Most people here probably vastly underestimate how much transactions a modern server can handle and how cheap this is at a non-cloud provider. Ofcourse, badly written software will wreck your perf and with that nothing can save you.