I hope all of these Docker overlay networks start using the in-kernel overlay network technologies soon. User-space promiscuous capture is obscenely slow.
Take a look at GRE and/or VXLAN and the kernel's multiple routing table support. (This is precisely why network namespaces are so badass, btw.) Feel free to ping me if you are working on one of these and want some pointers on how to go about integrating more deeply with the kernel.
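To make that concrete, here's a rough sketch of the kind of setup I mean, using iproute2. All the names, addresses, and the VNI are made up for illustration; a real overlay driver would program this via netlink rather than shelling out:

```shell
# Create an in-kernel VXLAN tunnel endpoint (VNI 42) riding on eth0,
# using the IANA-assigned VXLAN port. Encap/decap happens in the kernel,
# no user-space promiscuous capture involved.
ip link add vxlan0 type vxlan id 42 dev eth0 dstport 4789
ip addr add 10.0.0.1/24 dev vxlan0
ip link set vxlan0 up

# Use a separate routing table (here table 100) for overlay traffic,
# selected by a policy routing rule -- this is the "multiple routing
# table" piece.
ip route add 10.0.0.0/24 dev vxlan0 table 100
ip rule add from 10.0.0.1 lookup 100
```

Swap `type vxlan id 42 ... dstport 4789` for `type gretap remote <peer> local <self>` and the same pattern works with GRE.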
It's worth mentioning these protocols also have reasonable hardware offload support, unlike custom protocols implemented on UDP/TCP.
OpenContrail can be used as an overlay network for docker: the overlay is implemented as a kernel module and comes very close to the theoretical maximum iperf performance on a server with 2x10G links.
This script https://github.com/pedro-r-marques/opencontrail-netns/blob/m... can be used to associate any docker container created with "--net=none" with an overlay network. Better yet, you get all the semantics of the OpenStack neutron API: floating-ip, dhcp options, source-nat, LBaaS.
The kernel module also collects flow records of all the traffic and there is a web-ui that can display the analytics of all the traffic flows in your network.
Install guide: https://github.com/Juniper/contrail-controller/wiki/OpenCont...
Support on freenode.net #opencontrail.
If you're going down the path of VXLAN support in Docker, I'd love to talk. The company I founded built a Linux distribution for commodity hardware switches that can do VXLAN encap/decap in hardware at 2+ Tbit/sec. The same configuration that works in a Linux container host or a hypervisor works on the switches.
You have to pick one: implement it purely in the kernel or purely in user space, not a mix of both. In practice pure user space is faster — just look at Snabb Switch: https://github.com/SnabbCo/snabbswitch/wiki
Snabb and DPDK aren't magic though. Because they poll you have to dedicate a whole core to the vSwitch. Containers are a different case than VMs because the packets start in the kernel TCP/IP stack; to get into a userspace vSwitch they'd have to exit the kernel.
Since you seem to have some kernel expertise: do you know if there is an easy way (via an iptables/ebtables plugin or some such) to get packets to switch namespaces? It seems like you could do a whole lot with simple kernel packet rewriting if an in-container-namespace rule could jump a packet into another namespace before routing. You can get an analog of this with a veth device, but it seems like it would be much faster to just switch the namespace directly.
"just switching namespaces" isn't easy, since a packet (in the kernel represented by an SKB) has to have an interface it came in on. The main role an veth pair has is to move the packet between namespaces, and to provide a new in interface, one that is visible in the new namespace.
Unless someone did something crazy, traversing a veth pair should just be doing a little bookkeeping on the SKB, no data copies at all.
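For anyone following along, the standard veth-pair wiring looks roughly like this (namespace and interface names are illustrative, and this is the generic iproute2 pattern, not Docker's exact internal sequence):

```shell
# Create a namespace and a veth pair, then push one end into the
# namespace. That end becomes the "interface the packet came in on"
# from the namespace's point of view.
ip netns add demo
ip link add veth-host type veth peer name veth-cont
ip link set veth-cont netns demo

# Address and bring up both ends.
ip addr add 192.168.50.1/24 dev veth-host
ip link set veth-host up
ip netns exec demo ip addr add 192.168.50.2/24 dev veth-cont
ip netns exec demo ip link set veth-cont up
ip netns exec demo ip link set lo up

# Traffic between 192.168.50.1 and 192.168.50.2 now crosses the pair --
# in-kernel SKB bookkeeping, no data copies.
```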