Interactive map of Linux kernel (makelinux.net)
397 points by mayankkaizen on March 16, 2018 | 44 comments



That's a fun map.

The Linux kernel has more device drivers embedded in the core than I anticipated. I'm guessing most are used for bootstrapping/fallback (e.g. loading the "actual" drivers)? Such as ext4, ipw2100, ac97, i8042, etc.

Just to be clear, I'd imagine most of the above are common enough to be on all but the most niche Linux distributions. My question is about having them this high up in the source tree, rather than why they're useful or what they do.


The map is at least 8 years old.

https://www.linux.com/blog/interactive-map-linux-kernel (2009)


It's from 2007 (see the image). The first version on archive.org is from 2008. I would add a (2007) to the title.


The map is simply incomplete, so it only lists some sample of the network card drivers there, otherwise it would be a mess.


What do you mean "embedded in the core"?

All drivers are in-tree.


Technically unmodularizable

Architecturally integrated into everyone's "this is part of the kernel and not a random printer driver" mental model


Not all drivers are there, of course. Many are out of tree.


I wonder how many of these device drivers are for old devices that are no longer available and can therefore be considered dead code. Is there an annual culling of device drivers?


Just because you can't buy them doesn't mean they aren't still used.


Feels like this needs to be a 3D map. 2 dimensions isn't quite cutting it.



Linux is such a mess that it needs a map.

It's no surprise they're paying down massive technical debt over it. Work that would take a few hours in a well-structured system can take months on Linux.

It isn't surprising either that Dragonfly BSD is outperforming Linux in network throughput, despite its small number of developers.

They went for a components as concurrent lockless system servers approach [1] instead of a complex mess of fine-grained locks like Linux (and FreeBSD, following them) did.

[1] https://www.dragonflybsd.org/presentations/


I can only imagine what sort of chaos exists in Windows and OSX. Unfortunately I can only do that: imagine, because it is closed source. With Linux, I can at least see the mess, and well, make a map out of it.


I have looked upon code that struck me down with Lovecraftian dread. Here, let me paint you a picture of a ubiquitous Microsoft product, written in C++. Entry point calls Initialize. Initialize calls Initialize2, which calls Initialize_internal, which recurses to Initialize with a flag that makes it branch another way, which calls Initialize_subsystem, which calls Initialize_subsystem2, etc, which finally recurses again to Initialize_internal with another flag ...

Honestly I never even made it out of the init code. I asked to be taken off the project immediately.


I find it annoying that any time someone criticises Linux for something someone has to come and point out that Windows and OSX are also kinda crappy. It's true enough, it's also just useless defensiveness. Maybe the bar we set for ourselves should be a little higher than the stuff everyone complains about?


Maybe it's just nigh impossible to build such a complex engineering project and keep it elegant through 25 years of development?


I think you are reading my stance as defending Linux -- I am not. I am just saying you have to put this criticism in perspective and consider the alternatives. I find the clamours for the <insert "beautiful" other OS here> equally useless, as they haven't had the success Linux has had in the server world, and because of that, I don't care how much more "elegant" another solution is -- it doesn't have the body of work to back it up. Linux is, for now, the best option. If it's a "tangled mess of code", let's figure out how to make it better. After all, it is open source.


You can look at the OSX kernel code, just grab Apple's XNU repository. I've built bootable kernels from it. The code's not terrible, but the weirdest part to me is that there's the OSFMK microkernel portion, and then there's the BSD subsystem. They use different types, and they have their own syscall tables, and it was never clear to me where a particular bit of code would reside. I only dabbled in it, of course.


IIRC the Windows team has been in the process of cleaning up the NT kernel for a while (since Vista at least), with the clear goal of getting NT back to "Dave Cutler's NT". The visible bits of this are the Server Micro project that's used for containers, and Server Core.


Every mature project I have ever joined, ever, is such a mess that it needs a map. Typically when I start a new job, I have to make that map myself, because even the senior engineers can't explain it without stammering or omitting the half of the code they don't understand anymore. Would that I'd had something just like this map every time I started a new job!


How do you go about making such a map for a large code repository? I'd really appreciate a link for reading on this subject.


I basically start anywhere and study the call graph, taking notes as I go.
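To sketch the idea (a toy in Python, since the kernel itself is C and you'd use a real indexer like cscope there): walk the AST, and for every function record which plain names it calls. The `mount`/`read_super`/`bread` sample source is made up for illustration.

```python
import ast
from collections import defaultdict

def call_graph(source):
    """Build a naive caller -> callees map from Python source.

    Only direct calls to plain names inside function definitions are
    recorded; methods and indirect calls are ignored.
    """
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    graph[node.name].add(sub.func.id)
    return graph

# Fabricated sample echoing kernel-ish names.
sample = """
def mount(dev):
    sb = read_super(dev)
    return sb

def read_super(dev):
    return bread(dev, 0)

def bread(dev, block):
    return b""
"""

g = call_graph(sample)
for fn in sorted(g):
    print(fn, "->", sorted(g[fn]))
```

Even a crude map like this, dumped to notes as you read, beats holding the whole graph in your head.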


https://www.phoronix.com/scan.php?page=article&item=netperf-... is a year and a half old and tells a different story, where DragonflyBSD is either even with or outperformed by _both_ Linux and FreeBSD. If you have other numbers please post them, but don't just post such statements without sources.

Edit: Just saw your answer https://leaf.dragonflybsd.org/~sephe/perf_cmp.pdf, and there Linux has higher latency, but the throughput is not that different from DragonflyBSD (in fact, at forwarding Linux wins). But thx for the numbers, will follow up on that later.


>and there linux has a higher latency, but the performance throughput is not that different from dragonflybsd.

Same throughput, better latency, a dragonfly win.

Probably the same reasoning as usual (Linux likes to spend tens of milliseconds non-preemptable every now and again).


>Isn't surprising either that Dragonfly BSD's outperforming Linux in network throughput, despite its small number of developers.

This is a pretty meaningless statement. Network throughput on what? Low latency connections? High latency? Jumbo frames? What if you're using DPDK? How about vs. XDP? What is it tuned for?

If you mean that if you just install Dragonfly BSD and some random popular Linux distribution that you might see better performance out of the box on some arbitrary benchmarks, yeah, I can believe it. If you mean that you're going to get better performance in Dragonfly BSD than Linux when it comes to all scenarios with network throughput, this is silly, especially when you can take advantage of things like DPDK, XDP, af_packet v4, etc. If network performance is your biggest concern in 2018, you're caring more about that sort of tech and being on something that supports it than you are about how things perform out of the box.

(Also, Linux has had a lockless TCP stack since 2016)


>"Isn't surprising either that Dragonfly BSD's outperforming Linux in network throughput, despite its small number of developers"

Do you have a citation for this claim?


I know there's something more recent I've seen, but I couldn't find it. This is what I found, which should show enough.

http://lists.dragonflybsd.org/pipermail/commits/2017-Septemb...

https://www.dragonflydigest.com/2017/03/06/19425.html


Similar to how no plan survives contact with the enemy, no well-structured system survives contact with the real world. I'm not defending Linux, since I'm sure it could be better, but every large, mature system I've ever worked in has had structural deficiencies.


"outperforming Linux in network throughput" I'd like to see some numbers, I'm pretty sure Linux is as fast or faster than Dragonfly BSD in any scenarios. When you make such a claim you need facts.

Also, the Linux TCP stack has been lockless since 4.4.


Are there any software projects of that size that are clean and well-structured?

The larger a codebase grows, the less tractable it becomes. Refactorings are harder, more expensive, and riskier. Also, more developers means more surface area for inconsistencies and crappy code from someone unmotivated.


I love Linux!

This map is very useful.


> Please Enable JavaScript or use plain html

Link then goes 404.


As a noob observing..Many of them seem to have names nothing like what their actual purpose is


A dead comment by user 'simoooo':

> As a noob observing..Many of them seem to have names nothing like what their actual purpose is

Having worked a bit with low level kernel stuff (porting a unix OS to a new platform), I understand his observation. Even after working with the kernel for a while, many names don't really make sense.

Mandatory quote of Phil Karlton:

> There are only two hard things in Computer Science: cache invalidation and naming things.


Which leads to one of my favorite programming jokes:

> There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors.


Out of curiosity, do you know of any examples of seemingly misnamed objects in the kernel?


My recent ventures into the filesystem code have been somewhat fun. The problem isn't so much misnamed objects as that the kernel is filled with seemingly undocumented, arcane, old things. `sb_bread` isn't some kind of food but the "superblock buffer read" function, and `brelse` means "buffer release" (and not somethingsomething else, as I initially thought).

BH means buffer head. The meaning and usage of buffer heads have changed a lot across Linux versions. The buffer head is the basic unit of IO operations, but nowadays most users would use a bio for most of what you would have used a buffer head for in the past. When you're working on filesystems you'll still have to use BHs to talk to the hard drive, though.

The Linux FS infrastructure and ext2 share a lot of names. For instance, ext2 and Linux both have superblocks, inodes, blocks, etc. They are logically the same, but at the same time they're very different: one lives purely on the disk, the other provides functions to the kernel. This makes conversation complicated: when talking about the superblock, you have to mention which superblock you mean, because that's not always obvious from context.
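The on-disk flavour really is just a fixed record of bytes. A toy sketch in Python (offsets from the ext2 on-disk superblock layout: two leading little-endian u32 counts and the u16 magic 0xEF53 at offset 56; the buffer here is fabricated, and the in-memory VFS `super_block` is a completely different C struct):

```python
import struct

EXT2_SUPER_MAGIC = 0xEF53  # s_magic, a u16 at offset 56 of the on-disk superblock

def parse_ext2_super(raw):
    """Decode a few fields of an ext2 on-disk superblock.

    On a real partition this 1 KiB record sits at byte offset 1024.
    s_inodes_count and s_blocks_count are the first two LE u32 fields.
    """
    inodes_count, blocks_count = struct.unpack_from("<II", raw, 0)
    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT2_SUPER_MAGIC:
        raise ValueError("not an ext2 superblock")
    return {"inodes": inodes_count, "blocks": blocks_count}

# Fabricated superblock for illustration: 128 inodes, 512 blocks.
raw = bytearray(1024)
struct.pack_into("<II", raw, 0, 128, 512)
struct.pack_into("<H", raw, 56, EXT2_SUPER_MAGIC)
print(parse_ext2_super(bytes(raw)))
```

The in-memory superblock, by contrast, is mostly function pointers and locks, which is exactly why the shared name trips people up.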


Names like brelse, bread, namei and so on have a long history in Unix kernels and filesystems. That history forms people's expectations. If you provide a namei operation in your code, you'd better call it namei and not make up some other, ostensibly clearer, identifier for it.


Not exactly an object 'inside' the kernel, but I was recently caught out by the fact that /proc/net/snmp doesn't have much to do with the SNMP protocol. Rather it tracks socket statistics for IP, ICMP, TCP, and UDP, and is one of the sources of information for the netstat utility.
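The file's format is at least regular: protocol lines come in header/value pairs. A quick sketch of parsing it (the sample text here is abbreviated and fabricated; a real /proc/net/snmp has many more columns per protocol):

```python
def parse_snmp(text):
    """Parse /proc/net/snmp-style text.

    Lines come in pairs: a header line of field names and a value
    line of counters, both prefixed with the protocol name and a colon.
    """
    stats = {}
    lines = [l for l in text.splitlines() if l.strip()]
    for header, values in zip(lines[::2], lines[1::2]):
        proto, names = header.split(":", 1)
        _, nums = values.split(":", 1)
        stats[proto] = dict(zip(names.split(), map(int, nums.split())))
    return stats

# Abbreviated, made-up sample in the same shape as the real file.
sample = """\
Tcp: ActiveOpens PassiveOpens CurrEstab
Tcp: 8813 1029 42
Udp: InDatagrams OutDatagrams
Udp: 50211 49006
"""
stats = parse_snmp(sample)
print(stats["Tcp"]["CurrEstab"])
```

So "snmp" here just means "counters in the shape SNMP MIBs expect", not the protocol itself.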


A professor of mine said something along the lines of "If something isn't confusing enough, give it multiple names"


In the future, try vouching for a dead comment (click on its timestamp to go to its page, then click 'vouch' at the top). If a dead comment gets enough vouches, it is restored, and then you can reply to it the normal way.

Since simoooo's comment was vouched for by other users, I've moved your reply to be a child of that one.


Here's the improved version of that quote:

> There are only two hard things in Computer Science: cache invalidation, off-by-one errors and naming things.


Ah, cool!

Hacker News paydirt.



