> In other words, like GNU HURD. That design is very very hard to make correct a...

burfog · on March 20, 2016

Every example of a "fast" microkernel has either ripped out expected functionality (debug traces, for example) or simply been the first to make an optimization that can be applied to monolithic kernels as well. Fundamentally, microkernels are slower. A bit of thought should make it clear that this cannot be otherwise. No matter how fast you can pass a message, it's still faster to not pass a message at all. Also, the overhead of TLB misses when changing MMU mappings is huge. Microkernels can only win when they compete against badly optimized monolithic kernels, and there is no technique that can get past this fundamental truth.

Animats · on March 20, 2016

It's an architecture problem. What makes QNX fast are a few basic design decisions:

- The basic interprocess communication mechanism works like a synchronous subroutine call - you call, you wait, you get data back. Most slower microkernels have unidirectional I/O as a primitive.

- This is very tightly integrated with CPU dispatching, so that calling a service which isn't currently busy is just a context switch, not a full pass through the CPU dispatcher. This and the above are what make QNX fast. If you do interprocess communication by writing to a socket without blocking, then wait for a reply by reading from one, it takes several extra trips through the CPU dispatcher to call another process. Worse, every such call can put the handoff to the new process at the end of the line for CPU time. If you're CPU bound, this kills performance, in some systems by orders of magnitude. The ability to toss control back and forth between processes at high speed is essential. (This is where Mach blew it.)

- Userspace programs can be placed in the boot image and loaded at boot time. So can shared code objects. This eliminates the temptation to put stuff in the kernel so it's available early in startup. File systems and networking are all in userspace.

How does Redox do in these areas?
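The scheduling point in the list above (a blocking call hands the CPU straight to the server, while a queued message can leave the server "at the end of the line") can be illustrated with a toy run-queue model. This is only a sketch under stated assumptions: the function names, the task names, and the simple FIFO queue are hypothetical, and it does not represent QNX's actual scheduler.

```python
from collections import deque


def slots_until_runs(ready_tasks, direct_handoff, target="server"):
    """Count how many other tasks get the CPU before `target` does.

    ready_tasks: names of CPU-bound tasks already waiting in the run queue.
    direct_handoff: True models a blocking call that switches straight to
    the target; False models a queued message, where the target is
    appended to the tail of the run queue.
    """
    run_queue = deque(ready_tasks)
    if direct_handoff:
        run_queue.appendleft(target)   # target runs next
    else:
        run_queue.append(target)       # target waits behind everyone else
    waited = 0
    while run_queue.popleft() != target:
        waited += 1
    return waited


def round_trip_wait(ready_tasks, direct_handoff):
    """Request leg plus reply leg, assuming (hypothetically) that the same
    CPU-bound tasks are ready again when the reply is sent."""
    request = slots_until_runs(ready_tasks, direct_handoff, target="server")
    reply = slots_until_runs(ready_tasks, direct_handoff, target="client")
    return request + reply


busy = ["a", "b", "c"]
print(round_trip_wait(busy, direct_handoff=True))
print(round_trip_wait(busy, direct_handoff=False))
```

In this model the direct handoff waits behind nobody, while the queued version waits behind every ready task on both legs of the call, which is the CPU-bound slowdown described above.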

burfog · on March 20, 2016

As a thought experiment, let's make QNX faster.

We keep the same IPC mechanism. We compile the filesystem process right into the kernel. Having done this, we can now avoid half of the IPC mechanism. We enter the "microkernel" just once now, instead of twice, and we leave it once instead of twice. Since the filesystem is now in the "microkernel", we don't need to switch MMU state and have a TLB invalidate. This is a huge win. Now let's repeat this design change for the disk driver, the network stack, the network hardware driver, and all the rest. Performance keeps getting better. This, BTW, is pretty much what most Mach systems ended up doing. They became microkernel in marketing only. The final step is to clean up the code, and then you have a normal monolithic kernel.
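The counting in this thought experiment can be written down as a small model. It is a sketch only: `CallCost` and `request_cost` are hypothetical names, the model assumes a simple chain of nested synchronous calls in which the last server talks to the hardware directly, and it counts crossings without claiming anything about what each one costs on real hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallCost:
    kernel_entries: int
    kernel_exits: int
    address_space_switches: int


def request_cost(userspace_servers: int) -> CallCost:
    """Crossings for one request through a chain of nested userspace servers.

    With zero userspace servers, everything lives in the kernel: the client
    traps in once and returns once, with no address-space switch.
    Each userspace server adds a send (enter kernel, exit into the server)
    and a reply (enter kernel, exit back into the caller), and each of
    those two hops changes the address space.
    """
    if userspace_servers < 0:
        raise ValueError("userspace_servers must be non-negative")
    if userspace_servers == 0:
        return CallCost(kernel_entries=1, kernel_exits=1, address_space_switches=0)
    n = userspace_servers
    return CallCost(kernel_entries=2 * n, kernel_exits=2 * n, address_space_switches=2 * n)


for servers in (0, 1, 2):
    print(servers, request_cost(servers))
```

Moving the filesystem into the kernel corresponds to going from `request_cost(1)` to `request_cost(0)`: one entry and one exit instead of two of each, and no address-space switch.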

Let's also look at things from the other perspective. You could add the QNX IPC mechanism into any monolithic kernel. AFAIK, Solaris Doors might even qualify. Well, there you go. You can move things to use that whenever you are willing to sacrifice speed and maintainability. If this is so good, why haven't people done it? Hmmm.

nickpsecurity · on March 20, 2016

"Performance keeps getting better."

Performance is great until those things crash my system. That still happens with graphics drivers on my Linux distros. I know it's not necessary because it doesn't happen on the microkernel systems, and even Windows dodges a lot of it with their SLAM toolkit.

"Let's also look at things from the other perspective. You could add the QNX IPC mechanism into any monolithic kernel. "

Congratulations: you've just reinvented security kernels w/ legacy support from the 80's-90's plus modern separation kernels w/ legacy support of the 2000's. Here's an example to support your point that our model is better with microkernels, user-mode drivers, and monolithic API's in isolated partitions:

https://www.usenix.org/legacy/events/sec04/tech/wips/wips/04...

Even just putting the drivers and a few critical components in partitions can work wonders. That's what Nizza-like architectures like Turaya and Genode do. Their TCB's are many fold smaller than UNIX's with acceptable performance. You don't even notice it with the laptops of commercial ones (eg INTEGRITY-178B, LynxSecure, VxWorks MILS). Plus, there's around a billion mobile phones running OKL4 mainly for baseband or legacy isolation alongside Android or Windows Mobile. Notice how your smartphone is so much slower than older ones that didn't do that? Wait, you thought it was faster and better than the last one? Exactly. :)

nwmcsween · on March 20, 2016

IMO microkernels are a dead end: to get better performance you have to shove stuff in kernelspace, while for say an exokernel the opposite is mostly true.

nickpsecurity · on March 20, 2016

If we're talking tech, then your post couldn't be more wrong given my BeOS and QNX examples. Performance was equal to or better than monoliths of the time. BeOS especially destroyed the competition in concurrency performance due to its architecture. QNX basically runs at hardware speed with real-time properties and POSIX support. BeOS disappeared due to Microsoft's monopoly, with Haiku making an OSS clone. QNX was at $40 million a year in revenue when Blackberry bought it. Green Hills and VxWorks are doing OK, too, with VxWorks making more than QNX per quarter. Both have desktops virtualizing Windows, Linux, etc on microkernels w/ Gbps throughput.

I don't see why we keep getting these theoretical counters given the proven results of microkernel performance in the field. Tell me why microkernels are too slow when they can only do this on 90's era hardware:

https://youtu.be/BsVydyC8ZGQ?t=16m9s

Then, tell us why we should sacrifice isolation of malice and faults in favor of kernel-mode code with properties like this:

https://www.cvedetails.com/product/47/Linux-Linux-Kernel.htm...

Our side produced highly reliable and secure systems plus high-performance systems. It was always done with a small group with little time. The monoliths took a decade and thousands of man hours to do the same. It's up to you people to justify why those hours were well-spent.

"while for say an exokernel the opposite is mostly true."

An exokernel is a microkernel...

burfog · on March 20, 2016

I've seen the VxWorks code. VxWorks is not a microkernel.

That BeOS demo did not heavily use privileged interactions. Mostly it showed computation which is the same on any OS. The best thing it showed was a process scheduler which was good at giving priority to things that a user would care about. A more interesting test would be serving files or building software.

I think one should be careful not to read too much into CVE numbers. People aren't exactly trying to mess with KeyKOS, Haiku, QNX, and other weird things. Few people want to bother. None of the Linux problems are inherently specific to monolithic design. The best you could say is that you might have a sandbox that makes things more difficult for the attacker. On the other hand, restarting means you give attackers more chances to succeed.

nickpsecurity · on March 20, 2016

The best thing you can say is that a bug in kernel code that hoses my whole system is several times less likely to happen. Suddenly, hackers or faults have to work through components' information flows. You keep ignoring that in your analyses. That's also why I brought up CVEs: it's impossible that the microkernels had as many in kernel mode, just by code size. There are still plenty to be found in privileged processes, but POLA and security checks are way easier when the memory model is intact.

Btw, one person here who wrote about QNX desktop demo mentioned doing productivity stuff while compiles ran in background with no lag. So there's that use case except not for BeOS. The link below will show you BFS was more like a combo of NoSQL DB, files, and streaming server:

http://arstechnica.com/information-technology/2010/06/the-be...

Due to its nature, compilation and build systems are about the slowest things you can do on it. I've seen numbers ranging from 2.5x to 20x slower than Linux, but they didn't share specs. I'd swap out the magic filesystem for a simpler one if on a development box. BeOS was aimed at creating, editing, and viewing streaming media, though. Did that very well.

Re sandbox more difficult

No kidding! That's the entire point: get it right or make it harder to beat at least. Monoliths on mainstream hardware are amusement parks with free rides and victims everywhere for attackers. Microkernels on COTS hardware and even modular, typed monoliths on POLA hardware are a series of sandboxes with adult supervision during play and movement. Quite a difference in number of problems showing up and damage done.

Re more chances to succeed

You keep repeating this too without evidence. Attackers need vulnerabilities to succeed. They'll know some to use ahead of time or they won't, if we're talking OS compromise. A flaw in one module lets them take one module no matter how many restarts. A flaw in two with a flow between them means they'll get it on the first try. This is why you design it so each flow and each individual op on them follows the security policy.

The only time restarts give attack opportunities is if you're using probabilistic tactics (eg ASLR) or they're waiting for an intermittent failure (eg MMU errata). Any high assurance system better not exclusively rely on tactics (ever) and should account for the latter (eg immunity-aware programming).
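To make the probabilistic case concrete, here is a minimal sketch of how restarts help an attacker only when the defense is a guessable secret. It assumes, hypothetically, that each restart re-randomizes independently and that one attempt succeeds with probability 2^-bits; `success_within` and `entropy_bits` are illustrative names, not measurements of any real ASLR implementation.

```python
def success_within(attempts: int, entropy_bits: int) -> float:
    """Chance that at least one of `attempts` independent guesses succeeds.

    Assumes each restart re-randomizes the layout, so every attempt
    succeeds independently with probability p = 2 ** -entropy_bits.
    """
    if attempts < 0 or entropy_bits < 0:
        raise ValueError("attempts and entropy_bits must be non-negative")
    p = 2.0 ** -entropy_bits
    return 1.0 - (1.0 - p) ** attempts


# More restarts means more chances, but only against the probabilistic defense.
for tries in (1, 16, 256, 4096):
    print(tries, success_within(tries, entropy_bits=8))
```

A deterministic flaw does not fit this model at all: it either works on the first try or never works, regardless of how many restarts the attacker is given.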

All in all, anything you've said about microkernel systems applies to monoliths in various ways. One model just limits system-hosing faults and hacks a lot better. The question is do you want to accept that risk to squeeze out max performance or eliminate that risk with acceptable performance? Microkernels choose risk reduction while mainstream monoliths choose performance.

nwmcsween · on March 20, 2016

An exokernel is definitely not a microkernel: one provides abstraction via (usually) server processes, while the other does so via a library, which is vastly cheaper overhead-wise. I do understand that there will be a need for some sort of IPC, just not to the extent of a microkernel.

nickpsecurity · on March 20, 2016

A microkernel is an abstraction over hardware with minimal code and API. An exokernel is a form of microkernel since it has these properties. It just does things very differently from most microkernels. Hence a name for that style.

pjmlp · on March 20, 2016

If they are a dead end, why do they rule the embedded and real time OSes?

bogomipz · on March 22, 2016

Except that Mac OS X's Mach is a microkernel, right? Or am I missing something?

nickpsecurity · on March 23, 2016

Mach was a microkernel that tried to do a bit too much. Performance and security stayed horrible. Other designs had acceptable to great performance or security. So, it's not representative of microkernels in general despite being an interesting research platform back in its day.

Now, Darwin starts with Mach, then basically adds BSD and the graphics stack onto it in kernel mode. So, stuff that would have user-mode isolation and performance penalties due to Mach bloat has no penalty but less protection.

So, OS X is clearly not a microkernel system so much as a monolith that incorporates a microkernel. Windows similarly has a microkernel near its foundation for organization purposes, I think. Linux is really modular inside, similar to microkernels, but clearly the similarity ends there. So, there are lots of hybrid results where monolith and microkernel styles are blended a bit for a compromise of benefits.

bogomipz · on March 22, 2016

What exactly is the QNX IPC system? How is it different from regular IPC?

Can mapping QNX into the top of userspace processes avoid the TLB flush/invalidate?

I vaguely remember Solaris Doors, which is similar to a pipe, correct? How does it relate to this discussion?

Thanks.

bogomipz · on March 22, 2016

Can you elaborate on what you mean by "CPU dispatcher"? I am not familiar with the term and have not heard it before. Is this something specific to certain SoC designs? I've never heard mention of this in x86 architectures.

EdSharkey · on March 20, 2016

Modern desktop computers are what, a million times faster than their counterparts from the 1980s? I will take the TLB misses and reduced efficiency. It's time for safety and reliability to take center stage. Microkernels seem like a great design for that, much more so than monolithic kernels.

bogomipz · on March 21, 2016

Does OS X's Mach count as a "fast" microkernel by your definition?