Chapter 2 is an excellent dive into some of the more unfamiliar and confusing features emulated by terminal emulators.
It would be neat to have an equivalent set-up using a GUI toolkit instead, but the terminal is good enough to work with.
In a broader sense, it's pretty frustrating to know that we are still emulating features (like ctrl-[s,q]) that only really made sense in the context of a physical terminal. The amount of work and frustration we could save with a modern equivalent to terminal emulators (without historical baggage) would be really significant.
> The amount of work and frustration we could save with a modern equivalent to terminal emulators (without historical baggage) would be really significant.
This has been occupying my mind for over a decade, but I can't come up with a design that I like.
For "TUI" apps: something like a GUI toolkit, but should remain reasonably performant over a poor connection (think how mosh tries to hide input latency), and the wire protocol should be something you can bang out in bash.
For shells: the shell should not be concerned with things like keyboard shortcuts, line wrapping, or positioning the cursor - this should be done on the client (terminal). You should be able to set a proportional font, if that's what you like. The shell should be able to trivially request displaying a picture, report on the status of a background task, or present an interactive progress bar.
Maybe this is two different projects / efforts, maybe one can build on top of another.
TL;DR: I'm about to go on a lengthy rant, because I too have spent over a decade obsessing over this subject. I feel like I've been starting to actually get somewhere, but there's still a lot up in the air. Also my meds are wearing off, so all eighdy HD's are present. Read at your own risk...
-----
It's a game of compromises. The first problem is to decide what goals you value most.
The current compromises are decided by the goal of backwards compatibility. In that sense, we are "steeped in tradition". It's a mess, but at least decisions have been made; and we are recognizing - right here, right now - just how significant that fact is on its own.
> For "TUI" apps: something like a GUI toolkit, but should remain reasonably performant over a poor connection
That sounds a lot like what web browsers are; which reminds me of the sheer magnitude of options we have available. It's easy to get stuck with decision fatigue. We aren't making a universe simulator here, so we need to break this project down to a smaller domain. The constraints of the problem domain are what define a project, so pinning them down is 90% of the work.
I'm thinking back to what my motivation was in the first place: frustration with the status quo. Having a simple text UI is evidently useful. The shell is the first and last thing I use: it's my home, and that's why I care so much in the first place. Something with generally the same constraints of a terminal emulator & shell is what I want. My frustration lies with the complexity in both the terminal emulator itself, and the way a shell (and every other terminal-based program) interfaces with that. So what's the bare minimum that these two things need to function?
First, the terminal needs input and output. But these needs are really just an extension of the text-based program's needs, right? So what does a shell or a text editor need to function? Well, it needs input and output. Hold on now, do we really need the terminal emulator? Why not have the shell do that?
Well, there are some reasons: Shells can be run without any IO. Inside shells, we can pipe programs into each other. Those pipes are made of text. Is that the domain we want? Well, using text as the interface is one of my main frustrations with terminal emulators: even the ASCII table itself shoves key-presses into the domain of text, and that's an ugly hack I don't want to keep. There are functional needs for dropping that, too: "ctrl-m" is the same ASCII code as "enter", and keyboard layouts aren't always very compatible with ASCII in the first place. There are too many assumptions baked in here.
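To make the "ctrl-m is enter" collision concrete, here is a minimal sketch of the folding that terminals traditionally apply to Ctrl combinations. The helper name `ctrl` is made up for illustration; the byte values are plain ASCII.

```python
# Sketch: how a terminal folds Ctrl+<key> into the ASCII control range.
# The helper name `ctrl` is hypothetical; the constants are standard ASCII codes.

def ctrl(key: str) -> int:
    """Return the byte traditionally sent for Ctrl+<key>."""
    # Only the low five bits of the key's ASCII code survive.
    return ord(key) & 0x1F


ENTER = ord("\r")  # carriage return, 0x0D
ESCAPE = 0x1B      # ESC
XOFF = 0x13        # DC3, "pause output" (ctrl-s)
XON = 0x11         # DC1, "resume output" (ctrl-q)

for name, code in [("m", ENTER), ("[", ESCAPE), ("s", XOFF), ("q", XON)]:
    print(f"ctrl-{name} -> 0x{ctrl(name):02X}  collides: {ctrl(name) == code}")
```

Every pair collides, so a program reading bytes alone cannot tell ctrl-m from enter or ctrl-[ from escape.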
And while we are on the topic of ASCII, escape characters are the ugliest hack in here. On the terminal, we use them for input - to extend the ASCII table to accommodate a few more keys - and for output - for decoration like text color, boldness, and even flashing background. It's like a markup language, but made out of non-drawable characters that were supposed to be typed with the escape key. There are so many ugly edge cases here, you could write a novel about them!
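For anyone who hasn't looked at the output side up close, a rough sketch of what that "markup" looks like on the wire. The helper names `sgr` and `strip_csi` are invented for this example; the parameters 1 (bold), 31 (red foreground) and 0 (reset) are standard SGR codes.

```python
import re

ESC = "\x1b"
RESET = f"{ESC}[0m"  # SGR 0: reset all attributes


def sgr(text: str, *params: int) -> str:
    """Wrap text in an SGR ("select graphic rendition") sequence."""
    codes = ";".join(str(p) for p in params)
    return f"{ESC}[{codes}m{text}{RESET}"


# CSI sequence: ESC [ , parameter bytes, intermediate bytes, one final byte.
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_csi(text: str) -> str:
    """Remove CSI sequences, leaving only the drawable text."""
    return CSI_RE.sub("", text)


bold_red = sgr("error", 1, 31)
print(repr(bold_red))       # the bytes a program actually writes
print(strip_csi(bold_red))  # what is left once the markup is removed
```

The decoration and the content travel in the same byte stream, which is why anything that wants just the text has to parse the sequences back out.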
But shells aren't always about plain text. You can pipe gzip to dd or a file, and it will totally work. But if you pipe it to your terminal output, weird things can happen. That's because some raw binary looks like escape characters, and we are already solving that problem by making user input a separate thing. Progress!
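A small sketch of why raw binary and terminals don't mix. It only illustrates the collision, not any proposed design, and the function name `escape_for_display` is hypothetical. The sample bytes are the gzip magic number followed by a sequence a terminal would read as "clear the screen".

```python
def escape_for_display(data: bytes) -> str:
    """Render bytes so that nothing can be interpreted as a control sequence."""
    out = []
    for b in data:
        if b in (0x09, 0x0A) or 0x20 <= b < 0x7F:
            out.append(chr(b))         # tab, newline, printable ASCII pass through
        else:
            out.append(f"\\x{b:02x}")  # everything else is shown, not executed
    return "".join(out)


# 0x1f 0x8b is the gzip magic number; ESC [ 2 J erases the display.
sample = b"\x1f\x8b\x1b[2Jok\n"
print(escape_for_display(sample), end="")
```

Printing `sample` directly would hand the ESC byte to the terminal; printing the escaped form shows it as harmless text.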
So what do we do with keystrokes? Let the user decide! But what data will end up getting sent to the program? The status quo is to be a REPL by default. That's the foundation for a lot of the ugly assumptions we just dropped. It's the reason that the OP has two whole chapters just on disabling that behavior. But most programs we invoke from a shell are either printing, or running a REPL of their own. And a lot of them are bad at it, which is why GNU readline is a thing. There's even rlwrap, which wraps a program in readline! That's very helpful, but I would love to never need it again. Plus, readline makes some UI decisions of its own, like its emacs-like key bindings, and I would rather it not.
So the big question here is about how UI should flow. That question gets me all kinds of excited, and all kinds of overwhelmed. This is where I start shifting my focus away from shells and back to text editors.
---
Emacs has its own shell. But Emacs isn't a dumb terminal emulator. It's built out of lisp, which represents data as text in a much cleaner way than ASCII escape sequences could ever dream of. And the way its interface flows is definitely more interesting than a REPL. In fact, some of the crazy shit you can do in eshell is one of the main things that get me thinking about shells and terminals and the ways they hold me back.
And if you really want a good example of how text-based UI can flow, look no further than Vim. The modal flow is elegant and efficient in ways that the primitive REPL could only dream of.
Yet both of these wonderful programs bring me back to the very same frustrations I have with terminal emulators and shells: they are steeped in tradition!
Vim was my first. I was happy to sin against the Church of Emacs, the stubborn atheist I am, and vi is a seductive mistress. But while vi could do most anything I wanted, and it could bend and extend in many exciting ways, I still managed to find its limits. When I started typing with a non-standard keyboard layout (workman), I tried remapping Vim's keys to compensate. First you remap a key, then you remap the one it replaced, then the next, etc... and eventually you hit a circular dependency. It can't be done. So I decided to take the opportunity to learn me an Emacs.
Emacs is incredibly customizable. If you can navigate your way into its elisp structures, you can mold it into anything you want. But while what Emacs already is by default may be easy to decorate and add to, it's not so easily unmade. I had big ideas about making my own vi-style modal editing keybindings once, but I never could quite find my way. Every move I made, I found myself tripping over what was already there; what I had come to replace. The whole time, in the back of my mind, my conscience screamed, "We just need a clean slate! Emacs without the defaults. VoidMACS."
It's the same problem everywhere I look: UX assumptions. Predetermined flow. Defaults. My ultimate arch-villain.
Every time I think about this, I fantasize about the perfect user experience: one with zero assumptions. A modular UI, totally factored away from data and functions. That's what shells got right in the first place! What if we did that with GUI?
Take a moment and think about your favorite shell utilities. What do they look like? What do they do? What input do they need? How do they flow? Here we are, at home, in the REPL. But what if we stepped out, right through the fourth wall, and looked at it? What if we took that output and forked it into a file, or another utility? That's where the shell was born. Can it be bourne again again?
Emacs tried. It took the flow, and put it into a text buffer. That let us do some non-linear flow. Emacs also recreated some utilities in lisp, so that they could interface with more complicated structures, and react to different events like hooks. Over time, some structure emerged, but it was too organic. Using Emacs today means wading through traditions every bit as deep as the shell itself. Like a fine aged whiskey, it soothes us into a drunken haze.
The hangover leaves me desperate for clarity. Turn off the stimulus. Give me a blank slate. An empty canvas. You've got a default startup sequence? Drop it. Got a welcome screen? Drop it. Default scratch buffer? Drop it. Default keybinds? Drop. It. I want nothing. I don't care how broken that makes it!
In Episode 2 of Firefly, Mal speculates about where the mysterious Reavers came from: "They got out to the edge of the galaxy, to that place of nothin' and that's what they became." I'm not afraid: Give me the void, and let me get to work. I am the user. I am God!
When I start my text editor, I want nothing there but what I configured. It's not that scary of a concept when you think about it: "defined" doesn't have to be from scratch. What I really envision is a choice of options. Like libraries or plugins that you drop into the top of your config file. Someone made a Vi-clone mode you like? Put it in your config. Someone made a Sublime text clone mode you like? Put it in your config. Someone made a Windows Vista clone with a half-broken MS notepad that you like? I don't give a fuck about your personal preferences, put it in your config!
No one needs to know what your interface looks like, or what your keys are mapped to. They just need to give you two simple things: input and output.
Want to edit text? Import the text editing mod, then map a part of your windowing system to the text renderer (output), and map some keybindings (input) to the text-editing functions.
Want to rebase a git tree? Import the interactive git editor mod, then map a part of your windowing system to the interactive git UI renderer (output), and map some keybindings (input) to the interactive git functions.
Want to play Tetris? Import the Tetris-clone mod, then map a part of your windowing system to the tetris clone renderer (output), and map some keybindings (input) to the game movement functions.
Want a web browser? Import the Firefox-fork for voidmacs mod, then map a part of your windowing system to the voidfox renderer (output), and map some keybindings/mouse events (input) to it.
Want a tiling window manager? Import the i3-for-voidmacs mod, then map your windowing system to a tile of windowing system instances (output), and map some keybindings (input) to the window split, move, and close functions.
Want a Vim-like modal interface? Put your regular basic text-editing keybindings in a new namespace called "insert mode" (plus a key to go back to normal-mode), and make some new "normal mode" keybinds. Put all of that in a mod, and import that into your config. Done. Works everywhere.
Want an Alexa/Siri/GoogleHome/Bixby/Jarvis assistant to bitch at? Import the mod, then set the default speaker and whatever other mod input functions you want integrated (output), configure the wake word hook to "HAL 9000" (input), and tell it to sudo make me a sandwich.
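The Vim-like modal example above can be sketched in a few lines. This is only an illustration under made-up names (`Editor`, `KEYMAPS`, the `"<any>"` fallback key, and all the action functions are hypothetical); the point is that the editing functions know nothing about keys, and the keymaps are just per-mode namespaces the user owns.

```python
class Editor:
    """Holds state and dispatches keys through the active mode's keymap."""

    def __init__(self, keymaps, start_mode):
        self.keymaps = keymaps
        self.mode = start_mode
        self.text = []
        self.cursor = 0

    def press(self, key):
        keymap = self.keymaps[self.mode]
        # Fall back to the mode's catch-all binding, if it has one.
        action = keymap.get(key, keymap.get("<any>"))
        if action is not None:
            action(self, key)


# --- the "text editing mod": functions only, no bindings ---
def insert_key(ed, key):
    ed.text.insert(ed.cursor, key)
    ed.cursor += 1

def move_left(ed, _key):
    ed.cursor = max(0, ed.cursor - 1)

def move_right(ed, _key):
    ed.cursor = min(len(ed.text), ed.cursor + 1)

def delete_under_cursor(ed, _key):
    if ed.cursor < len(ed.text):
        del ed.text[ed.cursor]

def switch_to(mode):
    def action(ed, _key):
        ed.mode = mode
    return action


# --- the user's config: the only place keys and functions meet ---
KEYMAPS = {
    "insert": {"<esc>": switch_to("normal"), "<any>": insert_key},
    "normal": {"i": switch_to("insert"), "h": move_left,
               "l": move_right, "x": delete_under_cursor},
}

ed = Editor(KEYMAPS, "normal")
for key in ["i", "h", "i", "<esc>", "h", "h", "x"]:
    ed.press(key)
print("".join(ed.text), ed.mode)
```

Swapping in a different layout means replacing `KEYMAPS` and nothing else, so there is nothing to undo first.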
---
Confused yet? I've been ranting for a while now.
Basically, I want an Emacs, but one where modules/packages never define their user interface. Or maybe a module/package defines a user interface, but that's all it does. Never both.
That way, modules/packages are guaranteed to be composable. Like neatly factored out functions, you can stack them together. You can weave them into whatever convoluted tapestry your heart desires. And you never have to undo any configuration. Because undoing configuration is pain.
As soon as a module/package defines functionality and user interface, it ties a knot. That is not allowed. Only the user can do that. Otherwise, at some point, the user is going to have to go around untying knots all over the place.
I could keep ranting forever, but I'm starting to worry about the character limit on HN... So I bid thee good afternoon, good evening, and goodnight.
I think you're raising a couple important questions here, but I believe building apps that are actually frameworks is a trap.
Have you looked at Awesome (https://awesomewm.org/)? It's a window manager that actually advertises itself as a framework. It has a default example config, that you can just completely ignore, and instead, use the API to build your own WM from scratch, using high-level primitives such as windows and tags, rather than having to deal with X11/XCB directly.
I've used Awesome a lot in the 3.x days, and one of the major pain points was always upgrading. My old config (only slightly adapted from the default) would regularly break between minor releases. It was an experience similar to upgrading Django projects - something I'd charge money for nowadays, not inflict upon myself voluntarily.
I have a similar experience with Emacs: I'd rather have a 10 line config and sane defaults than a 10k line program that I end up working on more often than on my actual projects. If not for magit, I'd have switched to VS Code a long time ago.
I'm afraid most frameworks-disguised-as-apps end up this way. I want the exact opposite - a simple GUI framework, with a native feel, that makes writing simple apps (such as the text editor from TFA) as simple as banging out escape codes to stdout, but without emulating half a century of technical debt. And nope, using a web browser for that purpose doesn't fit my definition of simple ;)
BTW feel free to reply on email if you feel constrained by HN post length.
I didn't plan on ranting either, but I got started and kept enjoying it. It's satisfying to put ideas into words, and we don't really have explicit spaces where these abstract discussions can belong. There is a subject here I don't really have a name for, like a frontier whose edge sits between all of us. Imposing, yet just out of reach. I guess I should really start that blog I've been procrastinating...
In the meantime, here's some more rant:
A framework is not quite what I want. The pieces are tied too closely together. AwesomeWM is great, and so is xmonad, but they both suffer from difficult configuration. Configuration needs to be trivial to approach, but not limited in extensibility. I settled on i3 because I prefer the underlying structure and don't find much to be missing. The i3 devs guessed a lot right about what I would want in a tiling wm. Even so, I do get excited at the idea of configuring Xmonad to perfection. I'm just not that good at Haskell yet.
The exciting thing about window managers is the way they fit into the Xorg ecosystem. They are decoupled enough both from the software they window, and from the xserver they draw to, that you can swap out the entire window manager process - even in a running xsession - and everything will continue working. That's a quality I want in the modules I was talking about.
Emacs gets close, but the approach is still pretty confusing. There is plenty of documentation, but you have to learn too much context to get started.
And most of that context is the way that different functionalities are tangled together. The power of lisp is that you can pull on a thread and stretch it - without breaking it away from other components - then tie your new functionality in; but you have to know where to pull, and any update to Emacs or its packages might change the pattern such that you have to pull on new threads.
That's the point I am struggling to articulate most. Modules need to exist in a neat predictable data structure. The substance of what a module is has to be self contained. As soon as a module cares about what's inside another module, that can go both ways and create a circular dependency.
But silos have their own limitations. Neither window managers nor Xorg know how to do display scaling. That put the onus on every GUI software package, which didn't go over well. Some sort of introspection is really useful, which is why Wayland moved the compositor into the window manager.
But Wayland made a fatal mistake: they started without a framework. They made the Weston reference compositor, and just waited for people to copy or fork it. That's begging for repeated (and incomplete) work, and incompatibility. Eventually someone made the wlroots library to encapsulate that repeated work.
Having a library means changes can be made that propagate up. But the exciting thing about configurable software is the ability to put changes on the top that propagate down.
There are a few places where I've seen that kind of bi-directional introspection at work. One is Photoshop layers. Layers can be stacked above or below each other. A layer's pixels can be added, subtracted, multiplied, etc. to the next. You can use a mask to stencil away parts of a layer. If we treat modules as layers, we can decide what order they are stacked, and what function "stacking" does.
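A toy sketch of that layer idea, with "pixels" as flat lists of 0-255 values and modules as plain dicts. All names here (`stack`, `add`, `multiply`, `masked`, `override`, and the sample module dicts) are invented for illustration; the only thing being shown is that the stacking order and the stacking function are both chosen by whoever builds the stack.

```python
from functools import reduce


def stack(layers, blend):
    """Fold layers bottom-to-top using a caller-chosen blend function."""
    return reduce(blend, layers)


# Pixel-style blends: each takes the layer below and the layer above.
def add(below, above):
    return [min(255, b + a) for b, a in zip(below, above)]

def multiply(below, above):
    return [b * a // 255 for b, a in zip(below, above)]

def masked(mask):
    """Only let the upper layer through where the mask is set."""
    def blend(below, above):
        return [a if m else b for b, a, m in zip(below, above, mask)]
    return blend


base = [100, 100, 100, 100]
top = [200, 50, 255, 0]
print(stack([base, top], add))
print(stack([base, top], multiply))
print(stack([base, top], masked([1, 0, 0, 1])))


# The same fold applied to "modules as layers": later layers override earlier ones.
def override(below, above):
    return {**below, **above}

editing_mod = {"h": "move_left", "x": "delete_char"}
user_layer = {"h": "show_help"}
print(stack([editing_mod, user_layer], override))
```

Changing `blend` changes what "stacking" means without touching any layer, which is the property being asked for.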
Another place is Nix derivations. There are no variables in Nix, only constants. But constants can be overridden anywhere. That's a lot like drawing on a new layer. The trouble with Nix is that that is done with a pinch of pure-functional magic, and an embarrassing amount of boilerplate. As a user, I often find myself standing in the middle of the stack, unsure how to look up. And I'm not quite sure what's at the bottom, either. There's also not a clear consensus on how new derivations should be structured.
Something like NixOS on the text editor scale would be great. Most good text editors have some kind of package manager. I would love to take that to its extreme, and have every part of the editor be a package. All that's left is to decide what domain a package can encompass, and what data structure stacks them together.
Once in a while I see something about a text editor using the GPU to accelerate rendering of fonts. I would love to see some tutorial material on this. Generally speaking using the GPU is a complete mystery to me, as is anything to do with rendering fonts.
> We can no longer just feed the substring of render that we want to print right into abAppend(). We’ll have to do it character-by-character from now on.
The more functionality you add to a program, the more complicated and slower it becomes. You can add optimizations and abstraction layers. But the best way to write a bug free and efficient program is to keep it small and simple.
If you do need those functions though, you can make your life easier by writing tests before implementing a feature. For example, before writing the code that colors numbers, write a test that checks that numbers have a color, and a test that measures the time it takes to render a line - and have it fail if rendering becomes significantly slower.
So as an exercise you could write a test for each step in the tutorial.
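A minimal sketch of what those two tests could look like. Everything here is an assumption for illustration: `highlight_numbers` is a stand-in, not the tutorial's code; red (`ESC[31m`) with `ESC[39m` as the reset is just one plausible color choice; and the two-second budget is arbitrary and should be tuned per machine.

```python
import re
import time

RED = "\x1b[31m"            # SGR 31: red foreground (assumed color for digits)
DEFAULT_COLOR = "\x1b[39m"  # SGR 39: default foreground
NUMBER_RE = re.compile(r"\d+")


def highlight_numbers(line: str) -> str:
    """Stand-in for the feature under test: wrap each run of digits in color."""
    return NUMBER_RE.sub(lambda m: f"{RED}{m.group()}{DEFAULT_COLOR}", line)


def test_numbers_are_colored():
    assert highlight_numbers("x = 42;") == f"x = {RED}42{DEFAULT_COLOR};"


def test_plain_text_is_untouched():
    assert highlight_numbers("no digits here") == "no digits here"


def test_render_stays_fast(budget_seconds: float = 2.0):
    # Coarse regression guard: fails only if rendering gets dramatically slower.
    line = "int value = 12345; " * 50
    start = time.perf_counter()
    for _ in range(1000):
        highlight_numbers(line)
    assert time.perf_counter() - start < budget_seconds


if __name__ == "__main__":
    test_numbers_are_colored()
    test_plain_text_is_untouched()
    test_render_stays_fast()
    print("all tests passed")
```

Written first, the color test fails until the feature exists, and the timing test keeps later steps from quietly making rendering slow.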
- I wound up growing and expanding this package to the full Win32 surface area
- Another OSS contributor took and extended my work to support WinRT
Now there are tens of Dart packages that use the Windows interop work that all started with this little tutorial on a completely unrelated area. So thank you to the author!
Thank you for following up. I refactored my original Go version into somewhat more idiomatic Go. I'll compare with your Dart version to see if the packages I ended up with are "organic" or just determined by Go peculiarities.
This is really cool! I’m definitely gonna try this out in one of my favorite languages. I love these types of project-based learning. You learn so much by going through them.
Everybody? It's been the language for open source software since the beginning. "Who likes C?" is a different question, but using it is often just plain practical. Almost all mainstream languages are syntactically close to C, so it's easy to read. Documentation for using C interfaces ships with every *nix, making it easy to write. Memory safety and footguns don't matter at all for the vast majority of programs, especially not the short and simple "do-one-thing"-type programs that are most non-commercial code.