(Hi, I’m on the rust-analyzer team, but I’ve been less active for reasons that are clear in my bio.)
> Language servers are powerful because they can hook into the language’s runtime and compiler toolchain to get semantically correct answers to user queries. For example, suppose you have two versions of a pop function, one imported from a stack library, and another from a heap library. If you use a tool like the dumb-jump package in Emacs and you use it to jump to the definition for a call to pop, it might get confused as to where to go because it’s not sure what module is in scope at the point. A language server, on the other hand, should have access to this information and would not get confused.
You are correct that a language server will generally provide correct navigation/autocomplete, but a language server doesn’t necessarily need to hook into an existing compiler: a language server might be a latency-sensitive re-implementation of an existing compiler toolchain (rust-analyzer is the one I’m most familiar with, but the recent crop of new language servers tend to take this direction if the language’s compiler isn’t query-oriented).
> It is possible to use the language server for syntax highlighting. I am not aware of any particularly strong reasons why one would want to (or not want to) do this.
Since I spend a lot of time writing Rust, I’ll use Rust as an example: you can highlight a binding if it’s mutable or style an enum/struct differently. It’s one of those small things that makes a big impact once you get used to it: editors without semantic syntax highlighting (as it is called in the LSP specification) feel like they’re naked to me.
For another example of semantics-aware highlighting for Rust, see Flowistry, which allows you to select an expression in order to highlight all the code that either influences or is influenced by that expression: https://github.com/willcrichton/flowistry
Flowistry publishes their underlying rustc plugin as a crate, so all the analysis is already done for you, you'd just need to integrate the output with your editor of choice.
Thanks for sharing this project! That's a really neat idea and would help me a lot with understanding code written by others. It's unfortunate that it's only available for Rust, but it makes sense that the language design really lends itself to this.
Looking at this, I noticed how long it's been since I saw a new IDE feature that really made me more productive at understanding code. The last I can really remember was parameter inlay hints. It's a bummer - both the Jetbrains IDEs and VS Code seem to only focus on AI features I don't want, to the detriment of everything else.
Another pretty common application is to color unused bindings with a slightly faded-out color. So for e.g. with the TypeScript LSP, up at the top of the file you can instantly tell what imports are redundant because they're colored differently.
I think it's funny that some languages, like TypeScript, use a different programming language to improve their compile times.
Then there are languages like Rust who are like, whelp, we already use the fastest language, but compilation is still slow, so they have to resort to solutions like the rust-analyzer.
> they have to resort to solutions like the rust-analyzer.
It's not really a bad thing. IDEs want results ASAP, so a solution should focus on latency; query based compilers can compile just enough of the source to get the answer to a specific query, so they're a good answer.
Compiling a binary means compiling everything, though, so "compiling just the smallest amount of source for a query" isn't really a goal there; instead you want to optimise for throughput, and stuff like batching is a win.
These aren't language specific improvements, they're recognition that the two tasks are related, but have different goals.
We (the rust-analyzer team) have been aware of the slowness in Rowan for a while, but other things always took priority. Beyond allocation, Rowan is structured internally as a doubly-linked list to support mutating trees, but:
1. Mutation isn’t really worth it; the API isn’t user-friendly.
2. In most cases, it’s straight up faster to create a new parse tree and replace the existing one. Cache effects of a linked list vs. an arena!
In fairness, I don’t think we predicted just how large L1/L2 caches would get over the coming years.
One of the rust-analyzer co-maintainers (Chayim Friedman) already rewrote it, but we can’t integrate it yet, as about 40 assists (the little lightbulbs?) still rely on mutation. If you want something Rowan-like, I think syntree and Biome’s rowan fork are good options to look into.
How about cstree? https://crates.io/crates/cstree
Recently I found that traversal of a rowan tree can be quite slow. Not sure if there's a cheaper way to do it. Or could rowan be integrated with a bump allocator (bumpalo or oxc_allocator)?
I’d second Rain’s reply, but having gone from git to sapling to jujutsu, I feel like the jump from sapling to jujutsu was as big as the jump from git to sapling, in terms of “oh, this is a way nicer workflow”. I really like and miss Sapling’s interactive smart log, but I found jj’s conceptual simplicity to be more compelling than ISL. That said, VisualJJ and Jujutsu Kaizen (both listed on https://jj-vcs.github.io/jj/latest/community_tools/) might give you the ISL-style experience you’re looking for.
(disclosure: I started a jj company but I don’t have anything to sell yet.)
> Are there "central concepts" in the Jujutsu design?
I think a central concept in jj is that everything is a commit: there’s no staging index, stashes, or unstaged files, only commits. This has a few pretty neat implications:
- changes are automatically recorded in the current commit.
- commits are very cheap to create.
- as others have already said, rebases (and history manipulation in general!) are way cheaper than in git. It helps that jj as a whole is oriented around manipulating and rewriting commits. Stacked diffs fall out of this model for free.
- Since everything is now a commit, merge/rebase conflicts aren’t anything special anymore: a commit can simply be marked as having “a conflict” and you, as the user, can resolve the conflict at your leisure. While I’m sure you already know this, I’d rather make the subtext explicit: in git and mercurial, you’d need to resolve the conflict as it got materialized or abort the rebase entirely.
- because everything is a commit, jj has a universal undo button that falls out, effectively for free: `jj undo`. It feels so incredibly powerful to have a safety net!
> All other Git UIs I've used have been severely lacking, but Magit has made me significantly more productive with Git, and has convinced me of the "magic of git".
I can’t speak to this personally, but I’ve had friends who are magit users tell me that jj isn’t a big enough improvement (on a day-to-day, not conceptual, basis) over magit to switch at this time. It’d be really cool to have a magit-but-for-jj, but I don’t think anyone has written one yet.
That's a great point about allocation/memory management. As an example, rust-analyzer needs to free memory, but rustc's `free` is simply `std::process::exit`.
> I think the future definitely lies in compilers written to be “incremental first,” but this requires a major shift in mindset, as well as accepting significantly worse performance for batch compilation. It also further complicates the already very complicated task of writing compilers, especially for first-time language designers.
I'm in strong agreement with you, but I will say: I've really grown to love query-based approaches to compiler-shaped problems. Makes some really tricky cache/state issues go away.
> Are they? I feel like intellisense is largely a subset of what a compiler already has to do.
They are distinct! Well, not just intellisense, but pretty much everything. I'll paraphrase this blog post, but the best way to think about the difference between a traditional compiler and an IDE is that compilers are top-down (e.g., you start compiling a program from a compilation unit's entrypoint, a `lib.rs` or `main.rs` in Rust), but IDEs are cursor-centric: they're trying to compile/analyze the minimal amount of code necessary to understand the program. After all, the best way to go fast is to avoid unnecessary work!
> If you wanted to argue that intellisense is a subset of compiling and it can be done faster and more efficiently I could buy that argument. But if you’re going to declare the tasks are at odds with one another I’d love to hear specific details!
Beyond the philosophical/architectural difference I mentioned above, compilers typically have a one-way mapping from syntax to semantics, but to support things like refactors or assists, you often need to go the opposite way: from semantics to syntax. For instance, if you want to refactor a struct into an enum, you often need to find all instances of said struct, make the semantic change, then construct the new syntax tree from the semantics. For simple transformations like a struct to an enum, a purely syntax-based approach might work (albeit at the cost of accuracy if you have two structs with the same name), but you start to run into issues when you consider traits, interfaces (for example: think about how a type implements an interface in Go!), or generics.
It doesn't really make sense for a compiler to support the above use cases, but they are _foundational_ to an IDE. However, if a compiler is query-centric (as rustc is), then it's pretty feasible for rustc and rust-analyzer to share, for instance, the trait solver or the borrow checker (we're planning/scoping work on the former right now).
> I have a maybe wrong and bad opinion that LSP is actually at the wrong level. Right now every language needs to implement a from scratch implementation of their LSP server. These implementations are HUGE and take YEARS to develop. rust-analyzer is over 365,000 lines of code. And every language has their own massive, independent implementation.
rust-analyzer is a big codebase, but it's also less problematic than the raw numbers would make you think. rust-analyzer has a bunch of advanced functionality (term search https://github.com/rust-lang/rust-analyzer/pull/16092 and refactors), assists (nearly 20% of rust-analyzer!), and tests.
> I think there should be a common Intellisense Database file format for providing LSP or LSP-like capabilities. Ok sure there will still be per-language work to be done to implement the IDB format.
> But you'd get like 95% of the implementation for free for any LLVM language. And generating a common IDB format should be a lot simpler than implementing a kajillion LSP protocols.
I wouldn't say 95%. SCIP/LSIF can do the job for navigation, but that's only a subset of what you want from an IDE. For example:
- Intellisense/autocomplete is extremely latency-sensitive: milliseconds count. If you have features like Rust/Haskell's traits/typeclasses that allow writing blanket implementations like `impl<T> SomeTrait for T`, it's often faster to solve that trait bound on-the-fly than to store/persist that data.
- It'd be nice to handle features like refactors/assists/lightbulbs. That's going to result in a bunch of de novo code that needs to exist outside of a standard compiler, not counting all the supporting infrastructure.
> My dream world has a support file that contains: full debug symbols, full source code, and full intellisense data.
Rust tried something similar in 2017 with the Rust Language Server (RLS, https://github.com/rust-lang/rls). It worked, but most people found it too slow because it was invoking a batch compiler on every keystroke.