Linus built an incredibly elegant and simple underlying model for git. For what it successfully does - distributed version control - it is remarkably simple and easy to grasp if you want to.
However, this model was not mapped well to the high level concepts that the typical user of a VCS operates in. This is the biggest issue of git: it's hard to make sense of it by its UI if you do not understand how it works under the hood. I struggled until I read the Pro Git book.
I wouldn't go as far as to compare this to knowing about filesystem data structures for saving a jpeg file. It's more like using an old school file dialog where you just see the bare file system and you need to know your way around your drive.
Git's other (compounding) problem is how the CLI is an inconsistent mess.
Why do you create a branch via the "git checkout" command? Why do you delete tags using "git tag -d" but delete stashes using "git stash drop"? If you want to blow away local uncommitted changes, you can use "git reset", "git reset --hard" or "git checkout (file)" - which (I think) all do totally different things.
Git's data model may be elegant, but it's hard to appreciate it through the tangled mess of git command line options.
I know what you mean (deleting a branch in a remote is a very unintuitive syntax to me at first glance, especially), but these are perhaps not the best examples:
> Why do you create a branch via the "git checkout" command?
That's a shortcut for "git branch (name)", then "git checkout (name)". Or the newer "switch" which is more obvious.
> Why do you delete tags using "git tag -d" but delete stashes using "git stash drop"?
Because the stash is more like a stack, and tags are not, so "drop" without a parameter is a valid and very usual command. Yes, it feels inconsistent, but allowing "git stash -d" without a parameter would probably not be better.
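To make the stack point concrete, here is a minimal sketch in a throwaway repository (the file name and stash messages are made up for illustration). It pushes two stash entries and then runs "git stash drop" with no argument, which removes the most recent one:

```shell
set -eu
# Throwaway repository; all names below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
printf 'base\n' > file.txt
git add file.txt
git commit -q -m "base"

# Push two stash entries; the newest one always sits at stash@{0}.
printf 'first\n' >> file.txt
git stash push -q -m "first"
printf 'second\n' >> file.txt
git stash push -q -m "second"
count_before=$(git stash list | wc -l | tr -d ' ')

# No argument: drops the top of the stack, i.e. the "second" entry.
git stash drop -q
count_after=$(git stash list | wc -l | tr -d ' ')
remaining=$(git stash list)
echo "$count_before -> $count_after, remaining: $remaining"
```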
> If you want to blow away local uncommitted changes, you can use "git reset", "git reset --hard" or "git checkout (file)" - which (I think) all do totally different things.
These do all do different things, so that's why they all exist. "git restore (file)" was introduced a few years ago (with "switch", mentioned earlier) to make the last one more obvious, since that's indeed always been an uncomfortable syntax for a core operation.
Git's a very powerful tool, originally aimed at a very complex code base run by experts, and was written very quickly as an emergency replacement for BitKeeper. This rushed development and target audience do show through even today, but it's being annealed over time. Nevertheless, it's so good at what it does that it's taken out nearly every other VCS by just existing (ok, and the network effects of GitHub, but they chose it for a reason too).
> Why do you create a branch via the "git checkout" command?
Now there is also `git switch --create` / `git switch -c` for this.
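For anyone who wants to see that the three spellings end up in the same place, here is a small sketch in a scratch repository (the branch names topic-a, topic-b and topic-c are made up; "git switch" needs Git 2.23 or newer):

```shell
set -eu
# Scratch repository; branch names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "initial"
base=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# 1. The explicit two-step form.
git branch topic-a
git checkout -q topic-a
after_two_step=$(git symbolic-ref --short HEAD)

# 2. The classic shortcut.
git checkout -q "$base"
git checkout -q -b topic-b
after_shortcut=$(git symbolic-ref --short HEAD)

# 3. The newer spelling.
git switch -q "$base"
git switch -q -c topic-c
after_switch=$(git symbolic-ref --short HEAD)

echo "$after_two_step $after_shortcut $after_switch"
```

In each case a new branch is created at the current commit and HEAD is moved onto it.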
Perhaps in time, there will appear different front end dialects for git. Like the statistics programming language R has the base R language, data.table dialect, and the tidyverse dialect.
There is also `git restore` for reverting local or staged changes, the other common use of `git checkout`.
I am slowly remapping my keystroke muscle memory away from the footgun that is `git checkout` and using restore/switch. But boy is it hard to undo a decade of practice.
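A rough sketch of the two `git restore` uses, in a scratch repository with a made-up file name (requires Git 2.23 or newer): `--staged` takes a change back out of the index, and the plain form throws away the working tree edit.

```shell
set -eu
# Scratch repository; notes.txt is a hypothetical file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
printf 'v1\n' > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Edit and stage the file.
printf 'v2\n' > notes.txt
git add notes.txt

# Unstage it: the index goes back to HEAD, the edit stays on disk.
git restore --staged notes.txt
staged=$(git diff --cached --name-only)
content_after_unstage=$(cat notes.txt)

# Discard the edit: the working tree file goes back to the index version.
git restore notes.txt
content_after_restore=$(cat notes.txt)

echo "staged='$staged' then '$content_after_unstage' then '$content_after_restore'"
```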
I use Git daily and literally couldn't tell you how to use it because I've aliased every single command it has. I basically have a wrapper over it and to teach anyone how to use Git I have to peel my layer away and check what I have things aliased to.
It feels like terrible ad hoc user design built over an otherwise extremely elegant and clean data model.
I’ve been toying with various ways to do this. Some alternatives I’ve thought of:
- a “draft commit”, where you can amend a bunch of changes into a commit before finalizing it.
- default to “git commit --all”, and allow the user to “git commit --patch” where needed
- as 'anthomtb says, `git stash push --patch` (I also find myself wishing for `git stash pop --patch` so you can shove bits in and out of a stash as needed.)
a) terminology: it’s one less concept to have to wrap your head around; and
b) a draft commit would also have a draft commit message, and (though I’m admittedly not sure about how well this part would work), draft parents (probably supporting refs rather than just commits as parents) so you can have multiple of them and shuffle them around conveniently. (This also sort of subsumes the stash as well.)
I made a preliminary stab at this a while back, though it has some awkwardness and I haven’t had a chance to revisit it recently: https://github.com/wolfgang42/git-draft/
By making multiple partial commits, then squashing stuff together as needed. Or amending the top commit. Or amending non-top commits (making them "absorb" the changes).
Which is what I do with mercurial (& evolve), and I am happy I don't have a super-special extra concept to clutter up my already overflowing brain.
The way I think of it, the staging area is an incrementally buildable commit that is not called a commit because commits aren't incrementally buildable. So if you allow commits to be incrementally buildable, then you don't need the staging area. The only difference is you need to come up with a message for the commit when you first start to build it. Or not—make it empty, then amend it when it becomes something worth naming.
The point of the comment is that all 3 terms refer to the same thing. `git add` modifies the staging area, `git diff --cached` shows the diff of things in the staging area, and `git stash --keep-index` stashes your changes but leaves whatever is staged in place (I think? IDK, I never use it). Maybe pick one?
Instead of git add --patch, there could be a --patch option to git commit. You can already edit the latest commit with git commit --amend, so you'd have to do git commit -p to get the commit started, and then continue with git commit --amend -p.
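The "start a commit, then keep growing it" flow can already be approximated today. A non-interactive sketch, using whole files instead of `-p` hunks so it can run unattended (file names and the commit message are made up):

```shell
set -eu
# Scratch repository; a.txt and b.txt are hypothetical files.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
printf 'base\n' > a.txt
printf 'base\n' > b.txt
git add a.txt b.txt
git commit -q -m "base"

printf 'edit\n' >> a.txt
printf 'edit\n' >> b.txt

# Start the commit with only part of the work.
git add a.txt
git commit -q -m "Edit both files"
count_before=$(git rev-list --count HEAD)

# Grow the same commit instead of creating a new one.
git add b.txt
git commit -q --amend --no-edit
count_after=$(git rev-list --count HEAD)
files=$(git diff-tree --no-commit-id --name-only -r HEAD)

echo "commits: $count_before -> $count_after"
echo "$files"
```

The commit count stays the same across the amend, and the amended commit contains both files.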
> Why do you create a branch via the "git checkout" command?
git checkout -b foo
is just a shortcut for
git branch foo
git checkout foo
> Why do you delete tags using "git tag -d" but delete stashes using "git stash drop"?
That is inconsistent. One has an interface of `git <thing> <options-to-manage-thing>` and the other `git <thing> <subcommand-actions-for-thing>`. I imagine what happened is the former was the original and was probably thought to be sufficient, but then it wasn't for `stash` and the latter was introduced for more flexibility. The inconsistency is probably from backwards compatibility.
It might be worth noting that, at least as far as I know, git was like the first to use or at least popularize subcommands. It'd be understandable if they didn't include support for sub-sub-commands from the get-go.
> If you want to blow away local uncommitted changes, you can use "git reset", "git reset --hard" or "git checkout (file)" - which (I think) all do totally different things.
git-reset is mainly about resetting the branch, index, and/or working tree to a given ref. git-checkout is mainly about checking out a ref, setting HEAD and syncing the working tree to it. They're different things with an overlap. I would say that's not really inconsistent. It would only appear so when one only learns specific patterns of commands for subsets of their function, like "blow away local uncommitted changes", which in this case fits in their overlap.
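A small sketch of where the three commands differ, run in a scratch repository with a made-up file. It tracks what ends up in the index and in the working tree after each one:

```shell
set -eu
# Scratch repository; file.txt is a hypothetical file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
printf 'v1\n' > file.txt
git add file.txt
git commit -q -m "v1"

# Stage an edit, then plain "git reset": index back to HEAD, file untouched.
printf 'v2\n' > file.txt
git add file.txt
git reset -q
staged_after_reset=$(git diff --cached --name-only)
worktree_after_reset=$(cat file.txt)

# "git checkout -- file": working tree file overwritten from the index.
git checkout -q -- file.txt
worktree_after_checkout=$(cat file.txt)

# Stage another edit, then "git reset --hard": index AND working tree back to HEAD.
printf 'v3\n' > file.txt
git add file.txt
git reset -q --hard
worktree_after_hard=$(cat file.txt)
status_after_hard=$(git status --porcelain)

echo "$worktree_after_reset $worktree_after_checkout $worktree_after_hard"
```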
That can all be true, but it doesn't make git any less of a nightmare to learn. Even very experienced git users make mistakes and need to google things all the time. It's become almost a trope at multiple places I've worked that every once in a while someone makes a mess of their git repository and needs to call for help from one of the 2 people in the entire office who understand git enough to unbreak it.
Another annoying inconsistency: git tag prints a list of tags. Git branch prints a list of branches. Git commit prints ... modified files? And git stash modifies the stash. You need git stash list to see the stash. What!?
I get it; it's a complex tool. It's managing 4 different storage areas for your code (the repository, the working tree, the staging area/index and the stash). It also manages tags, branches, remotes and configuration. And it has multiple networking interfaces.
But I can't escape the conclusion that it's just not a very good user interface. A good interface wouldn't be so hard to use. Redis is more complex, but I don't make so many mistakes using the redis cli. Awk is more powerful - but it's much more intuitive. And cargo probably has more subcommands than git does, but I don't get lost in them. Git? Git is a mess.
Git wasn't the only DVCS with an elegant model. Mercurial, Darcs and Fossil came out around the same time. All are equally elegant in their own ways, and all have a much friendlier UI than Git.
So it is possible to have both an elegant implementation, and a friendly UI that doesn't force the user to understand the internals to work with the tool.
Git's elegant model is not why it won out. Despite its shortcomings, I suspect the cult of personality around Linus had a big role in that, as well as major services like GitHub.
So what's the underlying model that makes the staging area make sense? Why does stash followed by unstash leave my checkout in a different state from what it was before?
The staging area is where you construct your next commit— giving you a middle ground between your changes in your local working copy and the last actual commit so that, if you don’t want to commit everything that you’ve changed in a single commit, you can do that.
(If you always want to commit everything you’ve changed, you can do that too— always commit with ‘git commit -a’ and only use ‘git add’ when dealing with new files that you want to add to version control.)
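Both styles side by side, as a sketch in a scratch repository (file names are made up). The first commit goes through the staging area and picks up only one of two edited files; the second uses `git commit -a`, which sweeps up every tracked change but leaves a brand new file alone:

```shell
set -eu
# Scratch repository; all file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
printf '1\n' > feature.txt
printf '1\n' > scratch.txt
git add feature.txt scratch.txt
git commit -q -m "base"

# Edit both files but stage only one.
printf '2\n' > feature.txt
printf '2\n' > scratch.txt
git add feature.txt
staged=$(git diff --cached --name-only)     # what the next commit will contain
unstaged=$(git diff --name-only)            # what stays behind
git commit -q -m "Update feature"
committed=$(git diff-tree --no-commit-id --name-only -r HEAD)

# "commit -a" takes all tracked changes, but not untracked files.
printf 'new\n' > untracked.txt
git commit -q -a -m "Update scratch"
committed_all=$(git diff-tree --no-commit-id --name-only -r HEAD)
leftover=$(git status --porcelain)

echo "first commit: $committed, second commit: $committed_all, leftover: $leftover"
```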
Hmm. With Mercurial I just use "commit --interactive" if I only want to commit part of my changes, and I always found that more intuitive and less confusing than having to mentally keep track of Git's staging area as well.
The git analogue to that would be `git commit --interactive`, or using `git status` to check the staging area while using `git add`. Keeping mental check of it is the worst solution imho.
You can also have your git porcelain handle it. Magit for example has a great interactive overview of unstaged and staged changes. When I need to do something more picky than just committing every change, I'll usually grab magit to stage individual chunks: I don't necessarily want to commit all changes in a file, sometimes I want individual lines.
You can do that with staging using the commands above, magit, or some other porcelain (I've heard good things about git kraken). If you really want to forget staging even exists, you could just commit straight up and amend the commit afterwards to get a comparable experience I guess. I've found staging to be helpful in keeping track of what I've achieved for my next "version" of the software to be added to the history, which is why I'm still using it.
Staging is useful for gradually queuing up multi-file commits rather than listing them all in one command. It becomes even more useful with partial file commits.
What’s useful in it? A commit is a thing that should preferably make sense on its own, which can be guaranteed by testing or at least building/running the code. By cherry-picking changes from workdir into a commit don’t you basically make a blind guess? Or is it stash/test/pop every time? What if you overpicked? Reset and repeat?
I don't know about you, but I often get sidetracked with different changes when I'm working on something, so that the work directory is in a messy state to commit everything. The staging area allows me to cherry-pick only the changes that will be in the next commit, while keeping the rest for later. This way you can save the state momentarily, finish polishing the changes, and then easily commit them. I find it very useful to keep focus on what I'm currently working on, without the overhead of WIP commits.
> By cherry-picking changes from workdir into a commit don’t you basically make a blind guess?
No, you use the interactive mode (`git add -p`) to select exactly what you want.
If you overpicked, you can reset a single file, and try again. That can be a bit annoying if there are a lot of changes, so this is another reason to keep commits small and atomic.
I think the answer to the question is "yes, the person staging a partial commit may be making a guess". I think this is because the tool apes an earlier practice of crafting patches to share with other developers. There are definitely cowboys writing patches to show others, and not necessarily testing every implied snapshot in a chain of such patches. Some CI practices also encourage cowboy commits, i.e. if a team pushes commits to get them tested rather than testing prior to commit.
You can imagine an inverted perspective where the stash should be the only non-staging area, and the working copy _is_ the staging area for the next commit. Stash away partial changes you want to defer, then test the current working copy, then commit the working copy.
You'd also want status/diff commands that let you more easily compare: working vs HEAD (what can be committed); stash vs HEAD (all uncommitted changes); and stash vs working (deferred changes).
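This inverted workflow can be approximated with today's commands. A sketch, assuming Git 2.13 or newer for `git stash push -- <path>`, with made-up file names and a comment standing in for the real test run:

```shell
set -eu
# Scratch repository; file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
printf 'base\n' > wanted.txt
printf 'base\n' > deferred.txt
git add wanted.txt deferred.txt
git commit -q -m "base"

printf 'ready\n' >> wanted.txt
printf 'half-done\n' >> deferred.txt

# Defer the unfinished change; the working copy now holds only what will be committed.
git stash push -q -- deferred.txt
to_commit=$(git diff --name-only)      # working vs HEAD

# (run the build or test suite here, against exactly what is about to be committed)

git commit -q -a -m "Finish the ready change"

# Bring the deferred work back.
git stash pop -q
deferred_back=$(git diff --name-only)

echo "committed: $to_commit, back in progress: $deferred_back"
```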
Your point is actually great, but it’s important to always test from the commit itself, the one that gets merged to the release branch if those tests pass. If you are only testing your working directory, I feel like that’s even harder to do.
The staging area is a virtual snapshot, in roughly the way that the working copy and a commit are actual snapshots. It's defined in terms of the current HEAD with some changes.
Not sure what you mean by "unstash", since "git unstash" is not a command (on my machine anyway, so not unless it was added very recently). I'm pretty sure stashes are still modeled as commits/snapshots.
The git stash command is a little wonky, yes, but I don't think that's a data model thing. It's easy to mistake the disaster zone of Git's CLI for problems with the data model. It becomes more obvious where the problem is when you start thinking in terms of the data model, and trying to figure out what incantation will perform the relatively simple operation in your mind.
> Not sure what you mean by "unstash", since "git unstash" is not a command (on my machine anyway, so not unless it was added very recently).
I meant pop or apply.
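One concrete way a stash followed by a pop changes your state, as a sketch in a scratch repository (the file name is made up): a staged modification comes back unstaged unless you pass `--index`.

```shell
set -eu
# Scratch repository; app.txt is a hypothetical file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
printf 'v1\n' > app.txt
git add app.txt
git commit -q -m "v1"

# Stage a modification, then stash and pop it.
printf 'v2\n' > app.txt
git add app.txt
staged_before=$(git diff --cached --name-only)
git stash push -q
git stash pop -q
staged_after_pop=$(git diff --cached --name-only)    # the change is no longer staged
unstaged_after_pop=$(git diff --name-only)           # it is back as a plain edit

# Same round trip, but asking pop to restore the index as well.
git add app.txt
git stash push -q
git stash pop -q --index
staged_after_index=$(git diff --cached --name-only)

echo "before='$staged_before' pop='$staged_after_pop' pop --index='$staged_after_index'"
```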
> The git stash command is a little wonky, yes, but I don't think that's a data model thing. It's easy to mistake the disaster zone of Git's CLI for problems with the data model. It becomes more obvious where the problem is when you start thinking in terms of the data model, and trying to figure out what incantation will perform the relatively simple operation in your mind.
I disagree. I think the staging area and its behaviour are inherently unreasonable; certainly all the "it's just a DAG of commits" people tend to be confidently wrong about what the staging area will do under a given sequence of operations.