What do you think your git repo is?

colburnmh · on May 6, 2022

True to an extent--meaning mostly true in practice.

You can mutate the Git data if you choose to, using commands like "filter-branch". "filter-branch" isn't used frequently, since it causes issues for every upstream or downstream replica once the data has been pushed or pulled, but it is possible. Even some commonly used operations like "amend", "rebase", and "squash" cause limited data mutations that are broadly considered appropriate and useful.

kjeetgill · on May 6, 2022

This is kind of the brilliance of git forcing users to refer to commits by hashes: all the things you're referring to don't modify commits, they make new ones.

Branches are the only mutable place, where you have a name point to different commits in sequence.

I could imagine in some horrific alternate universe someone deciding to hide hashes as not user friendly. Like, I think we got incredibly lucky how that played out.
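
A rough sketch of why "amend" can't modify a commit in place, assuming a SHA-1 repository where an object's id is the SHA-1 of a "type size NUL" header plus the object body. The author line, timestamp, and commit messages here are made up for illustration; only the hashing scheme is Git's.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    # Object id = SHA-1 of "<type> <size in bytes>" + NUL + body.
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def root_commit_body(tree: str, message: str) -> bytes:
    # Hypothetical, fixed author/committer line so the example is deterministic.
    who = "A U Thor <author@example.com> 1651795200 +0000"
    return (
        f"tree {tree}\n"
        f"author {who}\n"
        f"committer {who}\n"
        f"\n{message}\n"
    ).encode()


empty_tree = git_object_id("tree", b"")

original = git_object_id("commit", root_commit_body(empty_tree, "fix tpyo"))
amended = git_object_id("commit", root_commit_body(empty_tree, "fix typo"))

# The branch is just a mutable name; "amending" repoints it at a new object.
branches = {"main": original}
branches["main"] = amended

print(original != amended)  # different content, so a different commit id
```

The old commit is still addressable by its hash until it gets garbage collected; only the branch name moved.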

GauntletWizard · on May 6, 2022

You can experience this alternate universe right now in Github Actions, which allows you to refer to other "Actions" by their tag, and encourages you to pin yourself to a "v3" which the team will then destroy and replace to update you.

If this sounds terrible, insecure, and begging to be exploited, it's because every idiot on the Github Actions Team should be censured for their poor understanding of Git and Github, and for proceeding to ship anyway.

hermanb · on May 6, 2022

I’ve been wondering about this too and always used full SHAs until now. But recently I’ve made an action myself: you actually need to publish the action to the marketplace with each tag manually. It feels like there might be more going on.

Is GitHub storing those published tags and avoiding tampering by only letting you use those tags once? Are they warning or blocking runs if you tamper? …

I’m really curious because it seems like SUCH a giant risk otherwise.

GauntletWizard · on May 6, 2022

Nope, they even suggest (and companies have built tooling around) deleting versions of the tags.

WorldMaker · on May 6, 2022

Deleting a tag is a force push operation like any other and repo policies that block force pushes will block tag updates.

Tags themselves aren't necessarily the worst idea, but yes, policies encouraging force pushes are likely to invite exploitation.

Also, annotated tags have their own "commit" hashes, and can be code signed like any other commit. There are more precautions that could be taken.

Arnavion · on May 7, 2022

When the threat is an action repo becoming malicious and force-pushing its existing tags to malicious code, the action repo's own policies preventing force-pushes are not a safeguard.

WorldMaker · on May 7, 2022

I agree; there should be more protections and I'm pointing out that they could be offered. Github could certainly enforce at the platform level that the only tags allowed for use in Actions must be annotated, maybe even signed, and must never be force-pushed.

The use of tags isn't necessarily the wrong strategy: I'm mostly just pointing out it is treating tags as mutably as branches that is the problem. I don't think you should ever force push a tag, personally, and I always find it problematic when people treat tags like branches and confuse the two.

jrockway · on May 7, 2022

Yeah this is all kind of lazy glue code. The same thing happens with Docker; people refer to foobar:latest, and that changes over time and is annoying/a good attack vector. (All the tags are mutable, of course, not just "latest".) What should happen in both cases is that "v3" or "latest" should be read at the time you submit the configuration and stored as the unique id (commit id for git, sha256 for container images). This does have the downside that you have to check that "v3" and "latest" are still what you want every time you apply an edit to the action, but at least you were tangentially involved rather than pure action at a distance.
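
Roughly what I mean, as a sketch. The `registry` dict is a stand-in for whatever lookup turns a tag into a commit id or image digest, and `Pin`, `pin_at_submit`, and `has_drifted` are names I made up for this example, not any real tool's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pin:
    name: str      # what the human wrote, e.g. "v3" or "latest"
    resolved: str  # the immutable id it pointed at when the config was submitted


def pin_at_submit(name: str, registry: dict[str, str]) -> Pin:
    """Resolve a mutable name once and remember the immutable id."""
    return Pin(name=name, resolved=registry[name])


def has_drifted(pin: Pin, registry: dict[str, str]) -> bool:
    """True if the name now points somewhere other than the stored id."""
    return registry.get(pin.name) != pin.resolved


# Placeholder ids; real ones would be commit ids or sha256 digests.
registry = {"v3": "a" * 40}
pin = pin_at_submit("v3", registry)

registry["v3"] = "b" * 40  # upstream moves the tag

print(pin.resolved == "a" * 40)    # runs keep using the id stored at submit time
print(has_drifted(pin, registry))  # the next edit has to re-confirm "v3"
```

Execution only ever reads `pin.resolved`; the name is kept around so a human can be asked whether the new target is still what they want.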

GauntletWizard · on May 7, 2022

There are vendors that I would be fine pinning to a vendor's signature instead, but yeah. There is, thankfully, a lot of Kubernetes tooling around this workflow.

Arnavion · on May 7, 2022

Indeed, all the actions tell you to use them via tags.

And then GitHub comes along and recommends (in a doc that you're unlikely to find unless you know to look for it) that you use SHAs to protect yourself from the attack that they themselves enabled.

https://docs.github.com/en/actions/security-guides/security-...
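
If you want to check your own workflows, here is a quick-and-dirty sketch that flags `uses:` references not pinned to a full 40-character commit SHA. It is a line-based regex, not a YAML parser, so treat it as a starting point. The action names in the sample are hypothetical, and skipping local (`./`) and `docker://` references is my own assumption about what is in scope.

```python
import re

# Matches "uses: owner/repo@ref" with an optional leading "- " and optional quotes.
USES = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^'\"\s#]+)")
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return every `uses:` target whose ref is not a full commit SHA."""
    found = []
    for line in workflow_text.splitlines():
        match = USES.match(line)
        if not match:
            continue
        target = match.group(1)
        if target.startswith("./") or target.startswith("docker://"):
            continue  # local actions and container images are out of scope here
        _, _, ref = target.partition("@")
        if not FULL_SHA.fullmatch(ref):
            found.append(target)
    return found


# Hypothetical workflow fragment; the SHA is a placeholder, not a real commit.
workflow = "\n".join([
    "steps:",
    "  - uses: example-org/some-action@v3",
    "  - uses: example-org/other-action@" + "c" * 40,
    "  - uses: ./local-action",
])

print(unpinned_actions(workflow))
```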

tylorr · on May 6, 2022

Those all generate new commits; they don't modify existing ones. Though the only way of easily detecting that is comparing it to a clone of the repo.

cryptonector · on May 6, 2022

Uhm, you can mutate any Merkle hash tree / log you want, provided you can replace the copies of the past that others store. Correspondingly, the security of a Merkle hash tree / blockchain depends on having others store copies of it and check that additions chain from their past heads.

Git, of course, lets you "mutate" the Merkle hash tree[0], but, to the extent that the hash function used allows, your rewriting of published branch history will be detected by others. (Those others can recover by reviewing the deltas and using `git rebase --onto`.)

There's no Merkle-hash-tree-defeating code in Git as such because this is a generic issue for Merkle hash trees, not a Git-specific one.
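
To make the "check that additions chain from their past heads" part concrete, here is a minimal sketch using a linear hash chain rather than Git's actual DAG and object format. `entry_id`, `append`, and `chains_from` are names invented for this example, and SHA-256 is just a convenient choice.

```python
import hashlib

ROOT = ""  # parent id of the first entry

Log = dict[str, tuple[str, str]]  # entry id -> (parent id, payload)


def entry_id(parent: str, payload: str) -> str:
    return hashlib.sha256(f"{parent}\n{payload}".encode()).hexdigest()


def append(log: Log, head: str, payload: str) -> str:
    new_head = entry_id(head, payload)
    log[new_head] = (head, payload)
    return new_head


def chains_from(log: Log, new_head: str, known_head: str) -> bool:
    """Walk parents from new_head, rechecking each id; True if known_head is an ancestor."""
    current = new_head
    while current != ROOT:
        if current == known_head:
            return True
        if current not in log:
            return False
        parent, payload = log[current]
        if entry_id(parent, payload) != current:
            return False  # stored content does not match its id
        current = parent
    return known_head == ROOT


log: Log = {}
a = append(log, ROOT, "first")
b = append(log, a, "second")
c = append(log, b, "third")

# A rewrite: replace "second" and rebuild everything after it.
b2 = append(log, a, "second, rewritten")
c2 = append(log, b2, "third")

print(chains_from(log, c, b))   # an honest addition on top of the head I remember
print(chains_from(log, c2, b))  # a rewritten history that no longer contains it
```

The rewrite itself is perfectly legal; it is only detectable because someone kept `b` and checked the new head against it.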

And to be clear: there is nothing wrong with `git rebase` and "rEwRiTiNg HiStOry" as long as you're not rewriting published history. And for those times when you must rewrite published history (for whatever reason), you can recover forks with `git rebase --onto`.

The idea that "oh no!!1! Git lets you rewrite history, that's baaad and you shouldn't ever ever do that!!" is a terrible canard that must be called out as such every time it comes up. Mercurial does too. It's not really a problem unless you rewrite published history, but the vast majority of the time any user rewrites Git (or Mercurial, or...) history, they're doing so locally, on dev branches, not on published trunks.

I think the history-rewrite-is-always-bad meme must have been started by people who hated Git or who just didn't understand. But it's a very destructive meme that causes people to be happy pushing and accepting ugly and useless history because "you don't want to ever fix it". That meme must be defeated.

[0] Well, Git doesn't let you mutate the objects in the tree itself, it only lets you add or remove objects, and it lets you change the commits pointed to by branches, tags, and other symbolic refs.

tylersmith · on May 6, 2022

You can always create new trees by committing or rebasing, but absent being able to find a proper hash collision you can't change content within a given commit hash.

skybrian · on May 7, 2022

How often do packages refer to their dependencies using git hashes? Very seldom in my experience, and for good reason. (Unless you're using git submodules, which is not the usual way.)

Go's checksum server does something different by making sure that the names used in module files refer to the same things for different people. Also, it works even if some packages don't come from git repos.