9

Ryabitsev: Cross-fork object sharing in git (is not a bug)

 2 years ago
source link: https://lwn.net/Articles/884105/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

[Posted February 8, 2022 by corbet]
This is a few days old, but evidently there is still need for this message: Konstantin Ryabitsev explains how it is easy to cause a commit to appear falsely to be part of a GitHub repository:

With all the benefits of object sharing comes one important downside — namely, you can access any shared object through any of the forks. So, if you fork linux.git and push your own commit into it, any of the 41.1k forks will have access to the objects referenced by your commit. If you know the hash of that object, and if the web ui allows to access arbitrary repository objects by their hash, you can even view and link to it from any of the forks, making it look as if that object is actually part of that particular repository (which is how we get the links at the start of this article).

A failure to understand this point is how the net fills up with articles like this one.


(Log in to post comments)

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 20:09 UTC (Tue) by atnot (subscriber, #124910) [Link]

I don't feel "that's just how the program works" is a very good answer to people pointing out an undesirable property of the program, so I'm happy to hear there's also some efforts to try to fix this.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 20:10 UTC (Tue) by vstinner (subscriber, #42675) [Link]

> https://github.com/torvalds/linux/blob/ac632c504d0b881d7c...

This link now displays a warning on GitHub:

"This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository."

It seems like a recent GitHub enhancement: nice!

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 20:16 UTC (Tue) by corbet (editor, #1) [Link]

It's a worthwhile change, certainly, but I have to wonder why they don't just return a 404 in that case. What is the value of showing a commit/blob that doesn't exist in the repo being queried?

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 20:30 UTC (Tue) by mricon (subscriber, #59252) [Link]

There *could* be legitimate reasons to do so. E.g. you had a branch and you shared a bunch of links around, but then you rebased the branch and all of those links are now invalid even though the loose objects are still around until git-prune runs.

But that's kind of a corner case and in general I agree that a 404 would be much more reasonable in most cases.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 8:30 UTC (Wed) by taladar (subscriber, #68407) [Link]

The more likely answer is that it is needed for some of the merge request tooling.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 11:02 UTC (Wed) by pbonzini (✭ supporter ✭, #60935) [Link]

Yes, links to specific commits are common in merge request comments and after a force-push they might result in the situation described here.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 22:46 UTC (Tue) by jthill (subscriber, #56558) [Link]

Inaccessible-through-any-branch is not inaccessible-through-any-ref. For instance, I think things in PRs should be accessible but warned about, and it seems reasonable that GitHub wouldn't want to serialize unrelated PR changes. Reachability updates are not cheap. The warning's a word to the wise, it's enough, the rest is just being aware.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 7:19 UTC (Wed) by marcH (subscriber, #57642) [Link]

> What is the value of showing a commit/blob that doesn't exist in the repo being queried?

Probably not intended but it's convenient to submit submodule changes to the parent's CI for testing. No need to get the submodule change merged first.

(submodule or similar)

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 15:07 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

We get around this by adding the MR author's fork as a remote and fetching from there before updating the submodule. Assumptions:

- new submodules don't appear within the update (we only add the fork to each submodule once)
- the user's fork shares the same name as the upstream repository

Of course, we ignore failures until `git submodule update` in case the user doesn't have such a fork. This allows one to push a candidate submodule commit to their own fork instead of having to put it in the "official" repo and without having to muck with `.gitmodules` (which is prone to being committed accidentally). Of course, on GitHub, the ubiquitous `checkout` action is quite useless for such a use case. But submodules seem to be the neglected corner no one wants to touch. Can't say I blame them.

Submodules etc.

Posted Feb 9, 2022 20:23 UTC (Wed) by marcH (subscriber, #57642) [Link]

> We get around this by adding the MR

"We" = Gitlab? I'm merely guessing from the "MR" terminology and your comments on earlier articles.

> But submodules seem to be the neglected corner no one wants to touch.

Github's support for submodules is IMHO excellent, the ability to browse changes directly from the parent repo is great.

> Can't say I blame them.

After a number of discussions and comparisons with alternatives, I came to the conclusion that a large part of the confusion and dissatisfaction with submodules is due to the links being "hidden" in metadata. All other alternatives I know use plain text manifests than can be open in a regular editor. This puts more burden on the tool because now it has to manage more possibilities of inconsistency between the (freshly edited...) manifest and the actual state(s) but it gives the users more control or at least the feeling to be in control.

PS: in my experience very few submodule users know that a "git submodule sync" command even exists...

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 10:37 UTC (Wed) by seds (subscriber, #151382) [Link]

Worth noting that the warning is all the way up in the page, making it difficult for everyone to see

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 14:36 UTC (Wed) by mjw (subscriber, #16740) [Link]

I looked for it and couldn't find it at all. Turns out it is some javascript hack, which doesn't show up unless you explicitly enable javascript for that webpage.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 22:41 UTC (Tue) by arekm (subscriber, #4846) [Link]

But why "blame" on github shows Torvalds at this line?

Blame

Posted Feb 8, 2022 22:51 UTC (Tue) by corbet (editor, #1) [Link]

If you create a commit, you can put anything you want (including author identity) into it.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 23:14 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

Out of the box, Git doesn't have a trust model at all. If a commit says its author was Linus, then as far as Git is concerned, Linus authored it. Anybody can forge anything. The only guarantee is that if you change a commit's contents, commit message, other metadata (such as date, author, etc.), or any of those things for any of its (transitive) ancestors, then the hash of that commit will change (and even this guarantee has an asterisk due to SHA-1 collision attacks). So if (for example) you talk to Linus over some trusted channel, and he tells you to reference commit abc123 (but he actually gives you a complete hash, not an abbreviation), then you can be reasonably confident that you're both talking about the same commit, with the same history.

(It is possible to "sign" objects to prove that a given object has been approved of by a given PGP key, but then you have all the usual PKI problems that PGP is famous for. On top of that, plenty of developers don't do this, because Git doesn't require it.)

GitHub, on the other hand, does have a "real" trust model, but that trust model is primarily designed to prevent you from pushing to somebody else's fork, not to prevent you from forging commits. In general, the assumption is that most cross-fork collaboration happens in the form of pull requests, where the recipient can closely scrutinize the set of incoming commits, and if the author is wrong, they can refuse the PR or tell the sender to fix it. Therefore, stopping you from faking the author is simply not regarded as a functional requirement of the software.

Of course, there's a problem here: You can cause edits to be misattributed to another developer within your own private fork. Anyone who wants to do so can simply write whatever nasty, hateful, or otherwise problematic content they like, label it as authored by Linus (or anybody else), push it to their own private fork, and then send a screenshot to some credulous tech journalist who doesn't understand how Git works. GitHub will happily display Linus's profile image over text he didn't write, and clicking that image will take you to his regular GitHub profile. For someone who has never used Git, this makes it look like Linus wrote (whatever), even if he had nothing to do with it.[1] Personally, I would consider that a rather serious problem.

[1]: See for example https://github.com/jayphelps/git-blame-someone-else/commi...

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 23:45 UTC (Tue) by mricon (subscriber, #59252) [Link]

> GitHub will happily display Linus's profile image over text he didn't write, and clicking that image will take you to his regular GitHub profile. For someone who has never used Git, this makes it look like Linus wrote (whatever), even if he had nothing to do with it.[1] Personally, I would consider that a rather serious problem.

There is no reasonable way to fix it. Imagine that we're working on a project together, except that I'm using gitlab and you're using github. I commit a bunch of stuff to my gitlab repo, and then you pull them into your local copy, add your own commits on top, and push them out to your github remote. What do you expect github to do with all the commits authored by me that you just pushed? Rejecting them isn't right -- and even if they are signed, my github profile (if it even exists) may not have my public key stored. Github has to accept them at face value or break git's workflow outright.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 0:21 UTC (Wed) by NYKevin (subscriber, #129325) [Link]

> There is no reasonable way to fix it.

Of course there is. You're just failing to imagine it. They can simply display a "this commit might have been forged" banner until/unless the purported author clicks a "yes, this was me" button in the commit UI.

To the extent that your workflow even cares about these banners (they need not affect anything other than the commit and pull request web pages), you can get rid of them by either signing your commits with a key that GitHub knows about, or using pull requests as GitHub intended. Or you can just ignore them outright, leaving you no worse off than you are now.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 1:09 UTC (Wed) by LtWorf (subscriber, #124958) [Link]

They already show a different icon to indicate that who made the commit and who sent it to the server are not the same.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 1:41 UTC (Wed) by NYKevin (subscriber, #129325) [Link]

I can find no such icon anywhere on https://github.com/jayphelps/git-blame-someone-else/commi..., unless it's buried under the commentspam somewhere? We *know* that commit was definitely forged, because it explicitly says so.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 10:38 UTC (Wed) by LtWorf (subscriber, #124958) [Link]

See here:

https://github.com/ltworf/trabucco/commits?author=Oblomov

I got the commits via email, so it shows Oblomov as author and me as pushing them.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 11:43 UTC (Wed) by lunaryorn (subscriber, #111088) [Link]

Not pushing, _committing_. That's just another piece of freely editable metadata of a commit. Let's look at e.g. https://github.com/ltworf/trabucco/commit/750b9de5e13410f... (names and mail addresses redacted):

$ git show --pretty=full 750b9de5e13410fc8a047b1b87d5482e91b381e6
commit 750b9de
Author: G. B. <REDACTED>
Commit: S. 'LtWorf' T. <REDACTED>

Strict C++ mode with extra warnings

diff --git a/trabucco.pro b/trabucco.pro
index ff52c7c..2e412ab 100644
--- a/trabucco.pro
+++ b/trabucco.pro
[…]

git commit takes the contents of the "Commit:" field from $GIT_COMMITER_NAME and $GIT_COMMITTER_EMAIL, meaning you can change these fields freely, and use whatever you like. I can easily make a commit in a repository under my account which shows you as the committer, without you even knowing it.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 11:50 UTC (Wed) by lunaryorn (subscriber, #111088) [Link]

See e.g. https://github.com/lunaryorn/test/commit/57fcbd6602d9f452...

I'm sure you didn't push this one ;)

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 16:36 UTC (Wed) by LtWorf (subscriber, #124958) [Link]

But it says you authored. Can you make it look like I did everything?

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 18:39 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 18:51 UTC (Wed) by lunaryorn (subscriber, #111088) [Link]

Sure I'll make you one tomorrow when I'm back on my laptop 8)

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 20:25 UTC (Wed) by dbnichol (subscriber, #39622) [Link]

See https://git-scm.com/docs/git-commit#_commit_information. You can literally put whatever you want in there. On GitHub they take the email address and try to correlate it to a GitHub user for the web view. But since anyone can put anything in there, there's no validation unless you add commit signatures - https://docs.github.com/en/authentication/managing-commit...

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 11:59 UTC (Wed) by karkhaz (subscriber, #99844) [Link]

That isn't "author and pusher", that is "author and committer"---exactly as it says on the web page.

In git, there is both an author and comitter attached to each commit. If you create a commit with 'git commit' the author and committer will both be your identity, but if you took a patch from Oblomov (as you say, via email) and applied it to your tree with 'git am', then Oblomov will remain author but you will be the committer.

Neither of these metadata are related to which user pushed to GitHub. I can push to GitHub a commit that was authored by Oblomov and committed by you, my icon will not show up next to that commit. GitHub just looks at the email address of the committer and author, and if they match the email of existing users on GitHub, it will display those users' icons.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 6:25 UTC (Wed) by lunaryorn (subscriber, #111088) [Link]

I don't think such a feature exists on GitHub.

If you see two different user icons on a single commit GitHub only shows you both the author and the committer of that commit, but both of these fields can be changed freely and provide no authenticity at all. Notably the committer is not necessarily the one who pushed that commit.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 10:39 UTC (Wed) by LtWorf (subscriber, #124958) [Link]

See my other response.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 5:46 UTC (Wed) by lfam (subscriber, #127309) [Link]

The way to fix it is to use signed commits. A project can even set up a variety of Git hooks or external tooling to enforce and authenticate commit signatures.

There are also signed pushes which offer other useful security properties.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 7:32 UTC (Wed) by Wol (subscriber, #4433) [Link]

> > There is no reasonable way to fix it.

> Of course there is. You're just failing to imagine it. They can simply display a "this commit might have been forged" banner until/unless the purported author clicks a "yes, this was me" button in the commit UI.

You don't live in the real world.

1) How is the purported author supposed to even know about this new repository?

2) And I don't know about other people, but the requirement to "sign up" to things is - to me - an active incentive to avoid them. If I have to sign up to authenticate this repository (and if I don't any "yes this is me" is worthless), then unless I have a pressing need, I will simply ignore it.

Cheers,
Wol

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 17:31 UTC (Wed) by NYKevin (subscriber, #129325) [Link]

The point is not to prevent people from making such commits in the first place. The point is to make it obvious to *everyone* that such commits should not be assumed to have originated from their purported authors, so that you cannot just commit a bunch of slurs to an empty repository, put author=torvalds, and email the New York Times about it.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 6:30 UTC (Wed) by lunaryorn (subscriber, #111088) [Link]

> It is possible to "sign" objects to prove that a given object has been approved of by a given PGP key, but then you have all the usual PKI problems that PGP is famous for. On top of that, plenty of developers don't do this, because Git doesn't require it.

Recent git versions also support signatures with SSH keys which I find much easier to use than PGP keys. Anyone's got an SSH key somewhere anyway, and what's good enough to give you access to your production servers is probably good enough to prove the authenticity of your commits.

Unfortunately no major forge supports those signatures yet. I think gitea will add support in the next release, and GitHub seems to have it on their agenda already. I don't know about gitlab.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 8:37 UTC (Wed) by taladar (subscriber, #68407) [Link]

The problems are the same, namely the link between the key and the actual person.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 18:50 UTC (Wed) by lunaryorn (subscriber, #111088) [Link]

Depends, I think. In many environments this link probably exists already through the existing infrastructure for SSH keys which grant access to development and production systems.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 15:03 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

SSH signing also has questionable properties related to key rotation AFAIK. If I have to burn a key, how does anyone know that after $TODAY, the key is not to be trusted?

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 18:19 UTC (Wed) by mricon (subscriber, #59252) [Link]

That's what the allowed_signers file does -- the latest iteration of it properly allows for settings like validity date ranges (i.e. "not before" and "not after" equivalence with TLS certs). The difficult part is, as usual, distributing this information around in a timely manner.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 23:09 UTC (Tue) by colemickens (guest, #101774) [Link]

Respectfully, but I don't get it. I thought "oh, another of these" and then the article itself points out that this is constantly re-discovered. There have been posts about this for years. I hope LWN isn't covering every crypto-spam-blog clickbait post that involves Linux.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 23:16 UTC (Tue) by dkg (subscriber, #55359) [Link]

The article in question is really about a series of implementation choices at major revision-control hosts like github.com, not about git itself. git doesn't even have the concept of "fork" in the way that github does.

Maybe the LWN editors could update the title of this post to be clear that it's about github (et al), and not about a potential flaw we need to watch out for in git itself?

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 23:36 UTC (Tue) by mricon (subscriber, #59252) [Link]

> The article in question is really about a series of implementation choices at major revision-control hosts like github.com, not about git itself. git doesn't even have the concept of "fork" in the way that github does.

But it *is* about git itself. For example, you can easily watch it happen on your own disk:

1. create a repo called repo1
2. clone it with --shared: "git clone -s repo1 repo2"
3. create a new commit in repo1
4. you can now run "git cat-file -t <sha-of-that-commit>" from repo2

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 8, 2022 23:42 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

> git doesn't even have the concept of "fork" in the way that github does.

A fork is just a clone with good publicity.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 17:18 UTC (Wed) by jezuch (subscriber, #52988) [Link]

Not even a clone. It's more like a bunch of branches in a shared clone.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 20:26 UTC (Wed) by marcH (subscriber, #57642) [Link]

The two things git beginners seem to miss the most often:

- A git branch is nothing like a branch in other systems. It's just a "head".
- A clone/fork is only a collection of heads under the same access control. Nothing else.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 21:13 UTC (Wed) by NYKevin (subscriber, #129325) [Link]

This is why I prefer some (but not all) of the Mercurial terminology here:

* A "bookmark" is a pointer to a head, just like a Git branch. This name is much more logical and intuitive.
* On the other hand, they also use the word "branch" to refer to a metadata field attached to the commit object. Changing the "branch" also changes the hash. I'm not convinced that this field should even exist in the first place, let alone have such a confusing and overloaded name.
* The word "branch" is sometimes interpreted to mean a topological branch, i.e. a set of commits which form a linear chain, and IMHO this should have been the sole definition of "branch." In Mercurial, you don't have to have anything pointing to one of these, if you don't want to, and so you can have anonymous branches. This is similar to detached HEAD mode, except that Mercurial doesn't label the resulting commits as "unreachable" and hide them from history when you move off of them.
* If you want to get rid of a commit, the proper way to do that is to add an obsolescence marker for it, and such markers can be safely pushed between repositories without doing anything like a Git force-push (this functionality is theoretically still under development, so it's off-by-default, but it's reasonably stable at this point and I use it at $DAYJOB).

Unfortunately, Mercurial lost the mind-share war, and so I have to learn both systems. Oh well.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 10:01 UTC (Wed) by Thomas (subscriber, #39963) [Link]

Thank you.

On that note it makes me sad that the obvious needs to be pointed out.

git is OK, git hosting web page is allegedly broken. And some people on the interwebs don't understand the difference. Nothing new, nothing to see, carry on.

Ryabitsev: Cross-fork object sharing in git (is not a bug)

Posted Feb 9, 2022 7:28 UTC (Wed) by marcH (subscriber, #57642) [Link]

The paragraph "Why this isn't a security bug" is looking at only one type of threat.

There are git servers disabling uploadpack.allow___SHA1InWant[*] because (for instance) they don't want to garbage collect after each branch or tag deletion.

[*] https://git-scm.com/docs/git-fetch-pack


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK