The Git "Dumb" Protocol Is Neat

Published on 2024-10-14 • Updated on 2024-10-14

So, my friend Nic at genderphas.ing showed me a cool trick you can do with Git repositories: you can host them completely statically, just by making a couple of tiny tweaks to the repository. And then, if you're willing to get a little spicy, you can even make a URL that is both a web page and a clonable repository. Let's see how!

The "Smart" Git Protocol

Almost all Git services, like GitHub, GitLab, and Gitea, operate using the "smart" protocol. This just means that git clone https://some.site/foo.git communicates with the server on some.site to collaboratively transmit the required data from the server to the client. I haven't done a deep dive on this protocol, so I don't understand the fine details, but it's the default for a reason: it's fast, reasonably secure, and not hard to set up.

But... can it be simpler? What if you don't have a web server? What if all you have is an S3 bucket, a static site generator, and a dream?

The "Dumb" Git Protocol

The "dumb" protocol is, basically, "what if git clone just did a bunch of HTTP requests to obtain the relevant repository files, and then figured the rest out itself." It's a very limited protocol: you don't get any access control, you leak the internals of the repository to the outside world, you don't get any optimized delivery of new objects... but. The server does not participate in this process at all. And that's a valuable property if you are operating in an environment where the web server is hidden from you, like an S3 bucket, or shared/static hosting.
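To make that a bit more concrete, here is a rough sketch of the static files a dumb client leans on. It builds a throwaway repository in a temp directory (the names demo, work, and foo.git are made up for illustration) and prints the two index files that git update-server-info generates. The client asks for info/refs first, then fetches HEAD and walks the objects from there, falling back to objects/info/packs when a loose object is missing.

```shell
# Build a throwaway repository with one commit (all paths are illustrative).
demo=$(mktemp -d)
git init --quiet "$demo/work"
echo "hello" >"$demo/work/README"
git -C "$demo/work" add README
git -C "$demo/work" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet -m "first commit"

# Make a bare copy, as you would publish it, and pack its objects
# so that there is a pack file to list.
git clone --quiet --bare "$demo/work" "$demo/foo.git"
git -C "$demo/foo.git" gc --quiet

# Generate the static index files the dumb protocol relies on.
git -C "$demo/foo.git" update-server-info

# info/refs: one "<object id><TAB><ref name>" line per ref.
cat "$demo/foo.git/info/refs"

# objects/info/packs: one "P <pack file name>" line per pack.
cat "$demo/foo.git/objects/info/packs"
```

Everything else the client needs (HEAD, loose objects, pack files) is already sitting in the repository directory as ordinary files.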

It's important to note that the dumb protocol does not support uploading data; it is fully read-only. Third parties can git clone and git pull, but not git push.

Setting Up A "Dumb" Repository

If you are 100% sure you're okay with your repository being fully exposed to the public, actually setting it up is easy. On your server, assuming your repository is a bare Git repository at /srv/git/foo.git:

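Run mv /srv/git/foo.git/hooks/post-update.sample /srv/git/foo.git/hooks/post-update, then chmod +x /srv/git/foo.git/hooks/post-update. The sample hook re-runs git update-server-info after every push, so the static index files stay current.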
Run pushd /srv/git/foo.git; git update-server-info; popd

And that's it! Assuming your web server is serving static files under /srv/git, you should now be able to clone the repository.
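One thing worth knowing: the files written by git update-server-info are a snapshot, so they go stale as soon as new commits arrive. Here is a minimal sketch of keeping them fresh, assuming you push to the bare repository over some other transport (SSH, say). It installs a post-update hook that re-runs git update-server-info, which is what Git's bundled post-update.sample does. The temp directory and the names srv and work are stand-ins for /srv/git and your own working copy.

```shell
# Throwaway bare repository standing in for /srv/git/foo.git.
srv=$(mktemp -d)
git init --quiet --bare "$srv/foo.git"

# Install a post-update hook that refreshes the static index files after each push.
mkdir -p "$srv/foo.git/hooks"
cat >"$srv/foo.git/hooks/post-update" <<'EOF'
#!/bin/sh
exec git update-server-info
EOF
chmod +x "$srv/foo.git/hooks/post-update"

# Push a commit from a scratch working copy (a local path here, for simplicity).
git init --quiet "$srv/work"
echo "hello" >"$srv/work/README"
git -C "$srv/work" add README
git -C "$srv/work" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet -m "first commit"
git -C "$srv/work" push --quiet "$srv/foo.git" HEAD

# The hook ran on the receiving side, so info/refs already lists the new branch.
cat "$srv/foo.git/info/refs"
```

If you never push to the published copy and instead sync it some other way (rsync, an S3 upload), running git update-server-info by hand before each sync should do the same job.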

If you want to test this out locally, Python's http.server module is a handy way to do it:

$ git init --bare foo.git
$ mv foo.git/hooks/post-update.sample foo.git/hooks/post-update
$ pushd foo.git; git update-server-info; popd
$ python -m http.server -d foo.git -b localhost 8080

And then in another shell (because http.server will block):

$ git clone http://localhost:8080/ foo
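Cloning a freshly initialized repository only gets you an empty clone, so here is a slightly longer sketch that publishes a repository with one commit in it and then prints the paths the client requested. It assumes python3 and git are on your PATH and that port 8081 is free; the port, the temp directory, and the names work, foo.git, and clone are all arbitrary choices for the example.

```shell
# Build a small repository with one commit, then a bare copy to publish.
tmp=$(mktemp -d)
git init --quiet "$tmp/work"
echo "hello" >"$tmp/work/README"
git -C "$tmp/work" add README
git -C "$tmp/work" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet -m "first commit"
git clone --quiet --bare "$tmp/work" "$tmp/foo.git"
git -C "$tmp/foo.git" update-server-info

# Serve the bare repository in the background and keep the access log.
port=8081   # arbitrary port, pick another if it is taken
python3 -m http.server -d "$tmp/foo.git" -b 127.0.0.1 "$port" >"$tmp/http.log" 2>&1 &
server_pid=$!

# Wait until the server answers, then clone over plain HTTP.
for attempt in 1 2 3 4 5; do
    git ls-remote "http://127.0.0.1:$port/" >/dev/null 2>&1 && break
    sleep 1
done
git clone --quiet "http://127.0.0.1:$port/" "$tmp/clone"

# Stop the server and list the distinct paths the client asked for.
kill "$server_pid"
wait "$server_pid" 2>/dev/null || true
grep -o '"GET [^ ]*' "$tmp/http.log" | sort -u
```

The log should contain nothing but plain GET requests for files inside foo.git, which is the whole point: any host that can serve files can serve this.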

Spicy Mode: Hosting A Web Page Inside A Repository

The basic mechanism of the "dumb" protocol is that we serve the contents of the repository over HTTP as plain files. We just point our web server at the repository, and the git client works out the rest.

... So what happens if we put other files inside the repository? Like... index.html?

$ cat >foo.git/index.html <<EOF
<!doctype html>

<html lang="en-US">
	<head>
		<title>Foo: A Repository Of Cool Stuff</title>
	</head>
	<body>
		<h1>Foo: A Repository Of Cool Stuff</h1>
		<p>Check it out, ma! I'm a Git repository <em>and</em> a web page!</p>
	</body>
</html>
EOF
$ python -m http.server -d foo.git -b localhost 8080

# (visit localhost:8080 in your web browser)
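If you're wondering whether the extra file confuses Git, a quick local check suggests it doesn't. This sketch (again using a throwaway temp directory, with made-up names site, work, and clone) drops an index.html into a bare repository, runs git fsck on it, and then clones it to confirm the page is not part of the repository's history.

```shell
# Build a bare repository with one commit in it.
site=$(mktemp -d)
git init --quiet "$site/work"
echo "hello" >"$site/work/README"
git -C "$site/work" add README
git -C "$site/work" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet -m "first commit"
git clone --quiet --bare "$site/work" "$site/foo.git"

# Drop a web page next to HEAD, objects/, and refs/.
echo '<!doctype html><title>Foo</title>' >"$site/foo.git/index.html"

# Git still considers the repository healthy...
git -C "$site/foo.git" fsck

# ...and the page is not in any commit, so a clone only contains README.
git clone --quiet "$site/foo.git" "$site/clone"
ls "$site/clone"
```

The web page lives alongside the repository's files rather than inside its history, so browsers see it and Git clients never do.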

Disclaimer: Be Careful

According to Nic, "exposing the .git folder and thus the credentials inside is a pretty common security misconfiguration." If your repository has any usernames, passwords, or other sensitive information in its config file (.git/config, or just config in a bare repository) or in any other file, remove them from the public copy before pointing your web server at it.
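As a starting point for that check, here is a small sketch that looks for the most common offenders in a repository's config file. It is not exhaustive, and the repository, username, and password below are invented for the demonstration; point the same commands at your own config file instead.

```shell
# A stand-in bare repository with a credential embedded in a remote URL
# (made-up values, purely for the demonstration).
audit=$(mktemp -d)
git init --quiet --bare "$audit/foo.git"
git -C "$audit/foo.git" config remote.origin.url \
    "https://alice:hunter2@example.com/foo.git"

# List settings that commonly carry secrets: remote URLs, credential helpers,
# and extra HTTP headers. Anything printed here deserves a look before publishing.
# (--get-regexp exits non-zero when nothing matches, hence the "|| true".)
git config --file "$audit/foo.git/config" --get-regexp \
    '^(remote\..*\.(url|pushurl)|credential\.|http\..*extraheader)' || true

# Flag URLs of the form scheme://user:password@host anywhere in the file.
if grep -Eq '://[^/@[:space:]]+:[^/@[:space:]]+@' "$audit/foo.git/config"; then
    status="credentials found"
else
    status="clean"
fi
echo "$status"
```

A clean result here only covers the config file. Secrets committed into the history itself are still visible to anyone who clones.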

This is a cool technique, but please be 100% sure you're okay with exposing your Git repository and everything inside it to the outside world. The "smart" protocol is the default for a reason. But if you have a cool idea and only have public repositories anyway, go for it! c: