The Git "Dumb" Protocol Is Neat
So, my friend Nic at genderphas.ing showed me a cool trick you can do with Git repositories: you can host them completely statically, just by making a couple of tiny tweaks to the repository. And then, if you're willing to get a little spicy, you can even make a URL that is both a web page and a clonable repository. Let's see how!
The "Smart" Git Protocol
Almost all Git services, like GitHub, GitLab, Gitten, etc. operate
using the "smart" protocol. This just means that git clone https://some.site/foo.git communicates
with the server on some.site to collaboratively transmit the required data from the server to the
client. I haven't done a deep dive on this protocol, so I don't understand the fine details, but
it's the default for a reason: it's fast, reasonably secure, and not hard to set up.
But... can it be simpler? What if you don't control the web server? What if all you have is an S3 bucket, a static site generator, and a dream?
The "Dumb" Git Protocol
The "dumb" protocol is, basically, "what if git clone just did a bunch of HTTP requests to obtain
the relevant repository files, and then figured the rest out itself?" It's a very limited protocol:
you don't get any access control, you leak the internals of the repository to the outside world, you
don't get any optimized delivery of new objects... but. The server does not participate in this
process at all. And that's a valuable property if you are operating in an environment where the web
server is hidden from you, like an S3 bucket, or shared/static hosting.
It's important to note the dumb protocol does not support uploading data; it is fully read-only.
Third parties can git clone and git pull, but not git push.
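To make "a bunch of HTTP requests" a bit more concrete, here's a rough Python sketch of the bookkeeping a dumb client starts with. It reads info/refs (the file that git update-server-info generates), which has one object ID and one ref name per line, separated by a tab, and then works out where a loose object would live under objects/. The helper names and the sample IDs are made up for illustration, and real Git also deals with packfiles (listed in objects/info/packs), which this sketch ignores.

```python
def parse_info_refs(text):
    """Parse the body of info/refs into {ref_name: object_id}."""
    refs = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        object_id, ref_name = line.split("\t", 1)
        # Lines ending in ^{} name the object an annotated tag points at;
        # this sketch only keeps the refs themselves.
        if ref_name.endswith("^{}"):
            continue
        refs[ref_name] = object_id
    return refs


def loose_object_path(object_id):
    """Path of a loose object, relative to the repository root."""
    return f"objects/{object_id[:2]}/{object_id[2:]}"


# Made-up sample of what info/refs might contain.
SAMPLE = (
    "1f7a7a472abf3dd9643fd615f6da379c4acb3e3a\trefs/heads/main\n"
    "8b2e5d0c3a9f4e7b6d1c0a9f8e7d6c5b4a3f2e1d\trefs/tags/v1.0\n"
    "1f7a7a472abf3dd9643fd615f6da379c4acb3e3a\trefs/tags/v1.0^{}\n"
)

refs = parse_info_refs(SAMPLE)
print(refs)
print(loose_object_path(refs["refs/heads/main"]))
```

Every path here is just a file under the repository root, which is why a plain static file host is enough.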
Setting Up A "Dumb" Repository
If you are 100% sure you're okay with your repository being fully exposed to the public, actually
setting it up is easy. On your server, assuming your repository is a bare Git repository at
/srv/git/foo.git:
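- Enable Git's sample post-update hook, which reruns git update-server-info after every push to this repository:
pushd /srv/git/foo.git; mv hooks/post-update.sample hooks/post-update; chmod +x hooks/post-update; popd
- Generate the files once by hand:
pushd /srv/git/foo.git; git update-server-info; popd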
And that's it! Assuming your web server is serving static files under /srv/git, you should now be
able to clone the repository.
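If you're copying the repository to a bucket by hand, it can be handy to check that git update-server-info has actually run before you upload. Here's a small sketch, assuming the two files that command maintains (info/refs and objects/info/packs) are the ones worth checking. The function name missing_dumb_files is made up for this example.

```python
import tempfile
from pathlib import Path

# The two files `git update-server-info` maintains for dumb-protocol clients.
SERVER_INFO_FILES = ("info/refs", "objects/info/packs")


def missing_dumb_files(repo_dir):
    """Return the server-info files that are absent under repo_dir."""
    repo = Path(repo_dir)
    return [name for name in SERVER_INFO_FILES if not (repo / name).is_file()]


# Demo on a throwaway directory standing in for a bare repository.
with tempfile.TemporaryDirectory() as tmp:
    print(missing_dumb_files(tmp))  # nothing generated yet, so both are listed
```

For the setup above, missing_dumb_files("/srv/git/foo.git") should come back empty once the commands have run.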
If you want to test this out locally, Python's http.server module is a handy way:
$ git init --bare foo.git
$ mv foo.git/hooks/post-update.sample foo.git/hooks/post-update
$ pushd foo.git; git update-server-info; popd
$ python -m http.server -d foo.git -b localhost 8080
And then in another shell (because http.server will block):
$ git clone http://localhost:8080/ foo
Spicy Mode: Hosting A Web Page Inside A Repository
The basic mechanism of the "dumb" protocol is that we serve the contents of the repository over HTTP
as plain files. We just point our web server at the repository, and the git client works out the
rest.
... So what happens if we put other files inside the repository? Like... index.html?
$ cat >foo.git/index.html <<EOF
<!doctype html>
<html lang="en-US">
<head>
<title>Foo: A Repository Of Cool Stuff</title>
</head>
<body>
<h1>Foo: A Repository Of Cool Stuff</h1>
<p>Check it out, ma! I'm a Git repository <em>and</em> a web page!</p>
</body>
</html>
EOF
$ python -m http.server -d foo.git -b localhost 8080
# (visit localhost:8080 in your web browser)
Disclaimer: Be Careful
According to Nic, "exposing the .git folder and thus the credentials inside is a pretty common
security misconfiguration." If your repository has any usernames, passwords, or other sensitive
information in the .git/config file (or any other file), remove them from the public copy before
pointing your web server at it.
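As a starting point for that cleanup, here's a minimal sketch that looks through the text of a Git config file for remote URLs with a username or password baked in, like https://user:pass@host/repo.git. It only catches that one pattern, so treat it as a first pass and not as an audit. The function name and the sample config are invented for the example.

```python
import re
from urllib.parse import urlsplit

# Matches "url = ..." and "pushurl = ..." lines in a Git config file.
URL_LINE = re.compile(r"^\s*(?:url|pushurl)\s*=\s*(\S+)", re.IGNORECASE)


def urls_with_credentials(config_text):
    """Return HTTP(S) remote URLs that embed a username or password."""
    hits = []
    for line in config_text.splitlines():
        match = URL_LINE.match(line)
        if not match:
            continue
        url = match.group(1)
        parts = urlsplit(url)
        if parts.scheme in ("http", "https") and (parts.username or parts.password):
            hits.append(url)
    return hits


# Invented sample config: one remote with credentials, one without.
SAMPLE_CONFIG = """\
[core]
	bare = true
[remote "origin"]
	url = https://alice:s3cret@example.com/foo.git
[remote "mirror"]
	url = https://example.com/foo.git
"""

print(urls_with_credentials(SAMPLE_CONFIG))
```

For a bare repository like foo.git, the file to feed in would be foo.git/config.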
This is a cool technique, but please be 100% sure you're okay with exposing your Git repository and everything inside it to the outside world. The "smart" protocol is the default for a reason. But if you have a cool idea and only have public repositories anyway, go for it! c: