Mirror of the Rel4tion website/wiki source, view at <http://rel4tion.org>
23.mdwn
[[!template id=ticket class=task assigned=fr33domlover]]
[[!meta title="HTTP smart mode"]]
Issue
Git supports push and pull over HTTP. There is an old dumb mode, which simply GETs the git repo’s files, and a smart mode that works similarly to how git works over SSH.
Git includes a CGI program, `git http-backend`, which implements both modes. But it seems easy enough to implement myself. I looked at the Gogs code and it seems to do just that. However, it’s a simple HTTP(S) wrapper around the `git upload-pack` command and other backend commands. There are really 2 things I can do:
- Implement HTTP smart mode directly with Yesod (must)
- Implement `git upload-pack` using [[!hackage hit]] (cool bonus)
If I do the latter, I’ll also be able to:
- Implement a simple git protocol daemon
- Implement the SSH server component without using the git binary
Progress
Intro
I spent a while, a few weeks, working mostly on the git-upload-pack-over-SSH code. It’s unclear, very technical, boring, not precisely documented and, eventually, depressing. So I moved back to Vervis, and eventually, earlier today, I added support for git-upload-pack over SSH using the `git-upload-pack` executable.
Since the HTTP mode is stateless, at least from the server’s point of view, it may be easier to implement the protocol’s steps. Getting results faster also helps maintain motivation, and maybe I’ll even successfully implement the parts missing in my SSH code.
Before I start, what should be the git clone URL? Remember Darcs is going to be supported too. Should the repo page also be used for VCS access? It’s nice for UI, but I’m not sure it’s good for RESTfulness and forward compatibility. Since it’s very early anyway, here’s an idea: For a sharer `john` and repo `foobar`, the path `/u/john/r/foobar/<VCS-NAME>` will be the URL for VCS access. The VCS-NAME part can be `git` or `darcs`. So for now, just `git`.
DECISION: git access base URL is /u/USER/r/REPO/git
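A minimal sketch of how a path in this layout could be taken apart. It is written in Python purely for illustration (Vervis itself is Haskell/Yesod, where the router would do this), and the name `parse_vcs_path` and its return shape are hypothetical.

```python
from typing import Optional, Tuple

# VCS names the layout leaves room for; only "git" is planned for now.
KNOWN_VCS = ("git", "darcs")


def parse_vcs_path(path: str) -> Optional[Tuple[str, str, str, str]]:
    """Split /u/USER/r/REPO/VCS[/REST] into (user, repo, vcs, rest).

    Returns None when the path does not follow the layout.
    """
    parts = path.strip("/").split("/")
    if len(parts) < 5 or parts[0] != "u" or parts[2] != "r":
        return None
    user, repo, vcs = parts[1], parts[3], parts[4]
    if not user or not repo or vcs not in KNOWN_VCS:
        return None
    # Whatever follows the VCS name (e.g. "info/refs") is left to the VCS handler.
    rest = "/".join(parts[5:])
    return user, repo, vcs, rest
```

With this, `/u/john/r/foobar/git/info/refs` splits into the sharer `john`, the repo `foobar`, the VCS `git` and the remainder `info/refs`, which keeps the repo page URL free for the web UI.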
Since the smart mode is everywhere these days, I’m going to start with it and ignore the dumb mode entirely.
General
These are general requirements from the git docs. Replace `( )` with `(x)` gradually.
( ) If there is no repository at `$GIT_URL`, or the resource pointed to by
a location matching `$GIT_URL` does not exist, the server MUST NOT
respond with `200 OK` response. A server SHOULD respond with
`404 Not Found`, `410 Gone`, or any other suitable HTTP status code
which does not imply the resource exists as requested.
( ) If there is a repository at `$GIT_URL`, but access is not currently
permitted, the server MUST respond with the `403 Forbidden` HTTP status
code.
( ) Servers SHOULD support both HTTP 1.0 and HTTP 1.1.
( ) Servers SHOULD support chunked encoding for both request and response
bodies.
( ) Servers MAY return ETag and/or Last-Modified headers.
( ) Servers MAY return `304 Not Modified` if the relevant headers appear
in the request and the entity has not changed. Clients MUST treat
`304 Not Modified` identical to `200 OK` by reusing the cached entity.
( ) Clients MAY reuse a cached entity without revalidation if the
Cache-Control and/or Expires header permits caching. Clients and
servers MUST follow RFC 2616 for cache controls.
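The first two requirements boil down to a small decision about the status code. Here is a sketch in Python (for illustration only, the real handler would be Yesod code); the function name and the two boolean inputs are assumptions, standing in for whatever lookup and permission check the server ends up doing.

```python
from http import HTTPStatus


def repo_access_status(repo_exists: bool, access_permitted: bool) -> HTTPStatus:
    """Pick the status code for a request under $GIT_URL."""
    if not repo_exists:
        # MUST NOT be 200; 404 is one of the suggested codes (410 is another).
        return HTTPStatus.NOT_FOUND
    if not access_permitted:
        # The repository exists, but access is not currently permitted.
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK
```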
Ref discovery
(x) The first step, ref discovery, starts by sending a GET to `/info/refs`.
The result is plain text, so I'm adding a new route that always returns
404 for now.
(x) The request MUST contain exactly one query parameter,
`service=$servicename`, where `$servicename` MUST be the service name
the client wishes to contact to complete the operation. The request
MUST NOT contain additional query parameters.
(x) Find out how to get query params in a Yesod handler
In package `yesod-core`, module `Yesod.Core.Handler`, there's a
function `getRequest` and also several sugar functions for looking
up parameters.
(x) The message contains refs much like in SSH, but with an additional
first line.
(x) Find out what a peeled ref is and implement in `hit-network`.
Ah, I think I get it. There are lightweight tags, which just point
to commits, and annotated tags, which have an author and date and
their own SHA1 and optional GPG signature. I have both in my repos,
so I'm playing with them. It seems that:
* The SHA1 of a lightweight tag points to a *commit* object
* The SHA1 of an annotated tag points to a *tag* object
Before I proceed, wild guess. The only meaning of "peeling" I can
see here is this: When you find an annotated tag, you can read the
tag object and pick the SHA1 of the commit it refers to. Is that
what peeling means? Cool idea: I'll try on a repo. In my old `sif`
repo, I have annotated tags. Running `git-upload-pack` on it
returns ref discovery as follows (capabilities removed for
readability):
00c95dfa29d168487a11e7be741a88129a810927f178 HEAD
003f5dfa29d168487a11e7be741a88129a810927f178 refs/heads/master
00485dfa29d168487a11e7be741a88129a810927f178 refs/remotes/origin/master
003d2a065dac1ed0027405dff41da17ef58f53e2bfdf refs/tags/0.1.0
004072150789f8e4ab172c5c9a0e81cab62d26b3e287 refs/tags/0.1.0^{}
003d625f198eefc93151e0a86bd7aa2ea69f2ecd37de refs/tags/0.1.1
004085a91133a46455378f648d7231c706771864161a refs/tags/0.1.1^{}
What are the SHA1s of the tags? Let's check:
* 0.1.0 - tag
* 0.1.0^{} - commit
* 0.1.1 - tag
* 0.1.1^{} - commit
Indeed, peeling means fetching the commit pointed to by an annotated
tag. I'm implementing this in `hit-network`. The git source code
also suggests (unless I missed something) that peeling simply means
getting the SHA1 the annotated tag points to.
(x) If the server does not recognize the requested service name, or the
requested service name has been disabled by the server administrator,
the server MUST respond with the `403 Forbidden` HTTP status code.
(x) Otherwise, smart servers MUST respond with the smart server reply
format for the requested service name.
(x) Cache-Control headers SHOULD be used to disable caching of the
returned entity.
(x) The Content-Type MUST be `application/x-$servicename-advertisement`.
Clients SHOULD fall back to the dumb protocol if another content type
is returned. When falling back to the dumb protocol clients SHOULD NOT
make an additional request to `$GIT_URL/info/refs`, but instead SHOULD
use the response already in hand. Clients MUST NOT continue if they do
not support the dumb protocol.
(x) Clients MUST verify the first pkt-line is `# service=$servicename`.
Servers MUST set $servicename to be the request parameter value.
Servers SHOULD include an LF at the end of this line.
Clients MUST ignore an LF at the end of the line.
(x) Servers MUST terminate the response with the magic `0000` end
pkt-line marker.
(x) The returned response is a pkt-line stream describing each ref and
its known value. The stream SHOULD be sorted by name according to
the C locale ordering. The stream SHOULD include the default ref
named `HEAD` as the first ref. The stream MUST include capability
declarations behind a NUL on the first ref.
See http-protocol.txt for the BNF of the message.
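To make the checklist concrete, here is a sketch of building the response body for `GET info/refs?service=git-upload-pack`. It is Python for illustration only, not the `hit-network` code; `pkt_line` and `advertise_refs` are hypothetical names, and the `(name, sha1, peeled_sha1)` tuple shape is an assumption. The flush-pkt right after the service line is what git's own backend sends as far as I know, so check it against the BNF in http-protocol.txt. The sketch assumes at least one ref exists.

```python
from typing import List, Optional, Tuple

FLUSH_PKT = b"0000"
MAX_PKT_LEN = 65520  # pkt-line limit per git's protocol docs, length digits included


def pkt_line(payload: bytes) -> bytes:
    """Prefix payload with its length as 4 hex digits (the 4 digits count too)."""
    length = len(payload) + 4
    if length > MAX_PKT_LEN:
        raise ValueError("payload too long for one pkt-line")
    return b"%04x" % length + payload


def advertise_refs(
    service: str,
    refs: List[Tuple[str, str, Optional[str]]],
    capabilities: List[str],
) -> bytes:
    """Build the smart reply body; each ref is (name, sha1, peeled_sha1).

    peeled_sha1 is the commit an annotated tag points to, or None for
    refs that are not annotated tags.
    """
    if not refs:
        raise ValueError("this sketch does not handle empty repositories")
    out = [pkt_line(b"# service=" + service.encode("ascii") + b"\n"), FLUSH_PKT]
    # HEAD goes first; the rest is sorted by name in byte (C locale) order.
    ordered = sorted(refs, key=lambda r: (r[0] != "HEAD", r[0].encode("ascii")))
    for index, (name, sha1, peeled) in enumerate(ordered):
        line = sha1.encode("ascii") + b" " + name.encode("ascii")
        if index == 0:
            # Capabilities sit behind a NUL on the first ref only.
            line += b"\0" + " ".join(capabilities).encode("ascii")
        out.append(pkt_line(line + b"\n"))
        if peeled is not None:
            # Peeled entry: same name with ^{} appended, pointing at the commit.
            peeled_line = peeled.encode("ascii") + b" " + name.encode("ascii") + b"^{}\n"
            out.append(pkt_line(peeled_line))
    out.append(FLUSH_PKT)
    return b"".join(out)
```

Feeding it the `sif` refs from the notes reproduces the same length prefixes, e.g. `003f` for `refs/heads/master` and `0040` for the peeled `refs/tags/0.1.0^{}` line. The body would be served with Content-Type `application/x-git-upload-pack-advertisement`.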
Upload pack
The docs say the request step contains only WANT and HAVE lines. But I’m trying some git commands, and the content sent seems to be much like in the SSH case. In fact the same git command, and specifically the same C function in the git source, handles both cases, and it doesn’t even seem to check which case it is.
At least the HTTP transport case seems to work in steps. In each step, the client sends at most 32 HAVE lines. The idea is to keep sending until a common commit is found or something like that, but overall, after sending 256 HAVEs without confirmation, the client gives up.
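The stepping described above can be sketched as a generator that hands out HAVE ids in rounds. This is Python for illustration and only models the two numbers mentioned here (32 per step, 256 in total); the real git client's negotiation is more involved, and `have_rounds` is a hypothetical name.

```python
from typing import Iterable, Iterator, List

MAX_HAVES_PER_ROUND = 32  # at most this many HAVE lines per step
GIVE_UP_AFTER = 256       # total HAVEs sent before the client gives up


def have_rounds(candidates: Iterable[str]) -> Iterator[List[str]]:
    """Yield batches of at most 32 object ids, stopping after 256 ids overall."""
    batch: List[str] = []
    sent = 0
    for sha1 in candidates:
        if sent == GIVE_UP_AFTER:
            break
        batch.append(sha1)
        sent += 1
        if len(batch) == MAX_HAVES_PER_ROUND:
            yield batch
            batch = []
    if batch:
        # Final, shorter round when the candidates run out early.
        yield batch
```

A caller would send one batch per POST and stop iterating as soon as the server confirms a common commit, so the 256 limit only matters when no confirmation ever arrives.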
I want to try some commands and examine the request.
$ git clone http://dev.rel4tion.org/u/dummy/r/some-repo/git
GET /u/dummy/r/some-repo/git/info/refs
Params: [("service","git-upload-pack")]
Accept: */*
POST /u/dummy/r/some-repo/git/git-upload-pack
Request Body:
0042want 46cf4f38fd6bf7b9791362680485d2915634c085 agent=git/1.9.1
0032want 46cf4f38fd6bf7b9791362680485d2915634c085
0032want 5931f869783fbd0301c8d6384b14a8eb91cbdca2
0032want 804015721ac0a1bb8863b2add145d6e0533d4ffa
0032want fa6473b7df0d6376e093f7e4ac4f513c9492d2b8
0032want 2fcf3ce929a174575c7f88d588f3c36c4ccb24b0
0032want 3ae3dbcb14fb452c4b5ed62a8408825a6a46bb3a
0032want 092cebadc64f23f2206a9a9e1655ae964d4a66e8
00000009done
Accept: application/x-git-upload-pack-result
So what we have here is:
- First want line with capabilities appended
- More want lines
- Flush-pkt
- “done” pkt-line
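The request body above is a plain sequence of pkt-lines, so it can be split mechanically: 4 hex digits give the total length (the digits themselves included), and `0000` is the flush-pkt. A Python sketch for illustration; `read_pkt_lines` is a hypothetical helper, not a `hit-network` function.

```python
import re
from typing import List, Optional

_LENGTH_RE = re.compile(rb"[0-9a-fA-F]{4}")


def read_pkt_lines(data: bytes) -> List[Optional[bytes]]:
    """Split a byte string into pkt-line payloads; None marks a flush-pkt."""
    packets: List[Optional[bytes]] = []
    pos = 0
    while pos < len(data):
        header = data[pos:pos + 4]
        if not _LENGTH_RE.fullmatch(header):
            raise ValueError("bad or truncated pkt-line length")
        length = int(header.decode("ascii"), 16)
        if length == 0:
            packets.append(None)  # flush-pkt carries no payload
            pos += 4
            continue
        if length < 4:
            raise ValueError("invalid pkt-line length")
        if pos + length > len(data):
            raise ValueError("truncated pkt-line payload")
        # The length counts the 4 header bytes, so the payload is length - 4 bytes.
        packets.append(data[pos + 4:pos + length])
        pos += length
    return packets
```

For example, the first line of the capture is `0042`, i.e. 66 bytes: 4 for the length, 5 for `want `, 40 for the SHA1, 1 for the space, 15 for `agent=git/1.9.1` and 1 for the trailing LF. The final `00000009done` splits into a flush-pkt followed by the 9-byte `done` line.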
$ git clone http://dev.rel4tion.org/u/dummy/r/some-repo/git --depth 5
GET /u/dummy/r/some-repo/git/info/refs
Params: [("service","git-upload-pack")]
Accept: */*
POST /u/dummy/r/some-repo/git/git-upload-pack
Request Body:
0042want 46cf4f38fd6bf7b9791362680485d2915634c085 agent=git/1.9.1
0032want 46cf4f38fd6bf7b9791362680485d2915634c085
000cdeepen 50000
Accept: application/x-git-upload-pack-result
What we have this time is:
- First want with capabilities
- Another want; this time the wants refer only to HEAD and master
- deepen 5 pkt-line
- Flush-pkt
The shallow part is irrelevant because it won’t work until we advertise that we support shallow clients in the capabilities.
Anyway, merely supporting git-clone will already be a great achievement. I want to teach hit-network to handle a request which contains:
- 1 or more wants, with the first one listing capabilities
- Flush-pkt
- done
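The target request shape can be sketched as a small validator over already-decoded pkt-lines, where each payload is a byte string and a flush-pkt is represented as `None`. This is Python for illustration, not the `hit-network` API; `UploadRequest` and `parse_clone_request` are hypothetical names, and rejecting everything else (HAVE lines, `deepen`) is a simplification for the clone-only case.

```python
from typing import List, NamedTuple, Optional

HEX_DIGITS = set("0123456789abcdef")


class UploadRequest(NamedTuple):
    wants: List[str]
    capabilities: List[str]


def parse_clone_request(packets: List[Optional[bytes]]) -> UploadRequest:
    """Accept exactly: 1 or more want lines, a flush-pkt (None), then done."""
    if None not in packets:
        raise ValueError("missing flush-pkt")
    flush_at = packets.index(None)
    want_pkts, tail = packets[:flush_at], packets[flush_at + 1:]
    if not want_pkts:
        raise ValueError("expected at least one want line")
    if tail not in ([b"done\n"], [b"done"]):
        raise ValueError("expected a single done line after the flush-pkt")
    wants: List[str] = []
    capabilities: List[str] = []
    for index, pkt in enumerate(want_pkts):
        fields = pkt.decode("ascii").rstrip("\n").split(" ")
        if len(fields) < 2 or fields[0] != "want":
            raise ValueError("malformed want line")
        sha1 = fields[1]
        if len(sha1) != 40 or not set(sha1) <= HEX_DIGITS:
            raise ValueError("malformed object id")
        if index == 0:
            # Only the first want line carries the capability list.
            capabilities = fields[2:]
        elif len(fields) > 2:
            raise ValueError("capabilities are only allowed on the first want")
        wants.append(sha1)
    return UploadRequest(wants, capabilities)
```

The plain `git clone` capture passes this check, while the `--depth 5` capture is rejected because it carries a `deepen` line and no `done`.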
Result
Not done yet.