These days I’m building a gemini server called gmid.

Original name, huh?

(I hope I’m not stealing the name from someone else)

gmid is a simple, zeroconf gemini server for static content. At the moment it doesn’t even handle virtual hosts, but it’s working fine for my use-case (serving a statically-generated blog).

=> https://git.omarpolo.com/gmid gmid repo
=> https://github.com/omar-polo/gmid github mirror

I had a lot of fun writing it, so I thought I’d write a post describing the various implementation choices. I don’t know about you, but I usually don’t have the chance to write a server :)

## Where the journey begins

=> https://git.omarpolo.com/gmid/tree/gmid.c?id=4d4f0e19acf862d139c9864de8510c21b5538e9c First (running) version

(it’s not technically the first commit because of a name clash on sendfile(2) on Linux/FreeBSD that prevented the server from compiling)

The first version, with its 416 lines of code, was, surprisingly, usable. It’s a dead-simple implementation that uses blocking I/O on a single thread, so it’s not exactly the fastest implementation out there, and it always used the “text/gemini” MIME type for every response, but hey, it worked.

### A brief excursus on libtls

I’m happy with the choice of libtls: if you read the source code (the main, loop and send_file functions in particular) you’ll see that it’s almost like the usual file API, just with a tls_ prefix in front.

libtls needs to be initialized:
```
/* excerpt from main */
struct tls *ctx = NULL;
struct tls_config *conf;

if ((conf = tls_config_new()) == NULL)
	err(1, "tls_config_new");

if (tls_config_set_cert_file(conf, "cert.pem") == -1)
	err(1, "tls_config_set_cert_file: %s", "cert.pem");

if (tls_config_set_key_file(conf, "key.pem") == -1)
	err(1, "tls_config_set_key_file: %s", "key.pem");

if ((ctx = tls_server()) == NULL)
	err(1, "tls_server");

if (tls_configure(ctx, conf) == -1)
	errx(1, "tls_configure: %s", tls_error(ctx));
```
Then a ctx needs to be allocated for every client:
```
/* from loop() */
struct tls *clientctx;

for (;;) {
	len = sizeof(client);
	if ((fd = accept(sock, (struct sockaddr*)&client, &len)) == -1)
		err(1, "accept");

	if (tls_accept_socket(ctx, &clientctx, fd) == -1) {
		warnx("tls_accept_socket: %s", tls_error(ctx));
		close(fd);
		continue;
	}

	/* XXX: handle the client */
	tls_close(clientctx);
	tls_free(clientctx);
	close(fd);
}
```
And then we can use tls_write, tls_read and tls_close as you’d expect:
```
/* from send_file */
while (w > 0) {
	if ((t = tls_write(ctx, buf + i, w)) == -1) {
		warnx("tls_write (path=%s) : %s", fpath, tls_error(ctx));
		return;
	}
	w -= t;
	i += t;
}
```
I don’t know what the OpenSSL API for TLS is like to use, but I really like the libtls interface (and its documentation!)

## poll(2) to the rescue

=> https://git.omarpolo.com/gmid/tree/gmid.c?id=592fd6245350595319e338ef49984a443b818f16 poll-based event loop

Having a working server is neat, but having a working server that can handle more than one client at the same time is even better.

One can use threads, libevent, libev, or other libraries to handle multiple clients, but one of the main points of this project, other than having fun, was to keep it simple. For that reason, I excluded pthreads, libevent and other libraries. You can use kqueue directly, but that’s only for the BSDs. Or you can use epoll or one of the thousand alternatives, but that’s Linux-only.

What remains? In POSIX, AFAIK, only select(2) and poll(2). But select(2) is ugly, so I went with poll(2).
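
To give an idea of what poll(2) looks like in practice, here’s a minimal sketch. It isn’t gmid’s actual loop: first_readable, poll_demo and MAX_FDS are names made up for this example, and a pair of pipes stands in for the client sockets.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_FDS 16	/* arbitrary limit, just for the example */

/*
 * Hypothetical helper: wait up to timeout_ms for one of the n
 * descriptors to become readable.  Returns the index of the first
 * ready descriptor, or -1 on timeout or error.
 */
int
first_readable(const int *fds, size_t n, int timeout_ms)
{
	struct pollfd pfds[MAX_FDS];
	size_t i;

	assert(n <= MAX_FDS);

	for (i = 0; i < n; i++) {
		pfds[i].fd = fds[i];
		pfds[i].events = POLLIN;	/* we only care about reads */
		pfds[i].revents = 0;
	}

	if (poll(pfds, (nfds_t)n, timeout_ms) <= 0)
		return -1;

	for (i = 0; i < n; i++)
		if (pfds[i].revents & (POLLIN | POLLHUP))
			return (int)i;
	return -1;
}

/* Two pipes play the role of two clients; only the second has data. */
int
poll_demo(void)
{
	int a[2], b[2], fds[2], idx = -1;

	if (pipe(a) == -1)
		return -1;
	if (pipe(b) == -1) {
		close(a[0]);
		close(a[1]);
		return -1;
	}

	if (write(b[1], "x", 1) == 1) {
		fds[0] = a[0];
		fds[1] = b[0];
		idx = first_readable(fds, 2, 1000);
	}

	close(a[0]);
	close(a[1]);
	close(b[0]);
	close(b[1]);
	return idx;
}
```

In a server, the array would hold the listening socket plus one entry per connected client, and the loop would call poll again after serving whoever is ready.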

The challenge here was to rewrite the code to handle asynchronous I/O.

When you’re dealing with synchronous I/O you call write(2), the call blocks, and when it has finished it returns, and you call write(2) again. But when you’re doing asynchronous I/O you try to write something, the kernel may tell you that the socket isn’t ready for writing, and so you have to wait and retry later. The advantage is that in the meantime you can handle other clients, improving the throughput of your program.
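
Here’s a small sketch of what “the kernel tells you the socket wasn’t ready” means. It’s not code from gmid: set_nonblocking, fill_socket and nonblocking_demo are hypothetical names, and a socketpair(2) replaces a real network connection. Nobody reads from the other end, so sooner or later the kernel buffer fills up and write(2) fails with EAGAIN instead of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put fd in non-blocking mode.  Returns 0 on success, -1 on error. */
int
set_nonblocking(int fd)
{
	int flags;

	if ((flags = fcntl(fd, F_GETFL)) == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Keep writing until write(2) fails.  Returns 1 if it stopped because
 * the socket would have blocked, 0 on any other error.  The number of
 * bytes the kernel accepted is stored in *total.
 */
int
fill_socket(int fd, size_t *total)
{
	char chunk[4096] = {0};
	ssize_t n;

	assert(total != NULL);

	*total = 0;
	for (;;) {
		if ((n = write(fd, chunk, sizeof(chunk))) == -1)
			return errno == EAGAIN || errno == EWOULDBLOCK;
		*total += (size_t)n;
	}
}

/* Returns 1 if some bytes were buffered and then the write would block. */
int
nonblocking_demo(void)
{
	int sv[2], blocked = 0;
	size_t total = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;

	if (set_nonblocking(sv[0]) == 0)
		blocked = fill_socket(sv[0], &total);

	close(sv[0]);
	close(sv[1]);
	return blocked && total > 0;
}
```

A real server wouldn’t spin like fill_socket does: on EAGAIN it would remember how far it got, ask poll(2) for POLLOUT on that descriptor, and go serve someone else.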

In my case it meant that send_file, the function that does most of the work, can now be “suspended” and “resumed”. That meant that I now have a state machine for every client, which goes like this:
```
                            ,--------.
client opens a connection ->| S_OPEN |
                            `--------'
                                |
                     client sends the request
                                |
                                v
                        ,----------------.
                        | S_INITIALIZING |  send the response header
                        `----------------'
                                |
                         the response has
                             a body?
                           /          \
                        no             yes
                         |              |
                         |              v
                         |        ,-----------.
                         |        | S_SENDING |  send the whole file
                         |        `-----------'
                         v              |
                   ,-----------.        /
                   | S_CLOSING |<------'
                   `-----------'
```
(I couldn’t resist drawing a diagram, I love ASCII diagrams)

I had to use all those states because at any point the network buffer may be full and we may need to resume from that point later. DFAs are a simple way to code this, so I went with them.

There was an interesting bug, where I didn’t change the state from S_INITIALIZING to S_SENDING, and a chunk of the file was transmitted twice. It was hard to find because it would only happen with “big files” (i.e. images), because the pages I’m serving all fit in the network buffer that the kernel allocates.
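
To show how such a state machine can be suspended and resumed, here’s a self-contained sketch. The state names are the ones from the diagram; everything else (struct client, struct sink, send_part, client_step, the room parameter) is invented for the example and is not how gmid is actually written. The sink is a fake socket that accepts at most room bytes per call, which simulates a full network buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum state { S_OPEN, S_INITIALIZING, S_SENDING, S_CLOSING };

struct client {
	enum state	 state;
	const char	*header;	/* response header */
	const char	*body;		/* file content, NULL if none */
	size_t		 off;		/* bytes of the current part sent */
};

/* Hypothetical stand-in for a non-blocking socket. */
struct sink {
	char	buf[256];
	size_t	len;
};

/* Accept at most room bytes, like a short write would. */
static size_t
sink_write(struct sink *s, const char *data, size_t len, size_t room)
{
	size_t n = len < room ? len : room;

	assert(s->len + n <= sizeof(s->buf));
	memcpy(s->buf + s->len, data, n);
	s->len += n;
	return n;
}

/* Resume sending part from c->off; returns 1 once it is all out. */
static int
send_part(struct client *c, struct sink *s, const char *part, size_t room)
{
	size_t len = strlen(part);

	c->off += sink_write(s, part + c->off, len - c->off, room);
	return c->off == len;
}

/* Advance the client by one step; call again when it can make progress. */
void
client_step(struct client *c, struct sink *s, size_t room)
{
	switch (c->state) {
	case S_OPEN:
		/* pretend the request has been read */
		c->state = S_INITIALIZING;
		break;
	case S_INITIALIZING:
		if (send_part(c, s, c->header, room)) {
			c->off = 0;
			c->state = c->body != NULL ? S_SENDING : S_CLOSING;
		}
		break;
	case S_SENDING:
		if (send_part(c, s, c->body, room))
			c->state = S_CLOSING;
		break;
	case S_CLOSING:
		break;
	}
}

/* Drive one client with a sink that takes only 5 bytes at a time. */
int
state_machine_demo(void)
{
	struct client c = { S_OPEN, "20 text/gemini\r\n", "# hello\n", 0 };
	struct sink s = { {0}, 0 };
	int steps = 0;

	while (c.state != S_CLOSING && steps++ < 100)
		client_step(&c, &s, 5);

	return c.state == S_CLOSING &&
	    strcmp(s.buf, "20 text/gemini\r\n# hello\n") == 0;
}
```

The important part is that every step saves its progress (the offset and the state) before returning, so it doesn’t matter how many times the write comes up short. Skipping a transition or forgetting to update the offset is exactly the kind of mistake that makes some bytes go out twice.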

## More features

Then I added support for MIME types (the server looks at the file extension and chooses an appropriate MIME type) and used memory-mapped I/O to read local files (it should make better use of the kernel VM subsystem).
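
A lookup by file extension can be sketched in a few lines. The table below, the mime_for_path name and the “application/octet-stream” fallback are assumptions made for the example; the real list of types in gmid may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mime_entry {
	const char	*ext;
	const char	*type;
};

/* Illustrative table, not gmid's actual one. */
static const struct mime_entry mime_table[] = {
	{ "gmi", "text/gemini" },
	{ "txt", "text/plain" },
	{ "png", "image/png" },
	{ "jpg", "image/jpeg" },
};

/* Pick a MIME type from the extension of the last path component. */
const char *
mime_for_path(const char *path)
{
	const char *slash, *dot;
	size_t i;

	assert(path != NULL);

	/* only look for the dot after the last slash */
	slash = strrchr(path, '/');
	dot = strrchr(slash != NULL ? slash : path, '.');
	if (dot == NULL)
		return "application/octet-stream";

	for (i = 0; i < sizeof(mime_table) / sizeof(mime_table[0]); i++)
		if (strcmp(dot + 1, mime_table[i].ext) == 0)
			return mime_table[i].type;

	return "application/octet-stream";
}
```

With this table, “blog/post.gmi” maps to text/gemini, while a file without an extension (even inside a directory like “v1.0/”) falls back to the default.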

I’m now trying to support both IPv4 and IPv6, and then I’ll take a look at implementing virtual hosts.