These days I’m building a gemini server called gmid.

Original name, uh?

(I hope I’m not stealing the name from someone else)

gmid is a simple, zeroconf gemini server for static content. At the moment it doesn’t even handle virtual hosts, but it works fine for my use case (serving a statically generated blog).

=> https://git.omarpolo.com/gmid gmid repo
=> https://github.com/omar-polo/gmid github mirror

I had a lot of fun writing it, so I thought I’d write a post describing various implementation choices. I don’t know about you, but I don’t often get the chance to write a server :)


## Where the journey begins

=> https://git.omarpolo.com/gmid/tree/gmid.c?id=4d4f0e19acf862d139c9864de8510c21b5538e9c First (running) version

(it’s not technically the first commit, because a name clash with sendfile(2) on Linux/FreeBSD prevented the server from compiling)

The first version, with its 416 lines of code, was, surprisingly, usable. It’s a dead-simple implementation that uses blocking I/O on a single thread, so it’s not exactly the fastest implementation out there, and it used the “text/gemini” MIME type for every response, but hey, it worked.


### A brief excursus on libtls

I’m happy with the choice of libtls: if you read the source code (the main, loop and send_file functions in particular) you’ll see that it’s almost like the usual file API, except for a tls_ prefix on the function names.

libtls needs to be initialized:

```
/* excerpt from main; needs <err.h> and <tls.h> */
/* file names hard-coded here for brevity */
const char *cert = "cert.pem", *key = "key.pem";
struct tls *ctx = NULL;
struct tls_config *conf;

if ((conf = tls_config_new()) == NULL)
	err(1, "tls_config_new");

if (tls_config_set_cert_file(conf, cert) == -1)
	err(1, "tls_config_set_cert_file: %s", cert);

if (tls_config_set_key_file(conf, key) == -1)
	err(1, "tls_config_set_key_file: %s", key);

if ((ctx = tls_server()) == NULL)
	err(1, "tls_server");

if (tls_configure(ctx, conf) == -1)
	errx(1, "tls_configure: %s", tls_error(ctx));
```

Then it needs to allocate a context for every client:

```
/* from loop() */
int fd;
struct tls *clientctx;

/* … */

if ((fd = accept(sock, (struct sockaddr*)&client, &len)) == -1)
	err(1, "accept");

if (tls_accept_socket(ctx, &clientctx, fd) == -1) {
	warnx("tls_accept_socket: %s", tls_error(ctx));
	close(fd);	/* don't leak the socket on failure */
	continue;
}
/* XXX: handle the client */
tls_close(clientctx);
tls_free(clientctx);
close(fd);
```

And then we can use tls_write, tls_read and tls_close as you may imagine:

```
/* from send_file */
while (w > 0) {
	t = tls_write(ctx, buf + i, w);
	/* not an error: libtls asks us to simply retry the call */
	if (t == TLS_WANT_POLLIN || t == TLS_WANT_POLLOUT)
		continue;
	if (t == -1) {
		warnx("tls_write (path=%s) : %s", fpath, tls_error(ctx));
		goto exit;
	}
	w -= t;
	i += t;
}
```
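
For comparison, here is roughly what the same loop looks like over a plain file descriptor with write(2). It’s only a sketch to show how close the two APIs are: write_all is a made-up helper name, not something from gmid, and the EINTR retry is an addition of this example.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * write_all is a hypothetical helper (not part of gmid): it keeps
 * calling write(2) until all len bytes of buf have been written.
 * Returns 0 on success, -1 on error with errno set by write(2).
 */
int
write_all(int fd, const char *buf, size_t len)
{
	size_t off = 0;
	ssize_t n;

	while (off < len) {
		if ((n = write(fd, buf + off, len - off)) == -1) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal: retry */
			return -1;
		}
		off += (size_t)n;	/* a short write just advances the offset */
	}
	return 0;
}
```

Swap write for tls_write and the file descriptor for a struct tls pointer and you are more or less back at the send_file excerpt.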

I don’t know what it’s like to use the OpenSSL API for TLS, but I really like the libtls interface (and its documentation!)


## poll(2) to the rescue

=> https://git.omarpolo.com/gmid/tree/gmid.c?id=592fd6245350595319e338ef49984a443b818f16 poll-based event loop

Having a working server is neat, but having a working server that can handle more than one client at the same time is even better.

One can use kqueue, libevent, libev, or other libraries to handle multiple clients, but one of the main points of this project, other than having fun, was to keep it simple. For that reason I excluded pthread, libevent and other libraries. You can use kqueue directly, but it’s only available on the BSDs. Or you can use epoll or one of the thousand alternatives, but that’s Linux-only.

What remains? In POSIX, AFAIK, only select(2) and poll(2). But select(2) is ugly, so I went with poll(2).
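
To give an idea of what poll(2) looks like in practice, here is a minimal sketch that is not taken from gmid. The helper read_first_ready and its 16-descriptor limit are assumptions made up for this example: it waits until one of a few descriptors is readable and reads from the first one that is.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAXFDS 16	/* arbitrary limit for this sketch */

/*
 * read_first_ready is a hypothetical helper: wait up to timeout_ms
 * milliseconds for any of the nfds descriptors to become readable,
 * then read from the first ready one into buf.  Returns the index of
 * that descriptor (with *nread set to the byte count), or -1 on
 * timeout or error.
 */
int
read_first_ready(const int *fds, int nfds, char *buf, size_t len,
    ssize_t *nread, int timeout_ms)
{
	struct pollfd pfds[MAXFDS];
	int i;

	if (nfds < 1 || nfds > MAXFDS)
		return -1;

	for (i = 0; i < nfds; i++) {
		pfds[i].fd = fds[i];
		pfds[i].events = POLLIN;	/* we only care about reads */
		pfds[i].revents = 0;
	}

	/* poll returns 0 on timeout and -1 on error */
	if (poll(pfds, (nfds_t)nfds, timeout_ms) <= 0)
		return -1;

	for (i = 0; i < nfds; i++) {
		if (pfds[i].revents & (POLLIN | POLLHUP)) {
			*nread = read(fds[i], buf, len);
			return *nread == -1 ? -1 : i;
		}
	}
	return -1;
}
```

A real server would keep the pollfd array around between iterations instead of rebuilding it on every call, but the shape of the loop is the same.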

The challenge here was to rewrite the code to handle asynchronous I/O.

When you’re dealing with synchronous I/O you call write(2), the call blocks, and when it has finished it returns and you can write(2) again. With asynchronous I/O you try to write something, the kernel may tell you that the socket isn’t ready for writing, and you have to wait and retry later. The advantage is that in the meantime you can handle other clients, improving the throughput of your program.
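
The “retry later” part can be sketched like this. It is not gmid’s code: set_nonblocking and try_write are hypothetical helpers, and returning 0 for “not ready yet” is a convention chosen for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Put fd in non-blocking mode.  Returns 0 on success, -1 on error. */
int
set_nonblocking(int fd)
{
	int flags;

	if ((flags = fcntl(fd, F_GETFL)) == -1)
		return -1;
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return -1;
	return 0;
}

/*
 * Attempt a single write(2) on a non-blocking descriptor.
 * Returns the number of bytes written (possibly fewer than len),
 * 0 if the descriptor isn't ready and the caller should wait for
 * POLLOUT and retry, or -1 on a real error.
 */
ssize_t
try_write(int fd, const char *buf, size_t len)
{
	ssize_t n;

	if ((n = write(fd, buf, len)) == -1) {
		if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
			return 0;	/* not ready: go serve someone else */
		return -1;
	}
	return n;
}
```

When try_write returns 0 the caller remembers where it stopped, asks poll(2) for POLLOUT on that descriptor, and moves on to the other clients.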

In my case it meant that send_file, the function that does most of the work, can now be “suspended” and “resumed”. So now I have a state machine for every client, which goes like this:

```
                             ,--------.
 client open a connection -> | S_OPEN |
                             `--------'
                                  |
                       client sends the request
                                  |
                                  v
                         ,----------------.
                         | S_INITIALIZING | send the response header
                         `----------------'
                                  |
                           the response has
                               a body?
                              /       \
                             /         \
                            /           \
                          no            yes
                          |              |
                          |              v
                          |        ,-----------.
                          |        | S_SENDING | send the whole file
                          |        `-----------'
                          |           /
                          v          /
                    ,-----------.   /
                    | S_CLOSING |<-'
                    `-----------'
```

(I couldn’t resist drawing a diagram, I love ASCII diagrams)

I had to use all those states because at any point the network buffer may be full and we may need to resume from that point later. DFAs (deterministic finite automata) are a simple way to code this, so I went with them.
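
Here is a rough sketch of the diagram above in C. The state names come from the diagram; everything else (struct client, enum event, next_state) is invented for illustration and is not how gmid actually lays out its code.

```c
#include <stddef.h>

enum state { S_OPEN, S_INITIALIZING, S_SENDING, S_CLOSING };

/* hypothetical events reported by the I/O layer */
enum event { E_REQUEST, E_HEADER_SENT, E_CHUNK_SENT, E_WOULD_BLOCK };

/* hypothetical per-client bookkeeping */
struct client {
	enum state state;
	int has_body;	/* non-zero if the response carries a file */
	size_t off;	/* body bytes sent so far */
	size_t len;	/* total body length */
};

/*
 * Advance the client by one event; n is the number of body bytes
 * accepted by the socket (only meaningful for E_CHUNK_SENT).
 * E_WOULD_BLOCK never changes the state: that is the "suspend",
 * and the next event resumes from exactly the same point.
 */
enum state
next_state(struct client *c, enum event ev, size_t n)
{
	switch (c->state) {
	case S_OPEN:
		if (ev == E_REQUEST)
			c->state = S_INITIALIZING;
		break;
	case S_INITIALIZING:
		if (ev == E_HEADER_SENT)
			c->state = c->has_body ? S_SENDING : S_CLOSING;
		break;
	case S_SENDING:
		if (ev == E_CHUNK_SENT) {
			c->off += n;
			if (c->off >= c->len)
				c->state = S_CLOSING;
		}
		break;
	case S_CLOSING:
		break;	/* nothing left to do but close */
	}
	return c->state;
}
```

The event loop would call something like next_state every time poll(2) reports the client’s socket as ready, and close the connection once it reaches S_CLOSING.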

There was an interesting bug where I didn’t change the state from S_INITIALIZING to S_SENDING, and a chunk of the file was transmitted twice. It was hard to find because it only happened with “big files” (i.e. images): the pages I’m serving all fit in the network buffer that the kernel allocates.


## More features

Then I added support for MIME types (the server looks at the file extension and chooses an appropriate MIME type) and used memory-mapped I/O to read local files (it should make better use of the kernel VM subsystem).
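
The extension lookup could look something like the following sketch. The table contents, the mime_for name and the application/octet-stream fallback are assumptions of this example, not gmid’s actual table.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* hypothetical extension table, for illustration only */
static const struct {
	const char *ext;
	const char *mime;
} mime_table[] = {
	{ "gmi",	"text/gemini" },
	{ "gemini",	"text/gemini" },
	{ "txt",	"text/plain" },
	{ "md",		"text/markdown" },
	{ "html",	"text/html" },
	{ "png",	"image/png" },
	{ "jpg",	"image/jpeg" },
	{ "jpeg",	"image/jpeg" },
	{ "gif",	"image/gif" },
};

/*
 * Pick a MIME type from the extension of the last path component.
 * The comparison is case-insensitive; files without an extension
 * (or with an unknown one) get a generic fallback.
 */
const char *
mime_for(const char *path)
{
	const char *base, *dot;
	size_t i;

	/* only look at the file name, not at dots in directories */
	base = strrchr(path, '/');
	base = base != NULL ? base + 1 : path;

	dot = strrchr(base, '.');
	if (dot == NULL || dot == base)
		return "application/octet-stream";

	for (i = 0; i < sizeof(mime_table) / sizeof(mime_table[0]); i++)
		if (strcasecmp(dot + 1, mime_table[i].ext) == 0)
			return mime_table[i].mime;

	return "application/octet-stream";
}
```

The returned string then goes into the response header in place of the hard-coded “text/gemini”.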

I’m now trying to support both IPv4 and IPv6, and then I’ll take a look at implementing virtual hosts.