Blame


1 c418ae42 2021-02-13 op EDIT 2021/02/05: typos

Some daemons are able to restart themselves. I mean, a real in-place restart, not a naïve external stop+re-exec.

Why would you care if a daemon is able to restart in place or not? Well, it depends. For some daemons it is almost a necessary feature (think of sshd: would you be happy if restarting the daemon shut down every ongoing connection? I wouldn’t), for others it’s a nice-to-have feature (httpd, for instance), while in some cases it’s an unnecessary complication.

Generally speaking, and with varying degrees of importance, being able to restart in place is a good thing for network-related daemons. It means that you (the server administrator) can adjust things while the daemon is running, and this is almost invisible to the outside world: ongoing connections are preserved and new connections are subject to the new set of rules.

I just implemented something similar for gmid, my Gemini server, but I guess the overall design can be used in various kinds of daemons.

=> gemini://gemini.omarpolo.com/pages/gmid.gmi gmid

The solution I chose was to keep a parent process that, on SIGHUP, re-reads the configuration and fork(2)s to execute the real daemon code. The other processes, on SIGHUP, simply stop accepting new connections and finish processing what they have.

Doing it this way simplifies things when you take into consideration that the daemon may want to chroot itself, apply any other kind of sandboxing, drop privileges and so on, since the main process remains outside the chroot/sandbox with the original privileges. It also isn’t a security concern, since all it does is wait for a signal (in other words, it cannot be influenced by the outside world).

One thing to be wary of is race conditions induced by signal handlers. Consider this bit of code:

```
/* 1 when SIGHUP is received, 0 otherwise.
 * This var is shared with the children. */
volatile sig_atomic_t hupped;

/* … */

for (;;) {
	hupped = 0;

	switch (fork()) {
	case 0:
		return daemon_main();
	}

	wait_sighup();
	/* after this point hupped is 1 */
	reload_config();
}
```

You see the problem?

(spoiler: the reload_config call is there only to trick you)

We set ‘hupped’ to 0 before we fork, so that our child starts with hupped set to 0, then we fork and wait. But what if we receive a SIGHUP after we set the variable to 0, but before the fork? Or right before wait_sighup? The children would exit and the main process would get stuck waiting for a SIGHUP that was already delivered.

Oh, and guarding the wait_sighup won’t work either:

```
if (!hupped) {
	/* what happens if SIGHUP gets delivered
	 * here, before the wait? */
	wait_sighup();
}
```

Fortunately, we can block signals with sigprocmask and wait for specific signals with sigwait.

=> gemini://gemini.omarpolo.com/cgi/man?sigprocmask sigprocmask(2)
=> gemini://gemini.omarpolo.com/cgi/man?sigwait sigwait(2)

Frankly, I had never used these “advanced” signal APIs before, as the “simplified” interface was usually enough, but it’s nice to learn new stuff.

The right order should be:
* block all signals
* fork
* in the child, re-enable signals
* in the parent, wait for SIGHUP
* re-enable signals
* repeat

or, if you prefer some real code, something along the lines of:

```C
#include <signal.h>
#include <unistd.h>

/* Signal mask that was in effect before block_signals() was called. */
sigset_t set;

/* Block SIGHUP, saving the previous mask in `set'. */
void
block_signals(void)
{
	sigset_t new;

	sigemptyset(&new);
	sigaddset(&new, SIGHUP);
	sigprocmask(SIG_BLOCK, &new, &set);
}

/* Restore the mask saved by block_signals(). */
void
unblock_signals(void)
{
	sigprocmask(SIG_SETMASK, &set, NULL);
}

/* Sleep until a SIGHUP is pending, then consume it. */
void
wait_sighup(void)
{
	sigset_t mask;
	int signo;

	sigemptyset(&mask);
	sigaddset(&mask, SIGHUP);
	sigwait(&mask, &signo);
}

/* … */

volatile sig_atomic_t hupped;

/* … */

for (;;) {
	/* a SIGHUP arriving from now on stays pending */
	block_signals();
	hupped = 0;

	switch (fork()) {
	case 0:
		/* the child handles SIGHUP on its own */
		unblock_signals();
		return daemon_main();
	}

	/* returns at once if a SIGHUP is already pending */
	wait_sighup();
	unblock_signals();
	reload_config();
}
```