.\" smarc.7 was written by Omar Polo <op@openbsd.org> and is placed in
.\" the public domain. The author hereby disclaims copyright to this
.\" source code.
.Dd May 5, 2023
.Dt SMARC 7
.Os
.Sh NAME
.Nm smarc
.Nd mailing list web archive generation system
.Sh DESCRIPTION
.Nm
is a system to incrementally generate a web archive for a mailing list
and optionally provide search capabilities.
The generated archive is a set of HTML files that can be served as-is,
while searching requires the use of a FastCGI server.
At a higher level,
.Nm
is made of three components:
.Pp
.Bl -tag -width msearchd_8_ -compact -offset indent
.It Xr smarc 1
to generate the static HTML files from a maildir.
.It Xr smingest 1
to import emails into a sqlite3 database.
.It Xr msearchd 8
to provide search results.
.El
.Sh INITIAL SETUP
Several steps are necessary to initialize the web archive:
.Pp
.Bl -enum -compact -offset indent
.It
Create and populate the output directory.
.It
Customize the templates.
.It
Prepare the maildir.
.It
Generate the web archive.
.It
Set up the database for searching.
.It
Configure the web server.
.El
.Pp
It is recommended to use a dedicated user.
Commands to be run as an unprivileged user are preceded by a dollar sign
.Sq $ ,
while commands requiring superuser privileges are preceded by a hash mark
.Sq # .
Hereafter, it will be assumed that the local user is called
.Sq smarc .
.Ss 1. Create and populate the output directory
The web archive is made of several static files, mostly HTML, that need
to be served by a web server like
.Xr httpd 8 .
.Pa /var/www/smarc
is the default location, but a different path can be used.
To prepare it, issue:
.Bd -literal -offset indent
# mkdir -p /var/www/smarc
# chown smarc /var/www/smarc
.Ed
.Pp
Then copy the CSS file, optionally tweaking it.
.Pp
.Dl $ cp /usr/local/share/examples/smarc/style.css /var/www/smarc
.Pp
Any other assets
.Pq e.g.\& logo images
need to be copied here as well.
.Ss 2. Customize the templates
The default templates are installed at
.Pa /etc/smarc .
Since these are generic, they need to be tweaked to include
information about the mailing list.
.Pp
Care should be taken when editing these files after generating the
archive since existing pages won't be automatically updated.
The cachedir
.Pq see Xr smarc 1
needs to be deleted and the web archive generated again.
.Xr msearchd 8
has to be stopped and restarted as well.
.Ss 3. Prepare the maildir
The maildir with the mailing list entries needs to be prepared.
It is assumed to be at
.Pa ~/Mail/smarc
by default, but a different path can be used.
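.Pp
For orientation, a maildir is a directory holding three subdirectories,
.Pa tmp/ ,
.Pa new/
and
.Pa cur/ .
Assuming the default path, the expected layout is roughly:
.Bd -literal -offset indent
~/Mail/smarc/
	cur/	messages already incorporated
	new/	messages delivered but not yet incorporated
	tmp/	messages still being delivered
.Ed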
.Ss 4. Generate the web archive
.Xr smarc 1
can finally be used to generate the web archive.
The first run may take a while, depending on the size of the maildir,
while subsequent runs will be incremental and take less time.
.Pp
.Dl $ smarc -m path/to/maildir -o path/to/outdir
.Pp
On multi-processor machines multiple processes may be used to save some
time with the
.Xr smarc 1 Fl j No flag.
.Pp
The generated files may be compressed to save bandwidth:
.Pp
.Dl $ gzip -krq /var/www/smarc </dev/null 2>/dev/null
.Ss 5. Set up the database for searching
This step is optional but recommended.
.Pp
.Xr msearchd 8
offers full text search capabilities using a sqlite3 database that has to
be populated with
.Xr smingest 1 .
First, create a directory in the
.Pa /var/www
.Xr chroot 8
jail:
.Bd -literal -offset indent
# mkdir -p /var/www/msearchd
# chown smarc /var/www/msearchd
.Ed
.Pp
Then, populate the database with all emails in the maildir:
.Bd -literal -offset indent
$ sqlite3 /var/www/msearchd/mails.sqlite3 \e
	</usr/local/share/examples/smarc/schema.sql
$ mlist ~/Mail/smarc | smingest /var/www/msearchd/mails.sqlite3
.Ed
.Pp
At this point,
.Xr msearchd 8
can be started.
.Ss 6. Configure the web server
The web server needs to serve the contents of the outdir as-is and
handle the requests for
.Pa /search
via the
.Xr msearchd 8
FastCGI server.
A sample
.Xr httpd 8
configuration is provided here for reference:
.Bd -literal -offset indent
server "marc.example.com" {
	listen on * port 80
	root "/smarc"
	gzip-static

	# leave out when not using msearchd(8)
	location "/search" {
		fastcgi socket "/run/msearchd.sock"
	}
}
.Ed
.Sh HANDLING NEW MESSAGES
New messages should be fetched periodically using tools like
.Xr fdm 1
or
.Xr mbsync 1 ,
the database updated with
.Xr smingest 1
and the web archive refreshed using
.Xr smarc 1 .
.Pp
It is recommended to create a script like the following and schedule
its execution periodically with
.Xr cron 8 :
.Bd -literal -offset indent
#!/bin/sh

set -e
fdm -l fetch
minc ~/Mail/smarc | smingest /var/www/msearchd/mails.sqlite3
smarc
yes | gzip -krq /var/www/smarc/ 2>/dev/null || true
.Ed
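.Pp
As an illustration, assuming the script above is saved as the
hypothetical
.Pa /home/smarc/bin/update-archive
and made executable, a
.Xr crontab 5
entry for the
.Sq smarc
user that runs it every ten minutes could look like this
.Pq the path and the interval are placeholders to adapt :
.Bd -literal -offset indent
# minute hour day-of-month month day-of-week command
*/10 * * * * /home/smarc/bin/update-archive
.Ed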
.Pp
If
.Xr msearchd 8
is not used,
new messages still need to be incorporated
.Po i.e.\& moved from
.Pa new/
to
.Pa cur/
.Pc
but no database has to be updated.
In that case simplify the
.Xr minc 1
invocation as:
.Pp
.Dl minc -q ~/Mail/smarc
.Pp
and don't call
.Xr smingest 1
at all.
.Sh HANDLING MULTIPLE MAILING LISTS
If the archives for multiple mailing lists need to be served from the
same box, care must be taken to use different directories and database
files to avoid mixing messages.
.Pp
.Xr msearchd 8
handles only one database at a time, so multiple instances need to be
run, each pointing at the database for only one mailing list.
A different FastCGI socket path needs to be used for each instance.
.Pp
.Xr smarc 1
outdir, maildir and cachedir must be unique per mailing list, i.e.\& the
.Fl c , Fl m No and Fl o
flags must always be provided.
.Pp
Very likely, each mailing list will need its own set of templates, so
those need to be prepared and both
.Xr smarc 1
and
.Xr msearchd 8
have to be pointed at the right template directory.
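.Pp
As a sketch, assuming two hypothetical lists
.Sq foo
and
.Sq bar ,
each with its own outdir and its own
.Xr msearchd 8
instance listening on a separate socket, the
.Xr httpd 8
configuration could be structured as follows.
All host names, roots and socket paths are placeholders.
.Bd -literal -offset indent
# placeholder names: adapt hosts, roots and sockets
server "foo.example.com" {
	listen on * port 80
	root "/smarc-foo"
	gzip-static

	location "/search" {
		fastcgi socket "/run/msearchd-foo.sock"
	}
}

server "bar.example.com" {
	listen on * port 80
	root "/smarc-bar"
	gzip-static

	location "/search" {
		fastcgi socket "/run/msearchd-bar.sock"
	}
}
.Ed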
.Sh SEE ALSO
.Xr minc 1 ,
.Xr smarc 1 ,
.Xr smingest 1 ,
.Xr sqlite3 1 ,
.Xr httpd 8 ,
.Xr msearchd 8