.\" smarc.7 was written by Omar Polo <op@openbsd.org> and is placed in
.\" the public domain. The author hereby disclaims copyright to this
.\" source code.
.Dd May 5, 2023
.Dt SMARC 7
.Os
.Sh NAME
.Nm smarc
.Nd mailing list web archive generation system
.Sh DESCRIPTION
.Nm
is a system to incrementally generate a web archive for a mailing list
and optionally provide search capabilities.
The generated archive is a set of HTML files that can be served as-is,
while searching requires the use of a FastCGI server.
At a high level,
.Nm
is made of three components:
.Pp
.Bl -tag -width msearchd_8_ -compact -offset indent
.It Xr smarc 1
to generate the static HTML files from a maildir.
.It Xr smingest 1
to import emails into a sqlite3 database.
.It Xr msearchd 8
to provide search results.
.El
.Sh INITIAL SETUP
Several steps are necessary to initialize the web archive:
.Pp
.Bl -enum -compact -offset indent
.It
Create and populate the output directory.
.It
Customize the templates.
.It
Prepare the maildir.
.It
Generate the web archive.
.It
Set up the database for searching.
.It
Configure the web server.
.El
.Pp
It is recommended to use a dedicated user.
Commands to be run as an unprivileged user are preceded by a dollar sign
.Sq $ ,
while commands requiring superuser privileges are preceded by a hash mark
.Sq # .
Hereafter, it will be assumed that the local user is called
.Sq smarc .
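.Pp
As a minimal sketch, assuming an
.Ox
system where
.Xr useradd 8
is available, the dedicated user could be created with:
.Pp
.Dl # useradd -m smarc
.Pp
The
.Fl m
flag creates the home directory, which is where the maildir is assumed
to live later in this page.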
.Ss 1. Create and populate the output directory
The web archive is made of several static files, mostly HTML, that need
to be served by a web server like
.Xr httpd 8 .
.Pa /var/www/smarc
is the default location, but a different path can be used.
To prepare it, issue:
.Bd -literal -offset indent
# mkdir -p /var/www/smarc
# chown smarc /var/www/smarc
.Ed
.Pp
Then copy the CSS file, optionally tweaking it.
.Pp
.Dl $ cp /usr/local/share/examples/smarc/style.css /var/www/smarc
.Pp
Any other assets
.Pq e.g.\& logo images
need to be copied here as well.
.Ss 2. Customize the templates
The default templates are installed at
.Pa /etc/smarc .
Since these are generic, they need to be tweaked to include
information about the mailing list.
.Pp
Care should be taken when editing these files after generating the
archive since existing pages won't be automatically updated.
The cachedir
.Pq see Xr smarc 1
needs to be deleted and the web archive generated again.
.Xr msearchd 8
has to be stopped and restarted as well.
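.Pp
As a sketch of that procedure, assuming a hypothetical cachedir at
.Pa ~/.cache/smarc
and that
.Xr msearchd 8
is managed with
.Xr rcctl 8 ,
the steps could look like:
.Bd -literal -offset indent
$ rm -rf ~/.cache/smarc
$ smarc -c ~/.cache/smarc -m ~/Mail/smarc -o /var/www/smarc
# rcctl restart msearchd
.Ed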
.Ss 3. Prepare the maildir
The maildir with the mailing list entries needs to be prepared.
It is assumed to be at
.Pa ~/Mail/smarc
by default, but a different path can be used.
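.Pp
If the maildir does not exist yet, it can be created by hand.
A minimal sketch, assuming the default path above; a maildir is a
directory with the three subdirectories
.Pa cur ,
.Pa new
and
.Pa tmp :
.Bd -literal -offset indent
# Assumed location; adjust if a different maildir path is used.
maildir="$HOME/Mail/smarc"
mkdir -p "$maildir/cur" "$maildir/new" "$maildir/tmp"
chmod 700 "$maildir"
.Ed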
.Ss 4. Generate the web archive
.Xr smarc 1
can finally be used to generate the web archive.
The first run may take a while, depending on the size of the maildir,
while subsequent runs will be incremental and take less time.
.Pp
.Dl $ smarc -m path/to/maildir -o path/to/outdir
.Pp
On multi-processor machines, multiple processes may be used to save
some time with the
.Xr smarc 1 Fl j No flag.
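.Pp
For instance, assuming
.Fl j
takes the number of processes as its argument
.Pq see Xr smarc 1 ,
a run with four processes might look like:
.Pp
.Dl $ smarc -j 4 -m path/to/maildir -o path/to/outdir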
.Pp
The generated files may be compressed to save bandwidth:
.Pp
.Dl $ gzip -krq /var/www/smarc </dev/null 2>/dev/null
.Ss 5. Set up the database for searching
This step is optional, but suggested.
.Pp
.Xr msearchd 8
offers full-text search capabilities using a sqlite3 database that has to
be populated with
.Xr smingest 1 .
First, create a directory in the
.Pa /var/www
.Xr chroot 8
jail:
.Bd -literal -offset indent
# mkdir -p /var/www/msearchd
# chown smarc /var/www/msearchd
.Ed
.Pp
Then, populate the database with all emails in the maildir:
.Bd -literal -offset indent
$ sqlite3 /var/www/msearchd/mails.sqlite3 \e
	</usr/local/share/examples/smarc/schema.sql
$ mlist ~/Mail/smarc | smingest /var/www/msearchd/mails.sqlite3
.Ed
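.Pp
As a quick sanity check, the
.Xr sqlite3 1
shell can list the tables created by the schema; the exact names depend on
.Pa schema.sql
and are not assumed here:
.Pp
.Dl $ sqlite3 /var/www/msearchd/mails.sqlite3 .tables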
.Pp
At this point,
.Xr msearchd 8
can be started.
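.Pp
On
.Ox ,
assuming the package installed an
.Xr rc.d 8
script named
.Sq msearchd ,
the daemon could be enabled and started with:
.Bd -literal -offset indent
# rcctl enable msearchd
# rcctl start msearchd
.Ed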
.Ss 6. Configure the web server
The web server needs to serve the contents of the outdir as-is and
handle the requests for
.Pa /search
via the
.Xr msearchd 8
FastCGI server.
A sample
.Xr httpd 8
configuration is provided here for reference:
.Bd -literal -offset indent
server "marc.example.com" {
	listen on * port 80
	root "/smarc"
	gzip-static

	# leave out when not using msearchd(8)
	location "/search" {
		fastcgi socket "/run/msearchd.sock"
	}
}
.Ed
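.Pp
Before reloading the web server it may be worth validating the
configuration.
A short sketch, assuming the block above was added to the default
.Pa /etc/httpd.conf :
.Bd -literal -offset indent
# httpd -n
# rcctl reload httpd
.Ed
.Pp
Note that
.Xr httpd 8
resolves paths relative to its
.Pa /var/www
chroot, which is why the document root is given as
.Pa /smarc .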
.Sh HANDLING NEW MESSAGES
New messages should be fetched periodically using tools like
.Xr fdm 1
or
.Xr mbsync 1 ,
the database updated with
.Xr smingest 1
and the web archive refreshed using
.Xr smarc 1 .
.Pp
It is recommended to create a script like the following and schedule
its execution periodically with
.Xr cron 8 :
.Bd -literal -offset indent
#!/bin/sh

set -e
fdm -l fetch
minc ~/Mail/smarc | smingest /var/www/msearchd/mails.sqlite3
smarc
gzip -krq /var/www/smarc/ </dev/null 2>/dev/null || true
.Ed
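.Pp
As an illustration, assuming the script above was saved as the
hypothetical
.Pa /home/smarc/bin/update-archive
and made executable, an entry in the
.Xr crontab 5
of the
.Sq smarc
user to run it every ten minutes could be:
.Bd -literal -offset indent
*/10 * * * * /home/smarc/bin/update-archive
.Ed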
.Pp
If
.Xr msearchd 8
is not used,
new messages still need to be incorporated
.Po i.e.\& moved from
.Pa new/
to
.Pa cur/
.Pc
but no database has to be updated.
In that case simplify the
.Xr minc 1
invocation as:
.Pp
.Dl minc -q ~/Mail/smarc
.Pp
and don't call
.Xr smingest 1
at all.
.Sh HANDLING MULTIPLE MAILING LISTS
If archives for multiple mailing lists need to be served from the
same machine, care must be taken to use different directories and
database files to avoid mixing messages.
.Pp
.Xr msearchd 8
handles only one database at a time, so multiple instances need to be
run, each pointing at the database for only one mailing list.
A different FastCGI socket path needs to be used for each instance.
.Pp
The
.Xr smarc 1
outdir, maildir and cachedir must be unique for each mailing list,
i.e.\& the
.Fl c , Fl m No and Fl o
flags must always be provided.
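.Pp
A sketch of what this might look like for two hypothetical lists,
.Sq tech
and
.Sq misc ,
each with its own cachedir, maildir and outdir
.Pq all paths are illustrative :
.Bd -literal -offset indent
#!/bin/sh

set -e
# One smarc run per list, with no directory shared between lists.
for list in tech misc; do
	smarc -c "$HOME/.cache/smarc-$list" -m "$HOME/Mail/$list" \e
	    -o "/var/www/smarc-$list"
done
.Ed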
.Pp
Very likely, each mailing list will need its own set of templates, so
those need to be prepared and both
.Xr smarc 1
and
.Xr msearchd 8
have to be pointed at the right template directory.
.Sh SEE ALSO
.Xr minc 1 ,
.Xr smarc 1 ,
.Xr smingest 1 ,
.Xr sqlite3 1 ,
.Xr httpd 8 ,
.Xr msearchd 8