1 8b4d5615 2023-05-05 op .\" gotmarc.7 was written by Omar Polo <op@openbsd.org> and is placed in
2 8b4d5615 2023-05-05 op .\" the public domain. The author hereby disclaims copyright to this
3 8b4d5615 2023-05-05 op .\" source code.
4 8b4d5615 2023-05-05 op .Dd May 5, 2023
9 8b4d5615 2023-05-05 op .Nd mailing list web archive generation system
10 8b4d5615 2023-05-05 op .Sh DESCRIPTION
12 8b4d5615 2023-05-05 op is a system to incrementally generate a web archive for a mailing list
13 8b4d5615 2023-05-05 op and optionally provide search capabilities.
14 8b4d5615 2023-05-05 op The generated archive is a set of HTML files that can be served as-is,
15 8b4d5615 2023-05-05 op while searching requires the use of a FastCGI server.
16 8b4d5615 2023-05-05 op At a higher level,
18 8b4d5615 2023-05-05 op is composed of three main components:
20 8b4d5615 2023-05-05 op .Bl -tag -width msearchd_8_ -compact -offset indent
21 8b4d5615 2023-05-05 op .It Xr gotmarc 1
22 8b4d5615 2023-05-05 op to generate the static HTML files from a maildir.
23 8b4d5615 2023-05-05 op .It Xr gmimport 1
24 8b4d5615 2023-05-05 op to import emails into a sqlite3 database.
25 8b4d5615 2023-05-05 op .It Xr msearchd 8
26 8b4d5615 2023-05-05 op to provide search results.
28 8b4d5615 2023-05-05 op .Sh INITIAL SETUP
29 8b4d5615 2023-05-05 op There are several steps necessary to initialize the web archive:
31 8b4d5615 2023-05-05 op .Bl -enum -compact -offset indent
33 8b4d5615 2023-05-05 op Create and populate the output directory.
35 8b4d5615 2023-05-05 op Customize the templates.
37 8b4d5615 2023-05-05 op Prepare the maildir.
39 8b4d5615 2023-05-05 op Generate the web archive.
41 8b4d5615 2023-05-05 op Set up the database for searching.
43 8b4d5615 2023-05-05 op Configure the web server.
46 8b4d5615 2023-05-05 op It is recommended to use a dedicated user.
47 8b4d5615 2023-05-05 op Commands to be run as an unprivileged user are preceded by a dollar sign
49 8b4d5615 2023-05-05 op while commands requiring superuser privileges by a hash mark
51 8b4d5615 2023-05-05 op Hereafter, it will be assumed that the local user is called
53 8b4d5615 2023-05-05 op .Ss 1. Create and populate the output directory
54 8b4d5615 2023-05-05 op The web archive is made of several static files, mostly HTML, that need
55 8b4d5615 2023-05-05 op to be served by a web server like
57 8b4d5615 2023-05-05 op .Pa /var/www/gotmarc
58 8b4d5615 2023-05-05 op is the default location, but a different path can be used.
59 8b4d5615 2023-05-05 op To prepare it, issue:
60 8b4d5615 2023-05-05 op .Bd -literal -offset indent
61 8b4d5615 2023-05-05 op # mkdir -p /var/www/gotmarc
62 8b4d5615 2023-05-05 op # chown gotmarc /var/www/gotmarc
65 8b4d5615 2023-05-05 op Then copy the CSS file, optionally tweaking it.
67 8b4d5615 2023-05-05 op .Dl $ cp /usr/local/share/examples/gotmarc/style.css /var/www/gotmarc
69 8b4d5615 2023-05-05 op Any other assets
70 8b4d5615 2023-05-05 op .Pq e.g.\& logo images
71 8b4d5615 2023-05-05 op need to be copied here as well.
72 8b4d5615 2023-05-05 op .Ss 2. Customize the templates
73 8b4d5615 2023-05-05 op The default templates are installed at
74 8b4d5615 2023-05-05 op .Pa /etc/gotmarc .
75 8b4d5615 2023-05-05 op Since these are anonymous, they need to be tweaked to include
76 8b4d5615 2023-05-05 op information about the mailing list.
78 8b4d5615 2023-05-05 op Care should be taken when editing these files after generating the
79 8b4d5615 2023-05-05 op archive since existing pages won't be re-generated.
80 8b4d5615 2023-05-05 op The outdir and cachedir
81 8b4d5615 2023-05-05 op .Pq see Xr gotmarc 1
82 8b4d5615 2023-05-05 op need to be deleted and the web archive generated again.
83 8b4d5615 2023-05-05 op .Xr msearchd 8
84 8b4d5615 2023-05-05 op has to be stopped and restarted as well.
85 8b4d5615 2023-05-05 op .Ss 3. Prepare the maildir
86 8b4d5615 2023-05-05 op The maildir with the mailing list entries needs to be prepared.
87 8b4d5615 2023-05-05 op It is assumed to be at
88 8b4d5615 2023-05-05 op .Pa ~/Mail/gotmarc
89 8b4d5615 2023-05-05 op by default, but a different path can be used.
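As a minimal sketch of what preparing an empty maildir can look like: a
maildir is a directory that holds the three subdirectories cur, new and tmp.
The helper name make_maildir is hypothetical and not part of gotmarc; an
existing maildir synced from elsewhere needs no such step.

```shell
# Hypothetical helper, not shipped with gotmarc: create an empty maildir.
# A maildir is a directory containing the subdirectories cur, new and tmp.
make_maildir() {
	mkdir -p "$1/cur" "$1/new" "$1/tmp"
}
```

For the default location it could be called as make_maildir ~/Mail/gotmarc
by the dedicated user.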
90 8b4d5615 2023-05-05 op .Ss 4. Generate the web archive
92 8b4d5615 2023-05-05 op can finally be used to generate the web archive.
93 8b4d5615 2023-05-05 op The first run may take a while, depending on the size of the maildir,
94 8b4d5615 2023-05-05 op while subsequent runs will be incremental and take less time.
96 8b4d5615 2023-05-05 op .Dl $ gotmarc -m path/to/maildir -o path/to/outdir
98 8b4d5615 2023-05-05 op On multi-processor machines multiple processes may be used to save some
100 8b4d5615 2023-05-05 op .Xr gotmarc 1 Fl j No flag.
102 8b4d5615 2023-05-05 op The generated files may be compressed to save bandwidth:
104 8b4d5615 2023-05-05 op .Dl $ gzip -krq /var/www/gotmarc </dev/null 2>/dev/null
105 8b4d5615 2023-05-05 op .Ss 5. Set up the database for searching
106 8b4d5615 2023-05-05 op This step is optional but recommended.
108 8b4d5615 2023-05-05 op .Xr msearchd 8
109 8b4d5615 2023-05-05 op offers full text search capabilities using a sqlite3 database that has to
110 8b4d5615 2023-05-05 op be populated with
111 8b4d5615 2023-05-05 op .Xr gmimport 1 .
112 8b4d5615 2023-05-05 op First, create a directory in the
116 8b4d5615 2023-05-05 op .Bd -literal -offset indent
117 8b4d5615 2023-05-05 op # mkdir -p /var/www/msearchd
118 8b4d5615 2023-05-05 op # chown gotmarc /var/www/msearchd
121 8b4d5615 2023-05-05 op Then, populate the database with all emails in the maildir:
122 8b4d5615 2023-05-05 op .Bd -literal -offset indent
123 8b4d5615 2023-05-05 op $ sqlite3 /var/www/msearchd/mails.sqlite3 \e
124 8b4d5615 2023-05-05 op </usr/local/share/examples/gotmarc/schema.sql
125 8b4d5615 2023-05-05 op $ mlist ~/Mail/gotmarc | gmimport /var/www/msearchd/mails.sqlite3
128 8b4d5615 2023-05-05 op At this point,
129 8b4d5615 2023-05-05 op .Xr msearchd 8
130 8b4d5615 2023-05-05 op can be started.
131 8b4d5615 2023-05-05 op .Ss 6. Configure the web server
132 8b4d5615 2023-05-05 op The web server needs to serve the contents of the outdir as-is and
133 8b4d5615 2023-05-05 op handle the requests for
136 8b4d5615 2023-05-05 op .Xr msearchd 8
137 8b4d5615 2023-05-05 op FastCGI server.
140 8b4d5615 2023-05-05 op configuration is provided here for reference:
141 8b4d5615 2023-05-05 op .Bd -literal -offset indent
142 8b4d5615 2023-05-05 op server "marc.example.com" {
143 8b4d5615 2023-05-05 op listen on * port 80
144 8b4d5615 2023-05-05 op root "/gotmarc"
147 8b4d5615 2023-05-05 op # leave out when not using msearchd(8)
148 8b4d5615 2023-05-05 op location "/search" {
149 8b4d5615 2023-05-05 op fastcgi socket "/run/msearchd.sock"
153 8b4d5615 2023-05-05 op .Sh HANDLING NEW MESSAGES
154 8b4d5615 2023-05-05 op New messages should be fetched periodically using tools like
157 8b4d5615 2023-05-05 op .Xr mbsync 1 ,
158 8b4d5615 2023-05-05 op the database updated with
159 8b4d5615 2023-05-05 op .Xr gmimport 1
160 8b4d5615 2023-05-05 op and the web archive refreshed using
161 8b4d5615 2023-05-05 op .Xr gotmarc 1 .
163 8b4d5615 2023-05-05 op It is recommended to create a script like the following and schedule
164 8b4d5615 2023-05-05 op its execution periodically with
166 8b4d5615 2023-05-05 op .Bd -literal -offset indent
171 8b4d5615 2023-05-05 op minc ~/Mail/gotmarc | gmimport /var/www/msearchd/mails.sqlite3
173 8b4d5615 2023-05-05 op gzip -krq /var/www/gotmarc/ </dev/null 2>/dev/null || true
177 8b4d5615 2023-05-05 op .Xr msearchd 8
179 8b4d5615 2023-05-05 op new messages still need to be incorporated
180 8b4d5615 2023-05-05 op .Po i.e.\& moved from
185 8b4d5615 2023-05-05 op but no database has to be updated.
186 8b4d5615 2023-05-05 op In that case simplify the
188 8b4d5615 2023-05-05 op invocation as:
190 8b4d5615 2023-05-05 op .Dl minc -q ~/Mail/gotmarc
192 8b4d5615 2023-05-05 op and don't call
193 8b4d5615 2023-05-05 op .Xr gmimport 1
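The two variants of the refresh cycle can be summarized with a small dry-run
sketch that only prints the commands instead of running them.
The function name refresh_cmds and its yes/no argument are hypothetical; the
paths are the defaults used in this page, the explicit -m and -o flags are
spelled out for clarity, and the step that fetches new mail is left out.

```shell
# Hypothetical dry-run helper: print the commands for one refresh cycle.
# $1 is "yes" when the msearchd(8) database is in use, anything else otherwise.
refresh_cmds() {
	if [ "$1" = yes ]; then
		# incorporate new messages and feed them to the database
		echo 'minc ~/Mail/gotmarc | gmimport /var/www/msearchd/mails.sqlite3'
	else
		# only incorporate new messages, there is no database to update
		echo 'minc -q ~/Mail/gotmarc'
	fi
	# refresh the static archive, then recompress it
	echo 'gotmarc -m ~/Mail/gotmarc -o /var/www/gotmarc'
	echo 'gzip -krq /var/www/gotmarc/ </dev/null 2>/dev/null || true'
}

refresh_cmds yes
```

Printing the commands first makes it easy to review them before putting the
real invocations in the scheduled script.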
195 8b4d5615 2023-05-05 op .Sh HANDLING MULTIPLE MAILING LISTS
196 8b4d5615 2023-05-05 op If the archives for multiple mailing lists need to be served from the
197 8b4d5615 2023-05-05 op same box, care must be taken to use different directories and database
198 8b4d5615 2023-05-05 op files to avoid mixing messages.
200 8b4d5615 2023-05-05 op .Xr msearchd 8
201 8b4d5615 2023-05-05 op handles only one database at a time, so multiple instances need to be
202 8b4d5615 2023-05-05 op run, each pointing at the database for only one mailing list.
203 8b4d5615 2023-05-05 op A different FastCGI socket path needs to be used for each instance.
205 8b4d5615 2023-05-05 op .Xr gotmarc 1
206 8b4d5615 2023-05-05 op outdir, maildir and cachedir must be unique per-mailing list, i.e.\& the
207 8b4d5615 2023-05-05 op .Fl c , Fl m No and Fl o
208 8b4d5615 2023-05-05 op flags must always be provided.
210 8b4d5615 2023-05-05 op Very likely, each mailing list will need its own set of templates, so
211 8b4d5615 2023-05-05 op those need to be prepared and both
212 8b4d5615 2023-05-05 op .Xr gotmarc 1
214 8b4d5615 2023-05-05 op .Xr msearchd 8
215 8b4d5615 2023-05-05 op have to be pointed at the right template directory.
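One way to keep the per-list directories apart is to derive all three from
the list name.
The following dry-run sketch prints one gotmarc invocation per list; the list
names dev and users, the three root directories and the function name
gotmarc_cmd are assumptions made for illustration, and the template directory
is deliberately left out.

```shell
# Hypothetical layout: every list gets its own subdirectory under each root.
MAIL_ROOT=${MAIL_ROOT:-$HOME/Mail}
CACHE_ROOT=${CACHE_ROOT:-$HOME/.cache/gotmarc}
OUT_ROOT=${OUT_ROOT:-/var/www/gotmarc}

# Print the gotmarc(1) command line for the list named in $1, with a
# dedicated cachedir (-c), maildir (-m) and outdir (-o).
gotmarc_cmd() {
	printf 'gotmarc -c %s -m %s -o %s\n' \
	    "$CACHE_ROOT/$1" "$MAIL_ROOT/$1" "$OUT_ROOT/$1"
}

# Example list names; replace with the real ones.
for list in dev users; do
	gotmarc_cmd "$list"
done
```

Since every path is built from the list name, two lists can only collide if
they share a name.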
217 8b4d5615 2023-05-05 op .Xr gmimport 1 ,
218 8b4d5615 2023-05-05 op .Xr gotmarc 1 ,
220 8b4d5615 2023-05-05 op .Xr sqlite3 1 ,
221 8b4d5615 2023-05-05 op .Xr httpd 8 ,
222 8b4d5615 2023-05-05 op .Xr msearchd 8