.\" gotmarc.7 was written by Omar Polo <op@openbsd.org> and is placed in
.\" the public domain. The author hereby disclaims copyright to this
.\" source code.
.Dd May 5, 2023
.Dt GOTMARC 7
.Os
.Sh NAME
.Nm gotmarc
.Nd mailing list web archive generation system
.Sh DESCRIPTION
.Nm
is a system to incrementally generate a web archive for a mailing list
and optionally provide search capabilities.
The generated archive is a set of HTML files that can be served as-is,
while searching requires the use of a FastCGI server.
At a high level,
.Nm
is composed of three main components:
.Pp
.Bl -tag -width msearchd_8_ -compact -offset indent
.It Xr gotmarc 1
to generate the static HTML files from a maildir.
.It Xr gmimport 1
to import emails into a sqlite3 database.
.It Xr msearchd 8
to provide search results.
.El
.Sh INITIAL SETUP
There are several steps necessary to initialize the web archive:
.Pp
.Bl -enum -compact -offset indent
.It
Create and populate the output directory.
.It
Customize the templates.
.It
Prepare the maildir.
.It
Generate the web archive.
.It
Set up the database for searching.
.It
Configure the web server.
.El
.Pp
It is recommended to use a dedicated user.
Commands to be run as an unprivileged user are preceded by a dollar sign
.Sq $ ,
while commands requiring superuser privileges are preceded by a hash mark
.Sq # .
Hereafter, it will be assumed that the local user is called
.Sq gotmarc .
.Ss 1. Create and populate the output directory
The web archive is made of several static files, mostly HTML, that need
to be served by a web server like
.Xr httpd 8 .
.Pa /var/www/gotmarc
is the default location, but a different path can be used.
To prepare it, issue:
.Bd -literal -offset indent
# mkdir -p /var/www/gotmarc
# chown gotmarc /var/www/gotmarc
.Ed
.Pp
Then copy the CSS file, optionally tweaking it.
.Pp
.Dl $ cp /usr/local/share/examples/gotmarc/style.css /var/www/gotmarc
.Pp
Any other assets
.Pq e.g.\& logo images
need to be copied here as well.
.Ss 2. Customize the templates
The default templates are installed at
.Pa /etc/gotmarc .
Since these are generic, they need to be tweaked to include
information about the mailing list.
.Pp
Care should be taken when editing these files after generating the
archive since existing pages won't be re-generated.
The outdir and cachedir
.Pq see Xr gotmarc 1
need to be deleted and the web archive generated again.
.Xr msearchd 8
has to be stopped and restarted as well.
.Ss 3. Prepare the maildir
The maildir with the mailing list entries needs to be prepared.
It is assumed to be at
.Pa ~/Mail/gotmarc
by default, but a different path can be used.
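.Pp
If no maildir exists yet, one way to prepare an empty one is to create
the three standard maildir subdirectories by hand.
The following is only a minimal sketch;
.Sq make_maildir
is a hypothetical helper name used for illustration, not a command
shipped with gotmarc.

```shell
# Sketch: create an empty maildir at the given path.
# make_maildir is a hypothetical helper, not part of gotmarc.
make_maildir() {
	# a maildir consists of the cur, new and tmp subdirectories
	mkdir -p "$1/cur" "$1/new" "$1/tmp"
}
```

For example,
.Ql make_maildir ~/Mail/gotmarc
would create the layout at the default location mentioned above.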
.Ss 4. Generate the web archive
.Xr gotmarc 1
can finally be used to generate the web archive.
The first run may take a while, depending on the size of the maildir,
while subsequent runs will be incremental and take less time.
.Pp
.Dl $ gotmarc -m path/to/maildir -o path/to/outdir
.Pp
On multi-processor machines, multiple processes may be used to save some
time with the
.Xr gotmarc 1 Fl j No flag.
.Pp
The generated files may be compressed to save bandwidth:
.Pp
.Dl $ gzip -krq /var/www/gotmarc </dev/null 2>/dev/null
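.Pp
Since the archive is refreshed incrementally, some setups may prefer to
recompress only the files whose compressed copy is missing or older
than the source.
The following is a possible alternative to the command above, sketched
under that assumption;
.Sq compress_outdir
is a hypothetical helper name and not part of gotmarc.

```shell
# Sketch: refresh the .gz copy of every regular file under a directory
# when the copy is missing or older than its source.
# compress_outdir is a hypothetical helper, not part of gotmarc.
compress_outdir() {
	find "$1" -type f ! -name '*.gz' | while IFS= read -r f; do
		# -nt is not in POSIX test(1) but is widely supported
		if [ ! -e "$f.gz" ] || [ "$f" -nt "$f.gz" ]; then
			# write the compressed copy, keeping the original file
			gzip -c "$f" >"$f.gz"
		fi
	done
}
```

It would be called with the outdir as its only argument, for example
.Ql compress_outdir /var/www/gotmarc .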
.Ss 5. Set up the database for searching
This step is recommended but optional.
.Pp
.Xr msearchd 8
offers full text search capabilities using a sqlite3 database that has to
be populated with
.Xr gmimport 1 .
First, create a directory in the
.Pa /var/www
.Xr chroot 8
jail:
.Bd -literal -offset indent
# mkdir -p /var/www/msearchd
# chown gotmarc /var/www/msearchd
.Ed
.Pp
Then, populate the database with all emails in the maildir:
.Bd -literal -offset indent
$ sqlite3 /var/www/msearchd/mails.sqlite3 \e
	</usr/local/share/examples/gotmarc/schema.sql
$ mlist ~/Mail/gotmarc | gmimport /var/www/msearchd/mails.sqlite3
.Ed
.Pp
At this point,
.Xr msearchd 8
can be started.
.Ss 6. Configure the web server
The web server needs to serve the contents of the outdir as-is and
handle the requests for
.Pa /search
via the
.Xr msearchd 8
FastCGI server.
A sample
.Xr httpd 8
configuration is provided here for reference:
.Bd -literal -offset indent
server "marc.example.com" {
	listen on * port 80
	root "/gotmarc"
	gzip-static

	# leave out when not using msearchd(8)
	location "/search" {
		fastcgi socket "/run/msearchd.sock"
	}
}
.Ed
.Sh HANDLING NEW MESSAGES
New messages should be fetched periodically using tools like
.Xr fdm 1
or
.Xr mbsync 1 ,
the database updated with
.Xr gmimport 1
and the web archive refreshed using
.Xr gotmarc 1 .
.Pp
It is recommended to create a script like the following and schedule
its execution periodically with
.Xr cron 8 :
.Bd -literal -offset indent
#!/bin/sh

set -e
fdm -l fetch
minc ~/Mail/gotmarc | gmimport /var/www/msearchd/mails.sqlite3
gotmarc
gzip -krq /var/www/gotmarc/ </dev/null 2>/dev/null || true
.Ed
.Pp
If
.Xr msearchd 8
is not used,
new messages still need to be incorporated
.Po i.e.\& moved from
.Pa new/
to
.Pa cur/
.Pc
but no database has to be updated.
In that case simplify the
.Xr minc 1
invocation as:
.Pp
.Dl minc -q ~/Mail/gotmarc
.Pp
and don't call
.Xr gmimport 1
at all.
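.Pp
Put together, the periodic job for an archive without search could look
roughly like the following sketch.
It reuses the commands and default paths from the script above, minus
.Xr gmimport 1 ;
.Sq refresh_archive
is a hypothetical function name chosen for illustration.

```shell
# Sketch of the periodic job for an archive without msearchd(8).
# refresh_archive is a hypothetical name; the commands and paths are
# the ones from the cron script above, without gmimport.
refresh_archive() {
	fdm -l fetch || return 1
	# incorporate new messages (new/ -> cur/) without listing them
	minc -q ~/Mail/gotmarc || return 1
	gotmarc || return 1
	# compression is best-effort, as in the original script
	gzip -krq /var/www/gotmarc/ </dev/null 2>/dev/null || true
}
```

A script run from
.Xr cron 8
would define the function and then call
.Ql refresh_archive
as its last line.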
.Sh HANDLING MULTIPLE MAILING LISTS
If the archives for multiple mailing lists need to be served from the
same box, care must be taken to use different directories and database
files to avoid mixing messages.
.Pp
.Xr msearchd 8
handles only one database at a time, so multiple instances need to be
run, each pointing at the database for only one mailing list.
A different FastCGI socket path needs to be used per instance.
.Pp
.Xr gotmarc 1
outdir, maildir and cachedir must be unique per mailing list, i.e.\& the
.Fl c , Fl m No and Fl o
flags must always be provided.
.Pp
Very likely, each mailing list will need its own set of templates, so
those need to be prepared and both
.Xr gotmarc 1
and
.Xr msearchd 8
have to be pointed at the right template directory.
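.Pp
As an illustration of keeping everything separate, the refresh job
could be written once and run per list, as in the sketch below.
All directory layouts, database paths and the function names
.Sq refresh_list
and
.Sq refresh_lists
are assumptions made for the example; only the
.Fl c , Fl m No and Fl o
flags come from the text above.
Selecting the template directory is deliberately left out; see
.Xr gotmarc 1
and
.Xr msearchd 8
for how to do that.

```shell
# Sketch: refresh several lists, each with its own maildir, outdir,
# cachedir and database.  All paths and names are hypothetical.
refresh_list() {
	list=$1
	maildir=$HOME/Mail/$list
	outdir=/var/www/marc-$list
	cachedir=$HOME/.cache/gotmarc-$list
	db=/var/www/msearchd-$list/mails.sqlite3

	# import new messages into this list's own database
	minc "$maildir" | gmimport "$db" || return 1
	# always pass -c, -m and -o so that lists never share state
	gotmarc -c "$cachedir" -m "$maildir" -o "$outdir"
}

refresh_lists() {
	for l in "$@"; do
		refresh_list "$l" || return 1
	done
}
```

For two lists named
.Sq foo
and
.Sq bar ,
the job would call
.Ql refresh_lists foo bar .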
.Sh SEE ALSO
.Xr gmimport 1 ,
.Xr gotmarc 1 ,
.Xr minc 1 ,
.Xr sqlite3 1 ,
.Xr httpd 8 ,
.Xr msearchd 8