.TH VENTI-FMT 8
.SH NAME
buildindex,
checkarenas,
checkindex,
conf,
fmtarenas,
fmtbloom,
fmtindex,
fmtisect,
syncindex \- prepare and maintain a venti server
.SH SYNOPSIS
.PP
.B venti/fmtarenas
[
.B -4Z
]
[
.B -a
.I arenasize
]
[
.B -b
.I blocksize
]
.I name
.I file
.PP
.B venti/fmtisect
[
.B -1Z
]
[
.B -b
.I bucketsize
]
.I name
.I file
.PP
.B venti/fmtbloom
[
.B -n
.I nblocks
|
.B -N
.I nhash
]
[
.B -s
.I size
]
.I file
.PP
.B venti/fmtindex
[
.B -a
]
.I venti.conf
.PP
.B venti/conf
[
.B -w
]
.I partition
[
.I configfile
]
.if t .sp 0.5
.PP
.B venti/buildindex
[
.B -B
.I blockcachesize
]
[
.B -Z
]
.I venti.conf
.I tmp
.PP
.B venti/checkindex
[
.B -f
]
[
.B -B
.I blockcachesize
]
.I venti.conf
.I tmp
.PP
.B venti/checkarenas
[
.B -afv
]
.I file
.PP
.B venti/copy
[
.B -f
]
.I src
.I dst
.I score
[
.I type
]
.SH DESCRIPTION
These commands aid in the setup, maintenance, and debugging of
venti servers.
See
.IR venti (7)
for an overview of the venti system and
.IR venti (8)
for an overview of the data structures used by the venti server.
.PP
Sizes given to the following commands are in bytes;
append
.LR k ,
.LR m ,
or
.LR g
to a size to specify kilobytes, megabytes, or gigabytes respectively.
.SS Formatting
To prepare a server for its initial use, the arena partitions and
the index sections must be formatted individually, with
.I fmtarenas
and
.IR fmtisect .
Then the
collection of index sections must be combined into a venti
index with
.IR fmtindex .
.PP
.I Fmtarenas
formats the given
.IR file ,
typically a disk partition, into an arena partition.
The arenas in the partition are given names of the form
.IR name%d ,
where
.I %d
is replaced with a sequential number starting at 0.
.PP
Options to
.I fmtarenas
are:
.TP
.BI -a " arenasize
The arenas are of
.I arenasize
bytes. The default is
.BR 512M ,
which was selected to provide a balance
between the number of arenas and the ability to copy an arena to external
media such as recordable CDs and tapes.
.TP
.BI -b " blocksize
The size, in bytes, for read and write operations to the file.
The size is recorded in the file, and is used by applications that access the arenas.
The default is
.BR 8k .
.TP
.B -4
Create a `version 4' arena partition for backwards compatibility with old servers.
The default is version 5, used by the current venti server.
.TP
.B -Z
Do not zero the data sections of the arenas.
Using this option reduces the formatting time
but should only be used when it is known that the file was already zeroed.
(Version 4 only; version 5 sections are not and do not need to be zeroed.)
.PD
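.PP
As a hedged illustration, an arena partition might be formatted as follows.
The partition path and the name prefix are hypothetical;
only the flags described above are used.
.IP
.EX
# hypothetical path and name prefix;
# the arenas will be named arenas0.0, arenas0.1, ...
venti/fmtarenas -a 512m -b 8k arenas0. /dev/sdC0/arenas
.EE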
.PP
.I Fmtisect
formats the given
.IR file ,
typically a disk partition, as a venti index section with the specified
.IR name .
Each of the index sections in a venti configuration must have a unique name.
.PP
Options to
.I fmtisect
are:
.TP
.BI -b " bucketsize
The size of an index bucket, in bytes.
All the index sections within an index must have the same bucket size.
The default is
.BR 8k .
.TP
.B -1
Create a `version 1' index section for backwards compatibility with old servers.
The default is version 2, used by the current venti server.
.TP
.B -Z
Do not zero the index.
Using this option reduces the formatting time
but should only be used when it is known that the file was already zeroed.
(Version 1 only; version 2 sections are not and do not need to be zeroed.)
.PD
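.PP
A minimal sketch of formatting two index sections, assuming hypothetical
partition paths and section names.
Both sections use the same bucket size, as required above.
.IP
.EX
# hypothetical paths and names; each section name must be unique
venti/fmtisect -b 8k isect0 /dev/sdC0/isect0
venti/fmtisect -b 8k isect1 /dev/sdC1/isect1
.EE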
.PP
.I Fmtbloom
formats the given
.I file
as a bloom filter
(see
.IR venti (7)).
The options are:
.TP
.BI -n " nblocks \fR| " -N " nhash
The number of blocks expected to be indexed by the filter
or the number of hash functions to use.
If the
.B -n
option
is given, it is used, along with the total size of the filter,
to compute an appropriate
.IR nhash .
.TP
.BI -s " size
The size of the bloom filter. The default is the total size of the file.
In either case,
.I size
is rounded down to a power of two.
.PD
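.PP
For example, a bloom filter might be formatted as sketched below.
The path, the filter size, and the expected block count are illustrative
assumptions, not recommendations.
.IP
.EX
# hypothetical path and figures; nhash is computed from -n and the size
venti/fmtbloom -s 64m -n 10000000 /dev/sdC0/bloom
.EE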
.PP
The
.I file
argument in the commands above can be of the form
.IB file : lo - hi
to specify a range of the file.
.I Lo
and
.I hi
are specified in bytes but can have the usual
.BR k ,
.BR m ,
or
.B g
suffixes.
Either
.I lo
or
.I hi
may be omitted.
This notation eliminates the need to
partition raw disks on non-Plan 9 systems.
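.PP
The range notation can be sketched as follows, carving one unpartitioned disk
into an arena region and an index section.
The device name and the byte ranges are hypothetical.
.IP
.EX
# hypothetical device and ranges: the first 10 gigabytes hold arenas,
# the next gigabyte holds an index section
venti/fmtarenas arenas0. /dev/sda:0-10g
venti/fmtisect isect0 /dev/sda:10g-11g
.EE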
.PP
.I Fmtindex
reads the configuration file
.I venti.conf
and initializes the index sections to form a usable index structure.
The arena files and index sections must have previously been formatted
using
.I fmtarenas
and
.I fmtisect
respectively.
.PP
The function of a venti index is to map a SHA1 fingerprint to a location
in the data section of one of the arenas. The index is composed of
blocks, each of which contains the mapping for a fixed range of possible
fingerprint values.
.I Fmtindex
determines the mapping between SHA1 values and the blocks
of the collection of index sections. Once this mapping has been determined,
it cannot be changed without rebuilding the index.
The basic assumption in the current implementation is that the index
structure is sufficiently empty that individual blocks of the index will rarely
overflow. The total size of the index should be about 2% to 10% of
the total size of the arenas, but the exact percentage depends both on the
index block size and the compressed size of blocks stored.
See the discussion in
.IR venti (8)
for more.
.PP
.I Fmtindex
also computes a mapping between a linear address space and
the data section of the collection of arenas. The
.B -a
option can be used to add additional arenas to an index.
To use this feature,
add the new arenas to
.I venti.conf
after the existing arenas and then run
.I fmtindex
.BR -a .
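.PP
As a sketch of this procedure, suppose
.I venti.conf
originally named one arena partition and a second is added later.
The configuration layout shown is an assumption based on the format described in
.IR venti (8);
all paths and the index name are hypothetical.
.IP
.EX
# venti.conf after editing (assumed format, hypothetical paths);
# the new arena partition is listed after the existing one
index main
isect /dev/sdC0/isect0
arenas /dev/sdC0/arenas0
arenas /dev/sdC1/arenas1
.EE
.PP
The initial index is created with the first command below;
the second is run after the new
.B arenas
line has been appended.
.IP
.EX
venti/fmtindex venti.conf
venti/fmtindex -a venti.conf
.EE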
.PP
A copy of the above mappings is stored in the header for each of the index sections.
These copies enable
.I buildindex
to restore a single index section without rebuilding the entire index.
.PP
To make it easier to bootstrap servers, the configuration
file can be stored in otherwise empty space
at the beginning of any venti partition using
.IR conf .
A partition so branded with a configuration file can
be used in place of a configuration file when invoking any
of the venti commands.
By default,
.I conf
prints the configuration stored in
.IR partition .
When invoked with the
.B -w
flag,
.I conf
reads a configuration file from
.I configfile
(or else standard input)
and stores it in
.IR partition .
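.PP
A minimal sketch of branding a partition and reading the configuration back,
assuming a hypothetical partition path:
.IP
.EX
# store venti.conf at the start of the partition (hypothetical path)
venti/conf -w /dev/sdC0/arenas0 venti.conf
# print the stored configuration
venti/conf /dev/sdC0/arenas0
# the branded partition can then stand in for the configuration file
venti/fmtindex /dev/sdC0/arenas0
.EE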
.SS Checking and Rebuilding
.PP
.I Buildindex
populates the index for the Venti system described in
.IR venti.conf .
The index must have previously been formatted using
.IR fmtindex .
This command is typically used to build a new index for a Venti
system when the old index becomes too small, or to rebuild
an index after media failure.
Small errors in an index can usually be fixed with
.IR checkindex .
.PP
The
.I tmp
file, usually a disk partition, must be large enough to store a copy of the index.
This temporary space is used to perform a merge sort of index entries
generated by reading the arenas.
.PP
Options to
.I buildindex
are:
.TP
.BI -B " blockcachesize
The amount of memory, in bytes, to use for caching raw disk accesses while running
.IR buildindex .
(This is not a property of the created index.)
The default is
.BR 8k .
.TP
.B -Z
Do not zero the index.
This option should only be used when it is known that the index was already zeroed.
(Version 1 indexes only; see the discussion of the
.B -Z
option to
.I fmtisect
above.)
.PD
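.PP
For instance, an index might be rebuilt as sketched below.
The temporary partition path and the cache size are hypothetical.
.IP
.EX
# the index must already have been formatted with fmtindex;
# /dev/sdC0/tmp must be able to hold a copy of the index
venti/buildindex -B 64m venti.conf /dev/sdC0/tmp
.EE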
.PP
.I Checkindex
examines the Venti index described in
.IR venti.conf .
The program detects various error conditions including:
blocks that are not indexed, index entries for blocks that do not exist,
and duplicate index entries.
If requested, an attempt can be made to fix errors that are found.
.PP
The
.I tmp
file, usually a disk partition, must be large enough to store a copy of the index.
This temporary space is used to perform a merge sort of index entries
generated by reading the arenas.
.PP
Options to
.I checkindex
are:
.TP
.BI -B " blockcachesize
The amount of memory, in bytes, to use for caching raw disk accesses while running
.IR checkindex .
The default is
.BR 8k .
.TP
.B -f
Attempt to fix any errors that are found.
.PD
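.PP
One cautious way to use
.I checkindex
is to report errors first and repair only afterwards.
The temporary partition path below is hypothetical.
.IP
.EX
# report errors without changing the index
venti/checkindex venti.conf /dev/sdC0/tmp
# then attempt repairs
venti/checkindex -f venti.conf /dev/sdC0/tmp
.EE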
.PP
.I Checkarenas
examines the Venti arenas contained in the given
.IR file .
The program detects various error conditions, and optionally attempts
to fix any errors that are found.
.PP
Options to
.I checkarenas
are:
.TP
.B -a
For each arena, scan the entire data section.
If this option is omitted, only the end section of
the arena is examined.
.TP
.B -f
Attempt to fix any errors that are found.
.TP
.B -v
Increase the verbosity of output.
.PD
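.PP
As a sketch, a full verbose scan of an arena partition, followed by a repair pass,
might look like this (the partition path is hypothetical):
.IP
.EX
# scan the entire data section of every arena, verbosely
venti/checkarenas -av /dev/sdC0/arenas0
# attempt to fix whatever errors are found
venti/checkarenas -f /dev/sdC0/arenas0
.EE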
.SH SOURCE
.B \*9/src/cmd/venti/srv
.SH SEE ALSO
.IR venti (7),
.IR venti (8)
.SH BUGS
.I Buildindex
should allow an individual index section to be rebuilt.
The merge sort could be performed in the space used to store the
index rather than requiring a temporary file.