venti.conf \- venti configuration
Venti is a SHA1-addressed archival storage server.
for a full introduction to the system.
This page documents the structure and operation of the server.
A venti server requires multiple disks or disk partitions,
each of which must be properly formatted before the server
can be run.
The venti server maintains three disk structures, typically
stored on raw disk partitions:
the
.IR "data log" ,
which holds, in sequential order,
the contents of every block written to the server;
the
.IR index ,
which helps locate a block in the data log given its score;
and optionally the
.IR "bloom filter" ,
a concise summary of which scores are present in the index.
The data log is the primary storage.
To improve robustness, it should be stored on
a device that provides RAID functionality.
The index and the bloom filter are optimizations
employed to access the data log efficiently and can be rebuilt
if lost or damaged.
The data log is logically split into sections called
.IR arenas ,
typically sized for easy offline backup.
A data log may comprise many disks, each storing
one or more arenas.
Such disks are called
.IR "arena partitions" .
Arena partitions are filled in the order given in the configuration.
The index is logically split into block-sized pieces called
.IR buckets ,
each of which is responsible for a particular range of scores.
An index may be split across many disks, each storing many buckets.
Such disks are called
.IR "index sections" .
The index must be sized so that no bucket is full.
When a bucket fills, the server must be shut down and
the index made larger.
Since scores appear random, each bucket will contain
approximately the same number of entries.
Index entries are 40 bytes long.
Assuming that a typical block
being written to the server is 8192 bytes and compresses to 4096
bytes, the active index is expected to be about 1% of
the active data log.
Storing smaller blocks increases the relative index footprint;
storing larger blocks decreases it.
To allow variation in both block size and the random distribution
of scores to buckets, the suggested index size is 5% of
the active data log.
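.PP
As a rough worked example of the index sizing arithmetic above,
the following Python sketch computes the expected index fraction
and the suggested 5% allocation.
The function names are illustrative and are not part of venti;
the 40-byte entry size and the percentages come from the text.

```python
INDEX_ENTRY_BYTES = 40  # on-disk index entry size given in the text


def index_fraction(compressed_block_bytes: int) -> float:
    """Expected index size as a fraction of the data log."""
    return INDEX_ENTRY_BYTES / compressed_block_bytes


def suggested_index_bytes(data_log_bytes: int, percent: int = 5) -> int:
    """Suggested index size: 5% of the data log by default."""
    return data_log_bytes * percent // 100


if __name__ == "__main__":
    # 8192-byte blocks that compress to 4096 bytes: about 1% of the log.
    print(f"{index_fraction(4096):.2%}")
    # Smaller blocks raise the relative footprint.
    print(f"{index_fraction(1024):.2%}")
    # Suggested index size for a 100GB data log.
    print(suggested_index_bytes(100 * 10**9))
```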
The (optional) bloom filter is a large bitmap that is stored on disk but
also kept completely in memory while the venti server runs.
It helps the venti server efficiently detect scores that are
.I not
already stored in the index.
The bloom filter starts out zeroed.
Each score recorded in the bloom filter is hashed to choose
.I nhash
bits to set in the bloom filter.
A score is definitely not stored in the index if any of its
.I nhash
bits are not set.
The bloom filter thus has two parameters:
.I nhash
and the total bitmap size
(maximum 512MB, 2\s-2\u32\d\s+2 bits).
The bloom filter should be sized so that
.I nhash
\(mu
.I nblock
\(<=
0.7 \(mu
.IR b ,
where
.I nblock
is the expected number of blocks stored on the server
and
.I b
is the bitmap size in bits.
The false positive rate of the bloom filter when sized
this way is approximately 2\s-2\u\-\fInhash\fR\d\s+2.
Values of
.I nhash
less than 10 are not very useful;
values greater than 24 are probably a waste of memory.
it will derive an appropriate
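.PP
The relationship between the number of hash bits, the expected block
count, and the bitmap size can be explored with a short Python sketch.
It assumes the standard Bloom filter rule of thumb
(hash bits times blocks at most 0.7 times bitmap bits,
giving a false positive rate of about 2 to the power of minus the
number of hash bits).
The function names are hypothetical, and this is not necessarily how
the venti formatting tools derive their values.

```python
MAX_BITMAP_BITS = 2**32  # 512MB bitmap limit given in the text


def derive_nhash(nblock: int, bits: int = MAX_BITMAP_BITS) -> int:
    """Largest nhash with nhash * nblock <= 0.7 * bits (at least 1)."""
    if nblock <= 0 or not 0 < bits <= MAX_BITMAP_BITS:
        raise ValueError("nblock must be positive and bits within the limit")
    return max(1, (7 * bits) // (10 * nblock))


def bits_needed(nblock: int, nhash: int) -> int:
    """Smallest bitmap size in bits satisfying the same rule."""
    return -(-10 * nhash * nblock // 7)  # ceiling division


def false_positive_rate(nhash: int) -> float:
    """Approximate false positive rate when sized by the rule above."""
    return 2.0 ** -nhash


if __name__ == "__main__":
    nhash = derive_nhash(200_000_000)  # 200 million blocks, full bitmap
    print(nhash, false_positive_rate(nhash))
    print(bits_needed(100_000_000, 16) / 8 / 2**20, "MB")
```

Under this rule, a useful range of 10 to 24 hash bits corresponds to
false positive rates between roughly one in a thousand and one in
sixteen million.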
Venti can make effective use of large amounts of memory
for various caches.
.PP
The
.I "lump cache"
holds recently-accessed venti data blocks, which the server refers to as
.IR lumps .
The lump cache should be at least 1MB but can profitably be much larger.
The lump cache can be thought of as the level-1 cache:
read requests handled by the lump cache can
be served immediately.
.PP
The
.I "block cache"
holds recently-accessed
disk
blocks from the arena partitions.
The block cache needs to be able to simultaneously hold two blocks
from each arena plus four blocks for the currently-filling arena.
The block cache can be thought of as the level-2 cache:
read requests handled by the block cache are slower than those
handled by the lump cache, since the lump data must be extracted
from the raw disk blocks and possibly decompressed, but no
disk accesses are necessary.
.PP
The
.I "index cache"
holds recently-accessed or prefetched
index entries.
The index cache needs to be able to hold index entries
for three or four arenas, at least, in order for prefetching
to work properly.
Each index entry is 50 bytes.
Assuming 500MB arenas of
128,000 blocks that are 4096 bytes each after compression,
each arena needs about 6MB of index cache, so
the minimum index cache size is about 20MB.
The index cache can be thought of as the level-3 cache:
read requests handled by the index cache must still go
to disk to fetch the arena blocks, but the costly random
access to the index is avoided.
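.PP
The minimum index cache arithmetic can be checked with a small
Python sketch.
It uses the figures from the text (50-byte entries, 500MB arenas,
4096-byte compressed blocks); the function names are illustrative only.
At 128,000 entries per arena, one arena needs about 6.4MB,
so three or four arenas need roughly 19MB to 26MB.

```python
ICACHE_ENTRY_BYTES = 50        # in-memory index entry size from the text
ARENA_BYTES = 500 * 2**20      # 500MB arena
COMPRESSED_BLOCK_BYTES = 4096  # typical block size after compression


def entries_per_arena(arena_bytes: int = ARENA_BYTES,
                      block_bytes: int = COMPRESSED_BLOCK_BYTES) -> int:
    """Number of blocks, and so index entries, that fit in one arena."""
    return arena_bytes // block_bytes


def min_icache_bytes(arenas: int) -> int:
    """Index cache needed to hold entries for the given number of arenas."""
    return arenas * entries_per_arena() * ICACHE_ENTRY_BYTES


if __name__ == "__main__":
    print(entries_per_arena())
    for n in (1, 3, 4):
        print(n, min_icache_bytes(n) / 10**6, "MB")
```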
.PP
The size of the index cache determines how long venti
can sustain its `burst' write throughput, during which time
the only disk accesses on the critical path
are sequential writes to the arena partitions.
For example, if you want to be able to sustain 10MB/s
for an hour, you need enough index cache to hold entries
for 36GB of blocks.
Assuming 8192-byte blocks,
you need room for almost five million index entries.
Since index entries are 50 bytes each, you need 250MB
of index cache.
If the background index update process can make a single
pass through the index in an hour, which is possible,
then you can sustain the 10MB/s indefinitely (at least until
the arenas are all filled).
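.PP
The burst calculation generalizes to other write rates and durations.
The following Python sketch is one way to estimate it; the function
names are hypothetical, and the defaults (8192-byte blocks, 50-byte
entries) are the figures used in the example above.
The exact result for 10MB/s over an hour is about 4.4 million entries
and 220MB, which the text rounds up to five million entries and 250MB.

```python
def burst_entries(rate_bytes_per_s: int, seconds: int,
                  block_bytes: int = 8192) -> int:
    """Index entries generated while writing at the given rate."""
    total = rate_bytes_per_s * seconds
    return -(-total // block_bytes)  # ceiling division


def burst_icache_bytes(rate_bytes_per_s: int, seconds: int,
                       block_bytes: int = 8192,
                       entry_bytes: int = 50) -> int:
    """Index cache needed to absorb the burst without index writes."""
    return burst_entries(rate_bytes_per_s, seconds, block_bytes) * entry_bytes


if __name__ == "__main__":
    rate = 10 * 10**6  # 10MB/s
    hour = 3600
    print(burst_entries(rate, hour))
    print(burst_icache_bytes(rate, hour) / 10**6, "MB")
```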
.PP
The bloom filter
requires memory equal to its size on disk.
.PP
A reasonable starting allocation is to
divide memory equally (in thirds) between
the bloom filter, the index cache, and the lump and block caches;
the third of memory allocated to the lump and block caches
should be split unevenly, with more (say, two thirds)
going to the block cache.
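.PP
The starting allocation described above can be written down as a
small Python sketch.
The function name and the dictionary keys are illustrative and do not
correspond to venti configuration parameters; the fractions are the
ones suggested in the text.

```python
def split_memory(total_bytes: int) -> dict:
    """Divide memory in thirds, then split the cache third 2:1."""
    third = total_bytes // 3
    caches = total_bytes - 2 * third  # remainder goes to the caches
    block_cache = caches * 2 // 3     # "say, two thirds"
    return {
        "bloom_filter": third,
        "index_cache": third,
        "block_cache": block_cache,
        "lump_cache": caches - block_cache,
    }


if __name__ == "__main__":
    for name, size in split_memory(900 * 2**20).items():
        print(f"{name}: {size // 2**20}MB")
```

Because the bloom filter needs memory equal to its size on disk,
the first figure also bounds how large a bloom filter this budget
can accommodate.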
The venti server announces two network services, one
(conventionally TCP port
the venti protocol as described in
(conventionally TCP port
The venti web server provides the following
URLs for accessing status information:
A summary of the usage of the arenas and index sections.
Brief storage totals.
The current integer value of
whether or not to compress blocks
whether to write entries to the debugging logs;
whether to collect run-time statistics;
.BR icachesleeptime ,
the time in milliseconds between successive updates
of megabytes of the index cache;
.BR arenasumsleeptime ,
the time in milliseconds between reads while
checksumming an arena in the background.
The two sleep times should be (but are not) managed by venti;
they exist to provide more experience with their effects.
The other variables exist only for debugging and
performance measurement.
.BI /set/ variable / value
.BI /graph/ name / param / param / \fR...
A PNG image graphing the named run-time statistic over time.
The details of names and parameters are undocumented;
in the venti sources.
A list of all debugging logs present in the server's memory.
The contents of the debugging log with the given
Force venti to begin flushing the index cache to disk.
The request response will not be sent until the flush
has completed.
Force venti to begin flushing the arena block cache to disk.
The request response will not be sent until the flush
has completed.
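.PP
URLs of the /set/variable/value form listed above can be built
programmatically.
The following Python sketch only constructs the URL string and makes
no request; it assumes that fetching such a URL sets the named
variable.
The host name is a placeholder and the function name is hypothetical.

```python
from urllib.parse import quote


def set_url(base: str, variable: str, value: int) -> str:
    """Build a /set/variable/value URL for a venti web server."""
    return f"{base.rstrip('/')}/set/{quote(variable, safe='')}/{int(value)}"


if __name__ == "__main__":
    # venti.example.com is a placeholder host, not a real server.
    print(set_url("http://venti.example.com", "icachesleeptime", 500))
    print(set_url("http://venti.example.com/", "arenasumsleeptime", 100))
```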
Requests for other files are served by consulting a
directory named in the configuration file
.SS Configuration File
A venti configuration file
enumerates the various index sections and
arenas that constitute a venti system.
The components are indicated by the name of the file, typically
a disk partition, in which they reside.
The configuration
file is the only place where file names are used.
Internally,
venti uses the names assigned when the components were formatted.
In particular, only the configuration needs to be
changed if a component is moved to a different file.
The configuration file consists of lines in the form described below.
.TP
.BI index " name
Names the index for the system.
.TP
.BI arenas " file
.I File
is an arena partition, formatted using
.TP
.BI isect " file
.I File
is an index section, formatted using
.PP
After formatting a venti system using
the order of arenas and index sections should not be changed.
Additional arenas can be appended to the configuration;
flag to update the index.
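.PP
To illustrate the line-oriented layout of the configuration file,
the following Python sketch reads such a file into a dictionary,
preserving the order of arenas and index sections.
It is not venti's own parser: the function name is hypothetical,
every keyword other than index, arenas, and isect is simply stored
as a parameter, and the index name "main" in the sample is made up.

```python
def parse_venti_conf(text: str) -> dict:
    """Parse venti-style configuration lines, keeping component order."""
    conf = {"index": None, "arenas": [], "isect": [], "params": {}}
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        key, args = fields[0], fields[1:]
        if key == "index":
            conf["index"] = args[0] if args else None
        elif key in ("arenas", "isect"):
            conf[key].extend(args)  # order matters, so append in sequence
        else:
            conf["params"][key] = " ".join(args)
    return conf


SAMPLE = """\
index main
isect /tmp/disks/isect0
isect /tmp/disks/isect1
arenas /tmp/disks/arenas
httpaddr tcp!*!http
"""

if __name__ == "__main__":
    print(parse_venti_conf(SAMPLE))
```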
The configuration file also holds configuration parameters
for the venti server itself.
network address to announce venti service
.BI httpaddr " netaddr
network address to announce HTTP service
queue writes in memory
(default is not to queue)
See the server description in
for explanations of these variables.
isect /tmp/disks/isect0
isect /tmp/disks/isect1
arenas /tmp/disks/arenas
Setting up a venti server is too complicated.
Venti should not require the user to decide how to
partition its memory usage.