VENTI(8) VENTI(8)
NAME
venti - archival storage server
SYNOPSIS
venti/venti [ -Ldrs ] [ -a address ] [ -B blockcachesize ] [
-c config ] [ -C lumpcachesize ] [ -h httpaddress ] [ -I
indexcachesize ] [ -m free-memory-percent ] [ -W webroot ]
DESCRIPTION
Venti is a SHA1-addressed archival storage server. See
venti(6) for a full introduction to the system. This page
documents the structure and operation of the server.
A venti server requires multiple disks or disk partitions,
each of which must be properly formatted before the server
can be run.
Disk
The venti server maintains three disk structures, typically
stored on raw disk partitions: the append-only data log,
which holds, in sequential order, the contents of every
block written to the server; the index, which helps locate a
block in the data log given its score; and optionally the
bloom filter, a concise summary of which scores are present
in the index. The data log is the primary storage. To
improve robustness, it should be stored on a device that
provides RAID functionality. The index and the bloom filter
are optimizations employed to access the data log
efficiently and can be rebuilt if lost or damaged.
The data log is logically split into sections called arenas,
typically sized for easy offline backup (e.g., 500MB). A
data log may comprise many disks, each storing one or more
arenas. Such disks are called arena partitions. Arena par-
titions are filled in the order given in the configuration.
The index is logically split into block-sized pieces called
buckets, each of which is responsible for a particular range
of scores. An index may be split across many disks, each
storing many buckets. Such disks are called index sections.
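As an illustration of how a score can select a bucket, the
following minimal sketch scales the leading 32 bits of a score
onto a bucket number. The name bucket_for_score and the
proportional mapping are assumptions made for this example; they
are not a description of venti's on-disk index format.

```python
def bucket_for_score(score: bytes, nbuckets: int) -> int:
    """Map a 20-byte SHA1 score to a bucket number in [0, nbuckets)."""
    if len(score) != 20:
        raise ValueError("a SHA1 score is 20 bytes")
    prefix = int.from_bytes(score[:4], "big")   # leading 32 bits of the score
    # Scale the prefix from [0, 2^32) onto [0, nbuckets), so that each
    # bucket covers one contiguous range of scores.
    return (prefix * nbuckets) >> 32
```

Because scores appear random, a mapping of this kind spreads
entries roughly evenly over the buckets.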
The index must be sized so that no bucket is full. When a
bucket fills, the server must be shut down and the index
made larger. Since scores appear random, each bucket will
contain approximately the same number of entries. Index
entries are 40 bytes long. Assuming that a typical block
being written to the server is 8192 bytes and compresses to
4096 bytes, the active index is expected to be about 1% of
the active data log. Storing smaller blocks increases the
relative index footprint; storing larger blocks decreases
it. To allow variation in both block size and the random
distribution of scores to buckets, the suggested index size
is 5% of the active data log.
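The index sizing arithmetic above can be sketched as follows.
The function names are hypothetical; the 40-byte entry size and
the 5% margin are the figures given in the text.

```python
INDEX_ENTRY_BYTES = 40   # size of an index entry, from the text above

def index_fraction(stored_block_bytes: int) -> float:
    """Expected index size as a fraction of the data log."""
    return INDEX_ENTRY_BYTES / stored_block_bytes

def suggested_index_bytes(data_log_bytes: int, percent: int = 5) -> int:
    """Suggested index size: 5% of the active data log by default."""
    return data_log_bytes * percent // 100
```

For blocks that occupy 4096 bytes after compression,
index_fraction(4096) is just under 0.01, which is the 1% estimate
above; halving the stored block size doubles the fraction.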
The (optional) bloom filter is a large bitmap that is stored
on disk but also kept completely in memory while the venti
server runs. It helps the venti server efficiently detect
scores that are not already stored in the index. The bloom
filter starts out zeroed. Each score recorded in the bloom
filter is hashed to choose nhash bits to set in the bloom
filter. A score is definitely not stored in the index if
any of its nhash bits is not set. The bloom filter thus
has two parameters: nhash (maximum 32) and the total bitmap
size (maximum 512MB, 2^32 bits).
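The behavior described above can be sketched with a small
in-memory bloom filter. This is a toy illustration only: the class
name ToyBloom and the way bit positions are derived (hashing the
score together with a counter) are assumptions, and venti's own
hashing scheme may differ.

```python
import hashlib

class ToyBloom:
    """A small in-memory bloom filter keyed by scores (byte strings)."""

    def __init__(self, nbits: int, nhash: int):
        self.nbits = nbits
        self.nhash = nhash
        self.bits = bytearray((nbits + 7) // 8)   # starts out zeroed

    def _positions(self, score: bytes):
        # Assumption: derive the i-th bit position by hashing the score
        # together with i. Venti's own hash scheme may differ.
        for i in range(self.nhash):
            h = hashlib.sha1(score + i.to_bytes(4, "big")).digest()
            yield int.from_bytes(h[:8], "big") % self.nbits

    def add(self, score: bytes) -> None:
        for p in self._positions(score):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, score: bytes) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(score))
```

A False result from might_contain lets a server skip the index
lookup entirely; a True result still requires consulting the
index, since it may be a false positive.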
The bloom filter should be sized so that nhash x nblock <=
0.7 x b, where nblock is the expected number of blocks
stored on the server and b is the bitmap size in bits. The
false positive rate of the bloom filter when sized this way
is approximately 2^-nhash. Values of nhash less than 10 are
not very useful; values greater than 24 are probably a waste
of memory. Fmtbloom (see venti-fmt(8)) can be given either
nhash or nblock; if given nblock, it will derive an
appropriate nhash.
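The sizing rule can be turned into a small calculation. The
sketch below picks the largest nhash that satisfies the
inequality for a given nblock and bitmap size, and estimates the
false positive rate with the standard bloom filter approximation
of 2^-nhash. The function names are hypothetical, and this is not
claimed to be the computation fmtbloom performs.

```python
MAX_NHASH = 32   # maximum nhash, from the text above

def derive_nhash(nblock: int, bitmap_bits: int) -> int:
    """Largest nhash with nhash * nblock <= 0.7 * bitmap_bits, capped at 32."""
    nhash = (7 * bitmap_bits) // (10 * nblock)   # integer form of 0.7*b/nblock
    return max(1, min(MAX_NHASH, nhash))

def false_positive_rate(nhash: int) -> float:
    """Approximate false positive rate when the sizing rule is met."""
    return 2.0 ** -nhash
```

For example, with the maximum 512MB bitmap (2^32 bits) and 200
million expected blocks, derive_nhash(200_000_000, 2**32) gives
15, inside the useful range of 10 to 24.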
Memory
Venti can make effective use of large amounts of memory for
various caches.
The lump cache holds recently-accessed venti data blocks,
which the server refers to as lumps. The lump cache should
be at least 1MB but can profitably be much larger. The lump
cache can be thought of as the level-1 cache: read requests
handled by the lump cache can be served instantly.
The block cache holds recently-accessed disk blocks from the
arena partitions. The block cache needs to be able to
simultaneously hold two blocks from each arena plus four
blocks for the currently-filling arena. The block cache can
be thought of as the level-2 cache: read requests handled by
the block cache are slower than those handled by the lump
cache, since the lump data must be extracted from the raw
disk blocks and possibly decompressed, but no disk accesses
are necessary.
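The minimum block cache requirement stated above is simple to
compute. In this sketch the function names are hypothetical, and
block_bytes (the disk block size in use) is a parameter the
caller must supply; no particular value is implied by the text.

```python
def min_block_cache_blocks(narenas: int) -> int:
    """Two blocks per arena plus four for the currently-filling arena."""
    return 2 * narenas + 4

def min_block_cache_bytes(narenas: int, block_bytes: int) -> int:
    """Minimum block cache size in bytes for a given disk block size."""
    return min_block_cache_blocks(narenas) * block_bytes
```

A server with 100 arenas needs room for at least 204 blocks.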
The index cache holds recently-accessed or prefetched index
entries. The index cache needs to be able to hold index
entries for three or four arenas, at least, in order for
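prefetching to work properly. Each index entry is 50 bytes.
Assuming 500MB arenas of 128,000 blocks that are 4096 bytes
each after compression, the entries for each arena occupy
about 6MB (128,000 x 50 bytes), so the minimum index cache
size is about 6MB per arena, or roughly 19MB to 26MB for
three or four arenas. The index cache can be thought of as the level-3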
cache: read requests handled by the index cache must still
go to disk to fetch the arena blocks, but the costly random
access to the index is avoided.
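The index cache arithmetic can be checked with a short
calculation. The function names are hypothetical; the 50-byte
entry size and the arena figures are those given in the text.

```python
INDEX_CACHE_ENTRY_BYTES = 50   # per-entry size in the index cache, from the text

def arena_entry_bytes(arena_bytes: int, stored_block_bytes: int) -> int:
    """Index cache bytes needed to hold every entry of one arena."""
    blocks = arena_bytes // stored_block_bytes
    return blocks * INDEX_CACHE_ENTRY_BYTES

def index_cache_bytes(arena_bytes: int, stored_block_bytes: int,
                      narenas: int) -> int:
    """Index cache bytes needed to hold the entries of narenas arenas."""
    return narenas * arena_entry_bytes(arena_bytes, stored_block_bytes)
```

A 500MB arena (500 x 2^20 bytes) of 4096-byte stored blocks
holds 128,000 blocks, so arena_entry_bytes returns 6,400,000
bytes for it.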
The size of the index cache determines how long venti can
sustain its `burst' write throughput, during which time the
only disk accesses on the critical path are sequential
writes to the arena partitions. For example, if you want to
be able to sustain 10MB/s for an hour, you need enough index
cache to hold entries for 36GB of blocks. Assuming 8192-
byte blocks, you need room for almost five million index
entries. Since index entries are 50 bytes each, you need
250MB of index cache. If the background index update pro-
cess can make a single pass through the index in an hour,
which is possible, then you can sustain the 10MB/s indefi-
nitely (at least until the arenas are all filled).
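The burst-throughput estimate can be reproduced as follows. The
function name is hypothetical, and the sketch assumes 1MB means
10^6 bytes; the text rounds the result up to five million
entries and 250MB.

```python
def burst_index_cache_bytes(rate_bytes_per_s: int, seconds: int,
                            block_bytes: int, entry_bytes: int = 50) -> int:
    """Index cache needed to absorb a write burst of the given length."""
    entries = (rate_bytes_per_s * seconds) // block_bytes   # blocks written
    return entries * entry_bytes
```

For 10MB/s sustained over an hour with 8192-byte blocks,
burst_index_cache_bytes(10_000_000, 3600, 8192) gives
219,726,550 bytes, or about 220MB for roughly 4.4 million
entries.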
The bloom filter requires memory equal to its size on disk,
as discussed above.
A reasonable starting allocation is to divide memory equally
(in thirds) between the bloom filter, the index cache, and
the lump and block caches; the third of memory allocated to
the lump and block caches should be split unevenly, with
more (say, two thirds) going to the block cache.
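The suggested starting allocation can be written out as a small
helper. The function name split_memory is hypothetical; the
dictionary keys mem, bcmem, and icmem follow the configuration
parameter names for the lump, block, and index caches, and the
bloom entry is the memory left for the bloom filter.

```python
def split_memory(total_bytes: int) -> dict:
    """Divide memory in thirds, then split the cache third 1:2."""
    third = total_bytes // 3
    lump = third // 3        # one third of the shared third: lump cache
    block = third - lump     # remaining two thirds: block cache
    return {"bloom": third, "icmem": third, "mem": lump, "bcmem": block}
```

This is only a starting point; the bloom filter's share is fixed
by its size on disk, so the other shares may need adjusting
around it.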
Network
The venti server announces two network services, one (con-
ventionally TCP port venti, 17034) serving the venti proto-
col as described in venti(6), and one serving HTTP (conven-
tionally TCP port http, 80).
The venti web server provides the following URLs for access-
ing status information:
/index A summary of the usage of the arenas and index
sections.
/xindex An XML version of /index.
/storage Brief storage totals.
/set/variable
The current integer value of variable. Variables
are: compress, whether or not to compress blocks
(for debugging); logging, whether to write entries
to the debugging logs; stats, whether to collect
run-time statistics; icachesleeptime, the time in
milliseconds between successive updates of mega-
bytes of the index cache; arenasumsleeptime, the
time in milliseconds between reads while
checksumming an arena in the background. The two
sleep times should be (but are not) managed by
venti; they exist to provide more experience with
their effects. The other variables exist only for
debugging and performance measurement.
/set/variable/value
Set variable to value.
/graph/name/param/param
A PNG image graphing the named run-time statistic
over time. The details of names and parameters
are undocumented; see httpd.c in the venti
sources.
/log A list of all debugging logs present in the
server's memory.
/log/name The contents of the debugging log with the given
name.
/flushicache
Force venti to begin flushing the index cache to
disk. The response is not sent until the flush
has completed.
/flushdcache
Force venti to begin flushing the arena block
cache to disk. The response is not sent until
the flush has completed.
Requests for other files are served by consulting a direc-
tory named in the configuration file (see webroot below).
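A client can build and fetch the status URLs listed above with
nothing more than an HTTP library. The sketch below uses
Python's urllib; the helper names and the host name in the usage
note are hypothetical, and the code assumes the HTTP service is
reachable on its default port.

```python
from urllib.request import urlopen

def status_url(host: str, path: str) -> str:
    """URL of a status page on the venti web server."""
    return f"http://{host}/{path.lstrip('/')}"

def variable_url(host: str, variable: str, value=None) -> str:
    """URL that reads a variable, or sets it when value is given."""
    path = f"set/{variable}" if value is None else f"set/{variable}/{value}"
    return status_url(host, path)

def fetch(url: str, timeout: float = 5.0) -> str:
    """Fetch a status page and return its body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For instance, fetch(status_url("venti.example.com", "storage"))
retrieves the storage totals, and
variable_url("venti.example.com", "icachesleeptime") names the
page holding the current value of that variable.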
Configuration File
A venti configuration file enumerates the various index sec-
tions and arenas that constitute a venti system. The compo-
nents are indicated by the name of the file, typically a
disk partition, in which they reside. The configuration
file is the only place where file names are used.
Internally, venti uses the names assigned when the
components were formatted with fmtarenas or fmtisect (see
venti-fmt(8)). In particular, only the configuration needs
to be changed if a component is moved to a different file.
The configuration file consists of lines in the form
described below. Lines starting with # are comments.
index name     Names the index for the system.
arenas file    File is an arena partition, formatted using
               fmtarenas.
isect file     File is an index section, formatted using
               fmtisect.
bloom file     File is a bloom filter, formatted using
               fmtbloom.
After formatting a venti system using fmtindex, the order of
arenas and index sections should not be changed. Additional
arenas can be appended to the configuration; run fmtindex
with the -a flag to update the index.
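The line format described above is simple enough to read with a
few lines of code. The following is a minimal sketch of a reader
for the component lines only; the function name and the returned
dictionary layout are assumptions for illustration, and it is
not the parser venti itself uses.

```python
COMPONENT_KEYWORDS = ("index", "arenas", "isect", "bloom")

def parse_components(text: str) -> dict:
    """Collect index, arenas, isect, and bloom lines in file order."""
    conf = {"index": None, "arenas": [], "isect": [], "bloom": None}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # blank line or comment
        fields = line.split()
        key = fields[0]
        if key not in COMPONENT_KEYWORDS or len(fields) != 2:
            continue                      # other lines are ignored in this sketch
        if key in ("arenas", "isect"):
            conf[key].append(fields[1])   # preserve the order given in the file
        else:
            conf[key] = fields[1]
    return conf
```

Keeping arenas and index sections in lists preserves their
order, which matters once the system has been formatted with
fmtindex.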
The configuration file also holds configuration parameters
for the venti server itself. These are:
mem size            lump cache size
bcmem size          block cache size
icmem size          index cache size
addr netaddr        network address to announce venti service
                    (default tcp!*!venti)
httpaddr netaddr    network address to announce HTTP service
                    (default tcp!*!http)
queuewrites         queue writes in memory (default is not to
                    queue)
webroot dir         directory tree containing files for
                    venti's internal HTTP server to consult
                    for unrecognized URLs
The units for the various cache sizes above can be specified
by appending a `k', `m', or `g' (case-insensitive) to indi-
cate kilobytes, megabytes, or gigabytes respectively.
The file name in the configuration lines above can be of the
form file:lo-hi to specify a range of the file. Lo and hi
are specified in bytes but can have the usual k, m, or g
suffixes. Either lo or hi may be omitted. This notation
eliminates the need to partition raw disks on non-Plan 9
systems.
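The size suffixes and the file:lo-hi notation can be decoded as
in the following sketch. The function names are hypothetical,
and the code assumes that k, m, and g denote binary multiples
(1024, 1024^2, 1024^3); the text above does not say which
convention applies.

```python
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}   # assumed binary multiples

def parse_size(s: str) -> int:
    """Parse a size such as '64k', '512M', or '2g' into bytes."""
    s = s.strip()
    if s and s[-1].lower() in UNITS:
        return int(s[:-1]) * UNITS[s[-1].lower()]
    return int(s)

def parse_file_range(spec: str):
    """Split 'file:lo-hi' into (file, lo, hi); a missing bound is None."""
    name, sep, rng = spec.rpartition(":")
    if not sep or "-" not in rng:
        return spec, None, None           # plain file name, no range given
    lo, _, hi = rng.partition("-")
    return (name,
            parse_size(lo) if lo else None,
            parse_size(hi) if hi else None)
```

With these definitions, parse_file_range("/dev/sda:1g-2g")
yields the file name with bounds of 1073741824 and 2147483648
bytes, and a spec without a range is returned unchanged with
both bounds set to None.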
Command Line
Many of the options to Venti duplicate parameters that can
be specified in the configuration file. The command line
options override those found in a configuration file. Addi-
tional options are:
-c config The server configuration file (default
venti.conf)
-d Produce various debugging information on standard
error. Implies -s.
-L Enable logging. By default all logging is dis-
abled. Logging slows server operation consider-
ably.
-m Allocate free-memory-percent percent of the
available free RAM, and partition it per the
guidelines in the Memory subsection. This per-
centage should be large enough to include the
entire bloom filter. This overrides all other
memory sizing parameters, including those on the
command line and in the configuration file. 25%
is a reasonable choice.
-r Allow only read access to the venti data.
-s Do not run in the background. Normally, the
foreground process will exit once the Venti
server is initialized and ready for connections.
EXAMPLE
A simple configuration:
% cat venti.conf
index main
isect /tmp/disks/isect0
isect /tmp/disks/isect1
arenas /tmp/disks/arenas
bloom /tmp/disks/bloom
%
Format the index sections, the arena partition, the bloom
filter, and finally the main index:
% venti/fmtisect isect0. /tmp/disks/isect0
% venti/fmtisect isect1. /tmp/disks/isect1
% venti/fmtarenas arenas0. /tmp/disks/arenas &
% venti/fmtbloom /tmp/disks/bloom &
% wait
% venti/fmtindex venti.conf
%
Start the server and check the storage statistics:
% venti/venti
% hget http://$sysname/storage
SOURCE
/sys/src/cmd/venti/srv
SEE ALSO
venti(1), venti(2), venti(6), venti-backup(8), venti-fmt(8)
Sean Quinlan and Sean Dorward, ``Venti: a new approach to
archival storage'', Usenix Conference on File and Storage
Technologies, 2002.
BUGS
Setting up a venti server is too complicated.
Page 7 Plan 9 (printed 10/29/25)