The Zebra configuration file, read by zebraidx and zebrasrv, defaults to zebra.cfg unless another file is specified with the -c option.
You can edit the configuration file with a normal text editor. Parameter names and values are separated by colons in the file. Lines starting with a hash sign (#) are treated as comments.
If you manage different sets of records that share common
characteristics, you can organize the configuration settings for each
type into "groups".
When you run zebraidx and want to address a given group, specify the group name with the -g option. In this case, zebraidx uses the settings that have the group name as their prefix. If no -g option is specified, the settings without a prefix are used.
In the configuration file, the group name is placed before the option name itself, separated by a dot (.). For instance, to set the record type for group public to grs.sgml (the SGML-like format for structured records) you would write:
public.recordType: grs.sgml
To set the default value of the record type to text, write:
recordType: text
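The file syntax and group prefixes described above can be illustrated with a small Python sketch. This is not Zebra's own parser; the function names parse_config and lookup are hypothetical, and falling back to the unprefixed setting when a group has no value of its own is an assumption of this sketch rather than something stated here.

```python
def parse_config(text):
    """Parse 'name: value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first colon only, so values may contain colons (e.g. paths).
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line; ignored in this sketch
        settings[name.strip()] = value.strip()
    return settings


def lookup(settings, option, group=None):
    """Return the setting for a group, or the unprefixed setting.

    Assumption: if the group has no prefixed value, fall back to the
    unprefixed (default) value.
    """
    if group is not None:
        prefixed = f"{group}.{option}"
        if prefixed in settings:
            return settings[prefixed]
    return settings.get(option)


cfg = parse_config("""
# record types
public.recordType: grs.sgml
recordType: text
""")
print(lookup(cfg, "recordType", group="public"))
print(lookup(cfg, "recordType"))
```

With the two example lines above, the first lookup corresponds to running with -g public and the second to running without -g.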
The available configuration settings are summarized below. They will be explained further in the following sections.
group.recordType[.name]: type
Specifies how records with the file extension name should be handled by the indexer. This option may also be specified as a command line option (-t). Note that if you do not specify a name, the setting applies to all files. In general, the record type specifier consists of the elements fundamental-type, file-read-type, and arguments, separated by dots. Currently, two fundamental types exist: text and grs.
record-id-spec
Specifies how the records are to be identified when updated. See Section 3, “Locating Records”.
database
Specifies the Z39.50 database name.
boolean
Specifies whether key information should be saved for a given group of records. If you plan to update or delete records of this type later, this should be specified as 1; otherwise it should be 0 (default), to save register space. See Section 5, “Indexing with File Record IDs”.
boolean
Specifies whether the records should be stored internally in the Zebra system files. If you want to maintain the raw records yourself, this option should be false (0). If you want Zebra to take care of the records for you, it should be true (1).
register: register-location
Specifies the location of the various register files that Zebra uses to represent your databases. See Section 7, “Register Location”.
shadow: register-location
Enables the safe update facility of Zebra, and tells the system where to place the required, temporary files. See Section 8, “Safe Updating - Using Shadow Registers”.
directory
Directory in which various lock files are stored.
directory
Directory in which temporary files used during zebraidx's update phase are stored.
directory
Specifies the directory that the server uses for temporary result sets.
If not specified, /tmp will be used.
profilePath: path
Specifies a path of profile specification files.
The path is composed of one or more directories separated by colons, similar to PATH on UNIX systems.
path
Specifies a path of record filter modules.
The path is composed of one or more directories separated by colons, similar to PATH on UNIX systems. The 'make install' procedure typically puts modules in /usr/local/lib/idzebra-2.0/modules.
filename
Defines the filename which holds field structure definitions. If omitted, the file default.idx is read. Refer to Section 1, “The default.idx file” for more information.
integer
Specifies the maximum number of records that will be sorted in a result set. If the result set contains more than integer records, records after the limit will not be sorted. If omitted, the default value is 1,000.
integer
Specifies whether static ranking is enabled (1) or disabled (0). If omitted, it is disabled, corresponding to a value of 0. Refer to Section 9.2, “Static Ranking”.
integer
Controls whether Zebra should calculate approximate hit counts and at which hit count this is to be enabled. A value of 0 disables approximate hit counts. For a positive value, approximate hit counts are enabled if the hit count is known to be larger than integer.
Approximate hit counts can also be triggered by a particular attribute in a query. Refer to Section 3.2.5, “Global Approximative Limit Attribute (type 12)”.
filename
Specifies the filename(s) of attribute set files for use in searching. In many configurations bib1.att is used, but that is not required. If Classic Explain attributes are to be used for searching, explain.att must be given. The path to att-files in general can be given using the profilePath setting. See also Section 3.4, “The Attribute Set (.att) Files”.
memMax: size
Specifies the size of internal memory to use for the zebraidx program. The amount is given in megabytes; the default is 4 (4 MB). The more memory, the faster large updates happen, up to about half the free memory available on the computer.
Yes/Auto/No
Tells Zebra whether it should use temporary files when indexing. The default is Auto, in which case Zebra uses temporary files only if it would need more than memMax megabytes of memory. This should be good for most uses.
dir
Specifies a directory base for Zebra. All relative paths given (in profilePath, register, shadow) are based on this directory. This setting is useful if your Zebra server is running in a different directory from where zebra.cfg is located.
passwd: file
Specifies a file describing the user accounts for Zebra. The format is similar to that of Apache's htpasswd files and UNIX passwd files. Non-empty lines not beginning with # are considered account lines. There is one account per line. A line consists of fields separated by a single colon character. The first field is the username, the second is the password.
file
Specifies a file describing the user accounts for Zebra. The file format is similar to that used by the passwd directive, except that the passwords are encrypted. Use Apache's htpasswd or a similar tool for maintenance.
user
:
permstring
Specifies permissions (privileges) for a user that is allowed to access Zebra via the passwd system. There are currently two kinds of permissions: read (r) and write (w). By default, users not listed in a permission directive are given the read privilege. To specify permissions for a user with no username, or Z39.50 anonymous style, use anonymous. The permstring consists of a sequence of characters. Include the character w for write/update access, r for read access, and a to allow anonymous access through this account.
accessfile
Names a file which lists database subscriptions for individual users.
The access file should consist of lines of the form username: dbnames, where dbnames is a list of database names, separated by '+'. No whitespace is allowed in the database list.
charsetname
Tells Zebra to interpret the terms in Z39.50 queries as having been encoded using the specified character encoding. The default is ISO-8859-1; one useful alternative is UTF-8.
storeKeys: value
Specifies whether Zebra keeps a copy of indexed keys. Use a value of 1 to enable; 0 to disable. If the storeKeys setting is omitted, it is enabled. Enabled storeKeys are required for updating and deleting records. Disable storeKeys only to save space, and only if you plan to index the data once.
storeData: value
Specifies whether Zebra keeps a copy of indexed records. Use a value of 1 to enable; 0 to disable. If the storeData setting is omitted, it is enabled. A storeData setting of 0 (disabled) makes Zebra fetch records from the original location in the file system using filename, file offset and file length. For the DOM and ALVIS filters, the storeData setting is ignored.
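To make the last point concrete, fetching a record by filename, file offset, and file length amounts to a seek followed by a bounded read. The Python sketch below only illustrates that idea; fetch_record is a hypothetical helper, not part of Zebra, and Zebra's internal bookkeeping is not shown.

```python
def fetch_record(filename, offset, length):
    """Read `length` bytes starting at byte `offset` of `filename`."""
    with open(filename, "rb") as f:
        f.seek(offset)          # jump to the start of the record
        return f.read(length)   # read exactly the record's bytes (or fewer at EOF)
```

One consequence of this scheme is that a record can only be retrieved correctly while the original file remains in place and unchanged at that offset.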