Extended indexing of MARC records helps when you need to index a combination of subfields, index only part of a field, or use embedded fields of a MARC record during indexing.
Extended indexing of MARC records additionally allows:
to index data in the LEADER of a MARC record
to index data in control fields (with fixed length)
to use the values of indicators during indexing
to index linked fields for UNIMARC-based formats
Compared with simple indexing, extended indexing may increase the indexing time for MARC records by a factor of about 2-3.
First, we have to define the term index-formula for MARC records. This term helps to understand the notation Zebra uses for extended indexing of MARC records. Our definition is based on the document "The table of conformity for Z39.50 use attributes and RUSMARC fields", which is available only in Russian.
The index-formula is a combination of subfields presented in this way:
71-00$a, $g, $h ($c){.$b ($c)} , (1)
Zebra supports the BIB-1 right truncation attribute. In this case, the index-formula (1) consists of the following forms, defined in the same way as (1):
71-00$a, $g, $h
71-00$a, $g
71-00$a
The original MARC record may lack some of the elements included in the index-formula.
This notation includes the following operands:
# means a whitespace character.
- means that the position may contain any value defined by the MARC format. For example, the index-formula
70-#1$a, $g , (2)
includes
700#1$a, $g
701#1$a, $g
702#1$a, $g
{} encloses repeatable elements. For example, the index-formula
71-00$a, $g, $h ($c){.$b ($c)} , (3)
includes
71-00$a, $g, $h ($c).$b ($c)
71-00$a, $g, $h ($c).$b ($c).$b ($c)
71-00$a, $g, $h ($c).$b ($c).$b ($c).$b ($c)
All other operands are the same as those accepted in the MARC world.
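The expansion of a repeatable element can be sketched in a few lines of Python. This is only an illustration of the notation, not Zebra code: the function name expand_repeats and the max_repeats limit are hypothetical, and the sketch assumes at most one {} group per formula.

```python
import re

def expand_repeats(formula, max_repeats=3):
    """Return the forms produced by repeating the {...} group 1..max_repeats times.

    Hypothetical helper for illustration; assumes a single, non-nested group.
    """
    match = re.search(r"\{([^{}]*)\}", formula)
    if match is None:
        return [formula]  # no repeatable element in this formula
    head = formula[:match.start()]
    body = match.group(1)
    tail = formula[match.end():]
    return [head + body * n + tail for n in range(1, max_repeats + 1)]

for form in expand_repeats("71-00$a, $g, $h ($c){.$b ($c)}"):
    print(form)
```

A formula without curly brackets is returned unchanged as a single form.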
Extended indexing overloads the path of the elm definition in the abstract syntax file of Zebra (.abs file). This means that names beginning with "mc-" are interpreted by Zebra as index-formulas. The database index is created and linked with an access point (BIB-1 use attribute) according to this formula.
For example, the index-formula
71-00$a, $g, $h ($c){.$b ($c)} , (4)
in the .abs file looks like:
mc-71.00_$a,_$g,_$h_(_$c_){.$b_(_$c_)}
The index-formula notation uses the following operands:
_ means a whitespace character.
. means that the position may contain any value defined by the MARC format. For example, the index-formula
70-#1$a, $g , (5)
matches mc-70._1_$a,_$g_
and includes
700_1_$a,_$g_
701_1_$a,_$g_
702_1_$a,_$g_
{} encloses repeatable elements. For example, the index-formula
71-00$a, $g, $h ($c){.$b ($c)} , (6)
matches
mc-71.00_$a,_$g,_$h_(_$c_){.$b_(_$c_)}
and includes
71.00_$a,_$g,_$h_(_$c_).$b_(_$c_)
71.00_$a,_$g,_$h_(_$c_).$b_(_$c_).$b_(_$c_)
71.00_$a,_$g,_$h_(_$c_).$b_(_$c_).$b_(_$c_).$b_(_$c_)
<> encloses an embedded index-formula (for linked fields). For example, the index-formula
4--#-$170-#1$a, $g ($c) , (7)
matches
mc-4.._._$1<70._1_$a,_$g_(_$c_)>_
and includes
463_._$1<70._1_$a,_$g_(_$c_)>_
All other operands are the same as those accepted in the MARC world.
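The "_" and "." operands can be illustrated with a small Python sketch. It assumes that the five characters following "mc-" are three tag characters followed by two indicator characters; this layout is an assumption made for the example. The function name field_matches is hypothetical and is not part of Zebra.

```python
def field_matches(pattern, tag, ind1, ind2):
    """Check a field's tag and indicators against a 5-character pattern.

    Assumed layout: 3 tag characters + 2 indicator characters.
    '.' accepts any value in that position, '_' stands for a whitespace character.
    """
    if len(pattern) != 5:
        raise ValueError("expected 3 tag characters and 2 indicator characters")
    actual = tag + ind1 + ind2
    if len(actual) != 5:
        return False
    for expected, found in zip(pattern, actual):
        if expected == ".":
            continue  # any value is allowed here
        if expected == "_":
            expected = " "  # underscore denotes whitespace
        if expected != found:
            return False
    return True

print(field_matches("70._1", "700", " ", "1"))
print(field_matches("70._1", "710", " ", "1"))
```

Under this assumption, the pattern "70._1" accepts tags 700, 701 and 702 with a blank first indicator and a second indicator of 1, and rejects tag 710.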
indexing LEADER
Use the keyword "ldr" to index the leader. For example, to index data from the 6th and 7th positions of the LEADER:
elm mc-ldr[6] Record-type !
elm mc-ldr[7] Bib-level !
indexing data from control fields
indexing the date (the time the record was added to the database)
elm mc-008[0-5] Date/time-added-to-db !
or for RUSMARC (this data is included in field 100)
elm mc-100___$a[0-7]_ Date/time-added-to-db !
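The position ranges in square brackets can be read as zero-based character offsets with both ends included, so [0-5] selects six characters. The following Python sketch shows that reading; the helper name extract_positions and the sample leader and 008 values are made up for illustration and are not taken from Zebra.

```python
import re

def extract_positions(data, spec):
    """Return the characters selected by a spec such as "[6]" or "[0-5]".

    Positions are zero-based and the range is inclusive at both ends.
    """
    match = re.fullmatch(r"\[(\d+)(?:-(\d+))?\]", spec)
    if match is None:
        raise ValueError("unsupported position spec: " + spec)
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) is not None else start
    return data[start:end + 1]

leader = "00714cam a2200205 a 4500"   # sample 24-character leader
field_008 = "850423s1985    xxu"      # sample beginning of an 008 field

print(extract_positions(leader, "[6]"))       # record type position
print(extract_positions(leader, "[7]"))       # bibliographic level position
print(extract_positions(field_008, "[0-5]"))  # date entered on file
```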
using indicators while indexing
For RUSMARC, the index-formula
70-#1$a, $g
matches
elm mc-70._1_$a,_$g_ Author !:w,!:p
When Zebra finds a field matching the "70." pattern, it checks the indicators. In this case the first indicator must be whitespace and the second one must be 1; otherwise the field is not indexed.
indexing embedded (linked) fields for UNIMARC based formats
For RUSMARC, the index-formula
4--#-$170-#1$a, $g ($c)
matches
elm mc-4.._._$1<70._1_$a,_$g_(_$c_)>_ Author !:w,!:p
Data are extracted from the record if the field matches the "4.._." pattern and the data in the linked field match the embedded index-formula 70._1_$a,_$g_(_$c_).