DBAddr

Name

DBAddr -- sets the database(s) connection string

indexer.conf search.htm

Synopsis

DBAddr {addr...}

Description

The DBAddr command defines the connection string for the database mnoGoSearch will use for indexing and searching, using URL-style notation.

It's possible to specify multiple databases in the same DBAddr command (e.g. in case of a distributed cluster configuration).

addr has the following format:


type://[user[:password]@]hostname[:port]/dbname/[?param=value[&param=value...]]

DBAddr must be used before any other commands, and has a global effect on the entire configuration file.

Type, hostname, dbname, user, password and port are the most important parts of a DBAddr value.

The main part of a DBAddr command can also be optionally followed by a number of additional parameters, like DBMode.

The main part is separated from the parameters using the QUESTION MARK character (?), and the parameters are separated from each other using the AMPERSAND SIGN character (&).
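
The structure described above can be illustrated with a short Python sketch. This is not mnoGoSearch's own parser; `parse_dbaddr` is a hypothetical helper built on the standard `urllib.parse` module, and the port number 3306 in the sample value is only an illustration.

```python
from urllib.parse import urlsplit, parse_qsl, unquote

def parse_dbaddr(addr):
    """Split a DBAddr value into its main part and its optional parameters."""
    parts = urlsplit(addr)
    return {
        "type": parts.scheme,
        # urlsplit does not decode %XX escapes in user and password, so do it here.
        "user": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
        "hostname": parts.hostname or "",   # empty string when the host is omitted
        "port": parts.port,                 # None when no port is given
        "dbname": parts.path.strip("/"),
        # Parameters follow "?" and are separated from each other by "&".
        "params": dict(parse_qsl(parts.query)),
    }

info = parse_dbaddr("mysql://foo:bar@localhost:3306/mnogosearch/?DBMode=blob&Deflate=yes")
print(info["type"], info["hostname"], info["port"], info["dbname"], info["params"])
```

The sketch keeps parameter names exactly as written; whether mnoGoSearch treats them case-insensitively is not modelled here.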

Notes on type

Currently supported type values are mysql, pgsql, mssql, sybase, oracle, ibase, db2, mimer, sqlite, sqlite3, cache.

Notes on hostname

An empty host name with PostgreSQL, for example,


pgsql://user:password@/dbname/
      
means that mnoGoSearch will communicate with the PostgreSQL server using a UNIX socket rather than a TCP port.

Hostname is not important when describing an ODBC database (on Windows, or on UNIX when mnoGoSearch is compiled with UnixODBC or iODBC support). Specifying an ODBC Data Source Name (DSN) in the dbname part is enough, so the hostname part can be either omitted or can be set to localhost:


mysql://root@/myodbc/
mysql://root@localhost/myodbc/
      

Special characters in user and password

Some special characters, if they appear in the user name or password, need to be escaped using %XX notation, where XX is a hexadecimal character code. Use %3A for ":", %3B for ";", %3C for "<", %3D for "=", %3E for ">", %3F for "?" and %40 for "@" characters. For example,


DBAddr pgsql://user:pwd%3Awith%40special%3Cchars@/search/
      
corresponds to the local PostgreSQL database search with the user name user and password "pwd:with@special<chars".
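
When a DBAddr line is generated by a script, the escaping can be automated. The following Python sketch escapes only the seven characters listed above; `escape_credential` is a hypothetical helper, and how a literal "%" in a password should be written is not covered by this section, so the sketch leaves it untouched.

```python
# Characters listed above as needing %XX escapes in the user name and password.
ESCAPES = {
    ":": "%3A", ";": "%3B", "<": "%3C", "=": "%3D",
    ">": "%3E", "?": "%3F", "@": "%40",
}

def escape_credential(text):
    """Replace each listed special character with its %XX form."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)

password = "pwd:with@special<chars"
print("DBAddr pgsql://user:" + escape_credential(password) + "@/search/")
# prints: DBAddr pgsql://user:pwd%3Awith%40special%3Cchars@/search/
```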

The DBMode parameter

The DBMode optional parameter can be set to one of the three values: single, multi or blob. The default value is blob. See the Section called Word modes with an SQL database in Chapter 7 for details.

The Socket parameter

MySQL and PostgreSQL users can specify a path to the UNIX socket when connecting to a server running on the local machine:


DBAddr mysql://foo:bar@localhost/mnogosearch/?socket=/tmp/mysql.sock
DBAddr pgsql://foo:bar@/mnogosearch/?socket=/tmp/s.PGSQL.5432
      

The SetNames parameter

MySQL and PostgreSQL users can set the connection character set by specifying the SetNames parameter. This is important if the default MySQL or PostgreSQL client character set settings do not correspond to the LocalCharset setting of mnoGoSearch.


LocalCharset utf8
DBAddr mysql://root@localhost/test/?setnames=utf8
      
If a non-empty SetNames value is specified, the MySQL and PostgreSQL drivers send a SET NAMES query after the connection is established.

The QCache parameter

When QCache is set to yes, search.cgi enables the search results cache, which improves the performance of subsequent queries for the same words, including navigation through the search result pages (e.g. viewing documents 11-20, 21-30 and so on). The search results cache also enables the search in found feature.

The ps parameter

MySQL and PostgreSQL users can specify ps=yes to tell indexer to use the prepared statement API at crawling and indexing time.

Using prepared statements for a series of similar SQL queries is usually somewhat faster than directly executing the same set of non-prepared SQL statements. However, prepared statement support for MySQL and PostgreSQL appeared in mnoGoSearch version 3.3.8 and is not yet enabled by default, for stability reasons.

Parameters affecting score (wf and nwf)

Starting from the version 3.3.0 it is possible to specify the wf=XXXX parameter for DBAddr in search.htm. Starting from the version 3.3.3, nwf=XXXX is also allowed.

These parameters are useful if you merge two or more databases and want to give more score to the search results coming from a certain database.

The DBAddr parameters take precedence over the search query parameters. For example, if wf is specified as a DBAddr parameter, then the global wf values (specified in QUERY_STRING or using wf in search.htm) are not used for that database. The format of the wf and nwf DBAddr parameters is similar to the format described in the Section called Changing weights of the different document parts at search time in Chapter 11.


DBAddr mysql://root@localhost/db1/?wf=FFFF
DBAddr mysql://root@localhost/db2/?wf=1111
DBAddr mysql://root@localhost/db3/?wf=1111
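
The precedence rule can be modelled in a few lines of Python. This is only an illustration of the rule as stated above, not mnoGoSearch code: `effective_weights` and both dictionaries are hypothetical, and applying the same rule to nwf is an assumption, since the text spells it out only for wf.

```python
def effective_weights(dbaddr_params, query_params):
    """Pick wf/nwf for one database: DBAddr values win over global query values."""
    result = {}
    for name in ("wf", "nwf"):
        if name in dbaddr_params:
            result[name] = dbaddr_params[name]   # per-database setting
        elif name in query_params:
            result[name] = query_params[name]    # global fallback
    return result

# db1 carries its own wf, so the global wf is ignored for it.
print(effective_weights({"wf": "FFFF"}, {"wf": "1111"}))
# A database without a wf parameter falls back to the global value.
print(effective_weights({}, {"wf": "1111"}))
```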
      

The MaxResults parameter

Starting from the version 3.3.0, the MaxResults=num parameter is available to specify the maximum number of search results returned from the database. It can be useful if you want to add a limited number of sponsored links at the top of the search results:


DBAddr mysql://root@localhost/avd/?wf=FFFF&MaxResults=1
DBAddr mysql://root@localhost/db1/?wf=1111
DBAddr mysql://root@localhost/db2/?wf=1111
      

MaxResults affects the value returned by the method RESULT::total_found(). The documents outside of the MaxResults range are not included in the search results.

The DebugSQL parameter

Starting from the version 3.3.3, the DebugSQL=yes/no parameter is understood. When DebugSQL is set to yes, indexer and search.cgi print all SQL queries sent to the database. mnoGoSearch must be compiled with --with-debug, otherwise DebugSQL=yes is ignored.


DBAddr mysql://root@localhost/test/?DebugSQL=yes
      

The TrackQuery parameter

Use the trackquery=yes parameter to activate search query tracking. Please refer to the Section called Tracking search queries in Chapter 11 for details.

Parameters related to DBMode=blob

Starting from the version 3.2.36, DBAddr supports the Deflate=yes|no parameter. With Deflate=yes specified, indexer compresses data when creating the fast search index with indexer -Eblob, which gives a smaller database and faster search. This option is effective for DBMode=blob only. There is no need to specify this option in the search template: search.cgi detects and handles compressed data automatically.


DBAddr mysql://foo:bar@localhost/mnogosearch/?DBMode=blob&Deflate=yes
      

Starting from the version 3.2.36, DBAddr supports the zint4=yes|no parameter. With zint4=yes specified, indexer compresses document IDs using a special compression method which we called zint4. This method is very effective for a sorted array of document IDs and compresses data by up to 85% with relatively good decompression speed. In conjunction with the Deflate=yes parameter, the compression ratio can be up to 99.8%. This option is used with DBMode=blob, for the #rec_id array only.


DBAddr mysql://foo:bar@localhost/mnogosearch/?DBMode=blob&Deflate=yes&zint4=yes
      

MySQL specific parameters

MyCnfGroup - Loading my.cnf file

When initializing a connection to MySQL, mnoGoSearch forces loading of the my.cnf configuration file from the client option group by default. Use MyCnfGroup=group to load options from another group, or MyCnfGroup=no to prevent loading of my.cnf:


# Load options from a non-default option group
DBAddr mysql://foo:bar@localhost/mnogosearch/?MyCnfGroup=mnogosearch

# Prevent loading my.cnf
DBAddr mysql://foo:bar@localhost/mnogosearch/?MyCnfGroup=no
        
When connecting to MySQL, mnoGoSearch uses the following MySQL C API call to tell the MySQL connection handler which option group to load (unless MyCnfGroup=no is specified):

mysql_options(mysql, MYSQL_READ_DEFAULT_GROUP, MyCnfGroup);
        

SQLLog - Using general log

MySQL users can specify whether to switch MySQL query logging on/off using the SQLLog parameter:


DBAddr mysql://foo:bar@localhost/mnogosearch/?sqllog=0
DBAddr mysql://foo:bar@localhost/mnogosearch/?sqllog=1
        
If the SQLLog parameter is given, then mnoGoSearch sends the SET SQL_LOG_OFF=X query after the connection is established. Only users with the MySQL SUPER privilege can use this parameter.

SQLLogBin - Using binary log

MySQL users can specify whether to do binary logging by setting the SQLLogBin parameter:


DBAddr mysql://foo:bar@localhost/mnogosearch/?sqllogbin=0
DBAddr mysql://foo:bar@localhost/mnogosearch/?sqllogbin=1
        
If the SQLLogBin parameter is given, then mnoGoSearch sends the SET SQL_LOG_BIN=X query after the connection is established. Only users with the MySQL SUPER privilege can use this parameter.

Compress - Using compression in client-server protocol


# Enable client-server compression
DBAddr mysql://foo:bar@localhost/mnogosearch/?Compress=yes

# Disable client-server compression
DBAddr mysql://foo:bar@localhost/mnogosearch/?Compress=no
        

If Compress=yes is specified, then mnoGoSearch uses the following MySQL C API call to activate client-server compression:


mysql_options(mysql, MYSQL_OPT_COMPRESS, 0);
        

If Compress=no is specified, or Compress is omitted, then this call is not made and therefore no compression happens.

Compression improves crawling, indexing and search performance when connecting to a remote MySQL server. It is not recommended to use compression with localhost.

MultiInsert - Using multiple INSERT syntax


# Enable multiple INSERT
DBAddr mysql://foo:bar@localhost/mnogosearch/?MultiInsert=yes

# Disable multiple INSERT
DBAddr mysql://foo:bar@localhost/mnogosearch/?MultiInsert=no
        

If MultiInsert=yes is specified, then indexer -Eblob uses multiple INSERT syntax when running with a MySQL database, for example:


INSERT INTO t1 VALUES (100,1,'...'),(101,1,'...'),(102,1,'...'),...,(200,1,'...')
        

If MultiInsert=no is specified, or MultiInsert is omitted, then multiple INSERT syntax is not used.

MultiInsert=yes mode improves performance.
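
The idea behind multiple INSERT syntax can be sketched in Python: many rows travel to the server in one statement instead of one statement per row. This is not how indexer is implemented; `build_multi_insert`, the column layout of `t1`, and the use of an in-memory SQLite database as a stand-in for MySQL are all assumptions made for the sake of a runnable example.

```python
import sqlite3

def build_multi_insert(table, rows):
    """Build one INSERT statement carrying all rows, plus its flat parameter list."""
    if not rows:
        raise ValueError("at least one row is required")
    width = len(rows[0])
    row_placeholder = "(" + ",".join(["?"] * width) + ")"
    sql = "INSERT INTO " + table + " VALUES " + ",".join([row_placeholder] * len(rows))
    params = [value for row in rows for value in row]
    return sql, params

rows = [(100, 1, "a"), (101, 1, "b"), (102, 1, "c")]
sql, params = build_multi_insert("t1", rows)
print(sql)  # INSERT INTO t1 VALUES (?,?,?),(?,?,?),(?,?,?)

# SQLite stands in for MySQL here; the table layout is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, flag INTEGER, data TEXT)")
conn.execute(sql, params)   # a single statement inserts all three rows
print(conn.execute("SELECT COUNT(*) FROM t1").fetchone()[0])  # 3
```

The "?" placeholder style belongs to Python's sqlite3 module; MySQL client libraries typically use a different placeholder syntax.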

Examples


DBAddr mysql://foo:bar@localhost/mnogosearch/?DBMode=multi
DBAddr mysql://foo:pwd%3Awith%40special%3Cchars@localhost/mnogosearch/