CollectLinks

Name

CollectLinks -- defines what kind of links between documents should be stored in the database (e.g. for popularity rank).

indexer.conf

Synopsis

CollectLinks {all | yes | no | inner | outer | site | page | badscheme | bad | hops | filter | persite}...

Description

CollectLinks defines what kind of links between documents should be stored in the database. This information can be used to calculate popularity rank, as well as for SEO purposes.

Multiple arguments are possible in the same command.
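As a minimal sketch of the multi-argument form, the line below combines two values taken from the Synopsis. The choice of inner and outer is only illustrative, and what each value selects is not covered by this example:

```conf
# Two argument values given in a single CollectLinks command
CollectLinks inner outer
```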

The following argument values are understood:

Using popularity rank is described in detail in the Section called Popularity in Chapter 11.

mnoGoSearch versions prior to 3.3.0 implicitly collected links between all crawled documents. Starting with version 3.3.0, the default behavior was changed to skip link collection in order to improve crawling performance. As a side effect, popularity rank cannot be calculated in the default configuration. If popularity rank is important for your installation, specify CollectLinks yes in indexer.conf.

The links table

Link information is stored in the 'links' table of the mnoGoSearch database, which has the following structure:


CREATE TABLE links (
  url_id int(11) NOT NULL,
  weight float NOT NULL,
  url text NOT NULL,
  src varchar(10) NOT NULL,
  rel varchar(32) NOT NULL,
  linktext text NOT NULL,
  KEY url_id (url_id)
);

Note: The structure can vary slightly depending on the database backend in use.

The src field stores information about the link source, where the link came from:

The rel field stores the value of the rel attribute of the link tag, e.g.:


<link rel="canonical" href="http://www.site.com/"/>

Scope

CollectLinks affects all Server and Realm commands until the end of the configuration file, or until the next CollectLinks command.
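The scoping rule can be illustrated with a short indexer.conf fragment. This is a sketch only: the host names are hypothetical, and it assumes that the no value switches link collection off for the commands that follow it.

```conf
# Applies to the Server command below
CollectLinks yes
Server http://www.example.com/

# A new CollectLinks command replaces the previous setting
# for every Server and Realm command that follows it
# (assumption: "no" disables link collection)
CollectLinks no
Server http://docs.example.com/
```

With this layout, only links found while crawling the first server would be stored, because the second CollectLinks command takes over from that point to the end of the file.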

Examples


CollectLinks yes
      

See also

FollowLinks, ServerWeight.