Run-time configuration and usage

Configuration

The indexer configuration is documented mostly in the indexer.conf-dist file, which is included in the sources. Please read the indexer.1 and indexer.conf.5 man pages, then take a look at etc/indexer.conf-dist and the *.conf samples in the doc/samples directory of the distribution.

Some examples of indexer.conf file:
  • Minimal indexer.conf file
  • indexer.conf for 'link validation' mode
  • indexer.conf for 'ftpsearch' mode

To configure the search frontends (search.cgi, search.php3), edit the search.htm HTML template file in the etc directory of your mnoGoSearch installation. See doc/templates.txt for a detailed description. You can also take a look at the various *.htm files in the doc/samples directory.

Some examples of search.htm file:
  • Template suitable for search through MySQL site
  • Sample from Udmurtia search engine

We would like people to send us their indexer.conf and search.htm files. We will publish them here and include them in the distribution.

    Running indexer

    Just run indexer once a week (or once a day, once an hour, and so on) to pick up the latest modifications to your web sites. indexer will reindex expired documents:

    	sh$ indexer
    
    If you want to reindex all documents, regardless of whether they have expired, use the -a option. indexer also has the -t, -u and -s options to limit indexing to only a part of the database.

    To clear the whole database, use indexer -C. You can also clear the database only partially by using the -t, -u and -s options.

    Run indexer -S to view database statistics, including the total and expired document counts for each status. The -t, -u and -s filters can be used in this mode as well.

    The meaning of status is:

    • 0 - new (not yet indexed) URL

    If the status is not 0, it is the HTTP response code. Some of the HTTP codes are:
  • 200 - "OK" (URL is successfully indexed)
  • 301 - "Moved Permanently" (redirect to another URL)
  • 302 - "Moved Temporarily" (redirect to another URL)
  • 303 - "See Other" (redirect to another URL)
  • 304 - "Not modified" (URL is not modified since last indexing)
  • 401 - "Authorization required" (use login/password for given URL)
  • 403 - "Forbidden" (you have no access to this URL)
  • 404 - "Not found" (there were references to URLs that do not exist)
  • 500 - "Internal Server Error" (error in CGI, etc)
  • 503 - "Service Unavailable" (host is down, connection timeout)
  • 504 - "Gateway Timeout" (read timeout when retrieving document)

    If mnoGoSearch finds a URL with HTTP code 301, 302 or 303, it will instead index the URL given in the "Location:" field of the HTTP header (for example, "Location: http://www.somewhere.com"). This feature is called redirection.
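
The status values and the redirection rule above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not mnoGoSearch code: the names STATUS_MEANINGS, describe_status and url_to_index are hypothetical, and the response headers are assumed to be available as a plain dictionary with a "Location" key.

```python
# Status meanings copied from the list above.
# 0 is the indexer's own "new" marker; everything else is an HTTP code.
STATUS_MEANINGS = {
    0: "new (not indexed yet) URL",
    200: "OK",
    301: "Moved Permanently",
    302: "Moved Temporarily",
    303: "See Other",
    304: "Not modified",
    401: "Authorization required",
    403: "Forbidden",
    404: "Not found",
    500: "Internal Server Error",
    503: "Service Unavailable",
    504: "Gateway Timeout",
}

# Codes that trigger redirection according to the text.
REDIRECT_CODES = {301, 302, 303}


def describe_status(status):
    """Return a short description for a status value from the url table."""
    return STATUS_MEANINGS.get(status, "other HTTP response code")


def url_to_index(url, status, headers):
    """Return the URL whose content should be indexed for this response.

    `headers` is assumed to be a dict of response headers (hypothetical shape).
    """
    if status in REDIRECT_CODES and "Location" in headers:
        return headers["Location"]
    return url
```

For a 301 response carrying "Location: http://www.somewhere.com", url_to_index returns http://www.somewhere.com; for a 200 response it returns the original URL unchanged.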

    HTTP 401 means that this URL is password protected. You can use the AuthBasic command in indexer.conf to set the login:password for this URL or these URLs.

    HTTP 404 means that you have an incorrect reference in your document (a reference to a resource that does not exist). Check the referrer field in the url table. You can also check such referrers with indexer -I -s 404.

    If you have a bad connection to the HTTP server, you can run several indexer processes simultaneously with the same indexer.conf file. We have successfully tested 30 simultaneous indexer processes.

    Notes on running several indexer processes at the same time:

    • You can run several indexer processes with different configuration files on different MySQL databases.
    • It is not recommended to use the same MySQL database with different indexer.conf files! The first process could add something that the second then deletes, and this cycle would never stop.

    You can also run indexer from a crontab job.
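
The invocations described in this section can be collected into a small wrapper, for example one that a scheduled job calls. The sketch below is an assumption-laden illustration: the mode names and the functions indexer_argv and run_indexer are hypothetical, only the flags named in the text (-a, -C, -S, -I, -s) are used, and the binary is assumed to be reachable as "indexer" on the PATH.

```python
import subprocess

# Flags are the ones named in the text; the mode names are invented here.
MODE_FLAGS = {
    "reindex_expired": [],   # plain `indexer`
    "reindex_all": ["-a"],   # reindex everything, expired or not
    "clear": ["-C"],         # clear the database
    "statistics": ["-S"],    # show database statistics
    "referrers": ["-I"],     # as in `indexer -I -s 404`
}


def indexer_argv(mode, status=None, binary="indexer"):
    """Build the argument list for one indexer run.

    `status` optionally limits the run to one status value via -s.
    """
    argv = [binary] + MODE_FLAGS[mode]
    if status is not None:
        argv += ["-s", str(status)]
    return argv


def run_indexer(mode, status=None):
    """Run indexer and return its exit code (requires indexer to be installed)."""
    return subprocess.run(indexer_argv(mode, status)).returncode
```

Here indexer_argv("referrers", status=404) builds the same command line as indexer -I -s 404 from the text. Keeping the argument building separate from the actual run makes it possible to check the command line without touching the database.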


    Performing search

    Open search.cgi in your browser:
    http://your.web.server/path/to/search.cgi

    Or, if you prefer PHP3:
    http://your.web.server/path/to/search.php3
    if you have a handler for php3 documents in your HTTP server configuration, or, if you have PHP3 as CGI:
    http://your.web.server/cgi-bin/php.cgi/path/to/search.php3

    To find something, just type the words you want to find and press the SUBMIT button. For example: mysql odbc. mnoGoSearch will find all documents containing the word "mysql" or the word "odbc". The best-matching documents are displayed first.

    If you want more advanced searches and use PHP, you can use the query language. search.cgi does not have advanced search yet; it is on the TODO list. search.php3 understands the following commands:

    & - logical AND. For example, "mysql & odbc". mnoGoSearch will find any URLs that contain both "mysql" and "odbc".

    | - logical OR. For example, "mysql|odbc". This is the same as "mysql odbc": a space " " is equal to "|". mnoGoSearch will find any URLs that contain the word "mysql" or the word "odbc".

    ~ - logical NOT. For example, "mysql & ~odbc". mnoGoSearch will find URLs that contain the word "mysql" and at the same time do not contain the word "odbc". Note that ~ only excludes a word from the result. The query "~odbc" will find nothing! To make searching very quick, mnoGoSearch composes the WHERE condition of the MySQL query from all of the words in the search query: .... WHERE word in ('all','of','the','words','have','been','typed')

    () - grouping, used to compose more complex queries. For example, "(mysql | msql) & ~postgresql". The query language is simple (and powerful). Just treat the query as an ordinary logical expression.
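
To make the semantics of the query language concrete, here is a minimal Python sketch of an evaluator for it. It is not the search.php3 implementation. It assumes the usual precedence (~ binds tightest, then &, then |), treats a space between two operands as |, and imitates the WHERE word in (...) step by only considering documents that contain at least one word of the query, which is one way to read why "~odbc" finds nothing. All names (tokenize, parse, matches, search, SAMPLE_DOCS) are hypothetical.

```python
import re

OPERATORS = {"&", "|", "~", "(", ")"}
TOKEN_RE = re.compile(r"[&|~()]|[^\s&|~()]+")


def tokenize(query):
    """Split a query into words and operators; a space between operands becomes '|'."""
    tokens = []
    for tok in TOKEN_RE.findall(query.lower()):
        if tokens:
            prev = tokens[-1]
            prev_ends_operand = prev == ")" or prev not in OPERATORS
            tok_starts_operand = tok in ("(", "~") or tok not in OPERATORS
            if prev_ends_operand and tok_starts_operand:
                tokens.append("|")  # implicit OR
        tokens.append(tok)
    return tokens


def parse(tokens):
    """Recursive-descent parser returning a nested tuple expression tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        node = parse_and()
        while peek() == "|":
            take()
            node = ("or", node, parse_and())
        return node

    def parse_and():
        node = parse_not()
        while peek() == "&":
            take()
            node = ("and", node, parse_not())
        return node

    def parse_not():
        if peek() == "~":
            take()
            return ("not", parse_not())
        if peek() == "(":
            take()
            node = parse_or()
            if peek() != ")":
                raise ValueError("missing ')'")
            take()
            return node
        if peek() is None or peek() in OPERATORS:
            raise ValueError("expected a word")
        return ("word", take())

    tree = parse_or()
    if peek() is not None:
        raise ValueError("unexpected token: " + peek())
    return tree


def matches(node, words):
    """Evaluate the expression tree against the set of words of one document."""
    kind = node[0]
    if kind == "word":
        return node[1] in words
    if kind == "not":
        return not matches(node[1], words)
    if kind == "and":
        return matches(node[1], words) and matches(node[2], words)
    return matches(node[1], words) or matches(node[2], words)


def search(query, documents):
    """Return the sorted URLs of documents (url -> set of words) matching the query."""
    tokens = tokenize(query)
    tree = parse(tokens)
    query_words = {t for t in tokens if t not in OPERATORS}
    # Candidate step, imitating WHERE word in (...): at least one query word present.
    return sorted(url for url, words in documents.items()
                  if words & query_words and matches(tree, words))


# Made-up sample data for trying the queries from the text.
SAMPLE_DOCS = {
    "a.html": {"mysql", "odbc"},
    "b.html": {"mysql"},
    "c.html": {"odbc", "postgresql"},
    "d.html": {"msql", "postgresql"},
    "e.html": {"apache"},
}
```

With this sample data, search("mysql & ~odbc", SAMPLE_DOCS) returns ["b.html"], search("(mysql | msql) & ~postgresql", SAMPLE_DOCS) returns ["a.html", "b.html"], and search("~odbc", SAMPLE_DOCS) returns an empty list, because the only candidate documents are the ones that contain "odbc" and those are then excluded.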


    Bugs

    Bug Reporting

    If you think you've found a bug in mnoGoSearch, you can report it to the mnoGoSearch Developers Team.

    When reporting a bug, please fully specify your platform, MySQL version, and PHP version (if you use it). Database statistics (the count of records in all tables) and the contents of your indexer.conf file would also be helpful.




    Copyright © 2000-2015 Lavtech.Com Corp.