Indexing into SOLR is controlled by an indexing daemon: aidxd. This daemon probes PostgreSQL for available load-id(s) to index. This "queue" is represented by the table reporting.t_client_index_process. See Data Warehouse Design for more information on the structure of this table. When processing is successfully completed into PostgreSQL, apgupd registers a new, index-ready load-id. The indexing daemon aidxd recognizes this as an available load-id and begins the indexing process for that particular load-id.

aidxd is installed as part of the CLAIMS Direct repository. Please see the Client Tools Installation Instructions for more information about how to install this tool.
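The probe-and-index cycle described above can be pictured with a short shell sketch. It only illustrates the documented behavior and is not the actual aidxd implementation: `probe_queue` and `index_load` are hypothetical stand-ins for the query against reporting.t_client_index_process and for the per-load-id indexing step, and `poll_queue` is an illustrative name.

```shell
# Sketch of the probe-and-index cycle; NOT the real aidxd code.

# Hypothetical stand-in: would print the index-ready load-ids found in
# reporting.t_client_index_process, one per line. Prints nothing here.
probe_queue() {
  :
}

# Hypothetical stand-in for indexing a single load-id into SOLR.
index_load() {
  echo "indexing load-id $1"
}

# poll_queue INTERVAL [once]
# Probes the queue every INTERVAL seconds. With "once" as the second
# argument, returns after the first load-id has been processed.
poll_queue() {
  interval=$1
  mode=${2:-}
  while :; do
    for load_id in $(probe_queue); do
      index_load "$load_id" || return 1
      if [ "$mode" = "once" ]; then
        return 0
      fi
    done
    sleep "$interval"
  done
}
```

The two arguments mirror the documented --interval and --once options. Without `once`, the loop runs until it is interrupted, much as a daemon would.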
Note |
---|
If you have chosen to deploy SOLR as Type 3,
|
Usage
    aidxd [ Options ... ]
      --nodaemon        don't put process into background
      --once            only process one load-id
      --pidfile=s       specify location of PIDFILE
                        (default=/var/log/alexandria/aidxd.pid)
      --interval=i      n-seconds between probing for new loads
      --tmp=dir         specify temporary file storage (default=/tmp)
      --clean           remove temporary processing directory
      --batchsize=i     maximum number of documents to parallelize
      --nthreads=i      maximum number of processes to parallelize
      --facility=s      logging facility (default=aidxd)
      --help            print this usage and exit
      --------
      --idxversion=     21
      --idxcls=s        Alexandria::DWH::Index::DocumentEx
      --idxexe=s        specify indexing script (default aidx)
      --quiet           suppress output from sub-process
                        NOTE: suppressing this output will make it
                        difficult to track down errors originating
                        in --idxexe
      --solrdbname=s    base url for index (default=alexandria)
      --core=s          index core (default=alexandria)
      --autooptimize    issue an 'optimize' call to SOLR after
                        optinterval continuous load-id(s)
      --optinterval     # of load-id(s) after which an optimize
                        is issued (default=100)
      --optsegments=n   optimize down to n-segments (default=16)
      --nostatistics    do not gather indexing statistics
Options
Argument | Description | Default Value |
---|---|---|
--nodaemon --once | When specified, aidxd will run in the foreground. If --once is given, --nodaemon is implied and only one load-id will be processed. | N/A |
--interval | Time (in seconds) between successive indexing queue probes | 10 |
--tmp | Temporary processing area which holds the transformed XML before it is POSTed to SOLR | /tmp |
--batchsize | Number of documents to extract for indexing | 250 |
--nthreads | Number of parallel extraction processes. This value should be adjusted depending on available processing power on the PostgreSQL data warehouse server. A rule of thumb would be to set this to the number of cores. | 8 |
--idxversion | The version of the index | 21 |
--idxcls | The indexing class used in XML transformation | Alexandria::DWH::Index::DocumentEx |
--solrdbname --core | Base URL for indexing. If different from the default, it should have an index entry in /etc/alexandria.xml | alexandria |
--autooptimize | DO NOT USE | |
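As a rough illustration of the --nthreads rule of thumb, the following sketch assembles a command line from the core count of the PostgreSQL data warehouse server. The helper name `aidxd_cmd` is hypothetical. The flags are the documented ones, --batchsize and --interval are simply set to their documented defaults, and the core count has to be supplied by the caller because the warehouse server may not be the machine running aidxd.

```shell
# aidxd_cmd CORES
# Prints (does not run) an aidxd command line with --nthreads set to
# CORES, the number of cores on the PostgreSQL data warehouse server.
# Hypothetical helper; expects INSTALL_BASE to be set by the caller.
aidxd_cmd() {
  cores=$1
  printf '%s %s %s %s %s %s\n' \
    "${INSTALL_BASE}/bin/aidxd" \
    "--idxversion=21" \
    "--idxcls=Alexandria::DWH::Index::DocumentEx" \
    "--nthreads=${cores}" \
    "--batchsize=250" \
    "--interval=10"
}
```

Calling `aidxd_cmd 16` prints the command for a 16-core warehouse server, so it can be reviewed before it is run.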
Daemon Execution
Starting
    # v2.1: all defaults
    $INSTALL_BASE/bin/aidxd --idxversion=21 --idxcls=Alexandria::DWH::Index::DocumentEx

    # v2.1: Only process one load-id
    $INSTALL_BASE/bin/aidxd --idxversion=21 --idxcls=Alexandria::DWH::Index::DocumentEx --once
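After starting the daemon in the background, it can be useful to confirm that it is still alive. The sketch below is one possible check, under two assumptions: the PIDFILE contains only the daemon's process id, and the calling user is permitted to signal that process. The function name `aidxd_running` is hypothetical. The default path is the documented --pidfile default.

```shell
# aidxd_running [PIDFILE]
# Succeeds if the PID recorded in PIDFILE belongs to a live process.
# Assumes the file holds a single PID and that the caller may signal
# that process; otherwise kill -0 fails even if the process exists.
aidxd_running() {
  pidfile=${1:-/var/log/alexandria/aidxd.pid}
  [ -r "$pidfile" ] || return 1
  pid=$(cat "$pidfile")
  [ -n "$pid" ] || return 1
  kill -0 "$pid" 2>/dev/null
}
```

A typical use would be `aidxd_running && echo "aidxd is up"`, with a different path passed as the first argument if --pidfile was overridden.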
...