Indexing into Solr is controlled by an indexing daemon: aidxd. This daemon probes PostgreSQL for available load-id(s) to index. This "queue" is represented by the table reporting.t_client_index_process. See Data Warehouse Design for more information on the structure of this table. When processing into PostgreSQL completes successfully, apgupd registers a new, index-ready load-id. aidxd recognizes this as an available load-id and begins indexing that particular load-id.
aidxd is installed as part of the CLAIMS Direct Client Tools. Please see the Client Tools Installation Instructions for more information about how to install this tool.
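To see what the daemon would find when it probes the queue, the table can be inspected directly. The following is a minimal sketch, not part of the Client Tools: the helper name queue_peek_sql is hypothetical, and it assumes psql is installed and already configured (for example through PGDATABASE) to reach the data warehouse. Only the table name comes from the text above; no column names are assumed.

```shell
# Hypothetical helper: build a read-only query against the indexing queue table.
# Only the table name is taken from the documentation; columns are not assumed.
queue_peek_sql() {
    local limit="${1:-10}"   # number of rows to show (default 10)
    printf 'SELECT * FROM reporting.t_client_index_process LIMIT %d;\n' "$limit"
}

# Example (assumes psql connection settings are already in the environment):
#   psql -c "$(queue_peek_sql 5)"
```

See Data Warehouse Design for the meaning of the columns returned.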
Note |
---|
If you have chosen to deploy Solr as Type 3, --core must be specified to match your subscription level:
- Basic: --core=alexandria-standard
- Premium: --core=alexandria-premium
- Premium-Plus: --core=alexandria-premium-plus
|
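The mapping in the note above can be captured in a small shell function so that start-up scripts do not hard-code the core name. This is only a sketch: core_for_level and its lowercase level names are hypothetical, while the --core values are the ones listed in the note.

```shell
# Hypothetical helper: map a subscription level to the matching --core option.
core_for_level() {
    case "$1" in
        basic)        echo "--core=alexandria-standard" ;;
        premium)      echo "--core=alexandria-premium" ;;
        premium-plus) echo "--core=alexandria-premium-plus" ;;
        *)            echo "unknown subscription level: $1" >&2; return 1 ;;
    esac
}

# Example:
#   $INSTALL_BASE/bin/aidxd $(core_for_level premium) --once
```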
Usage
Code Block |
---|
|
aidxd [ Options ... ]
--nodaemon don't put process into background
--once only process one load-id
--pidfile=s specify location of PIDFILE
(default=/var/log/alexandria/aidxd.pid)
--interval=i n-seconds between probing for new loads
--tmp=dir specify temporary file storage (default=/tmp)
--clean remove temporary processing directory
--batchsize=i maximum number of documents to parallelize
--nthreads=i maximum number of processes to parallelize
--facility=s logging facility (default=aidxd)
--help print this usage and exit
--------
--idxversion= 21
--idxcls=s Alexandria::DWH::Index::DocumentEx
--dbfunc=s specify an alternative extraction function (default=xml.f_patent_document_s)
--idxexe=s specify indexing script (default aidx)
--quiet suppress output from sub-process
NOTE: suppressing this output will make it difficult
to track down errors originating in --idxexe
--pgdbname=s source postgresql instance as defined in /etc/alexandria.xml
--solrdbname=s base url for index (default=alexandria)
--core=s index core (default=alexandria)
--tolerate tolerate indexing errors and attempt again
--autooptimize issue an 'optimize' call to Solr after optinterval
continuous load-id(s)
--optinterval # of load-id(s) after which an optimize is issued
(default=100)
--optsegments=n optimize down to n-segments (default=16)
--nostatistics do not gather indexing statistics |
Options
Argument | Description | Default Value |
---|---|---|
--nodaemon, --once | When --nodaemon is specified, aidxd runs in the foreground. If --once is given, --nodaemon is implied and only one load-id is processed. | N/A |
--interval | Time (in seconds) between successive probes of the indexing queue | 10 |
--tmp | Temporary processing area that holds the transformed XML before it is POSTed to Solr | /tmp |
--batchsize | Number of documents to extract for indexing | 250 |
--nthreads | Number of parallel extraction processes. Adjust this value to the processing power available on the PostgreSQL data warehouse server; as a rule of thumb, set it to the number of cores. | 8 |
--idxversion | The version of the index | 21 |
--idxcls | The indexing class used in XML transformation | Alexandria::DWH::Index::DocumentEx |
--dbfunc | Specify an alternative extraction function | xml.f_patent_document_s |
--pgdbname | Source PostgreSQL instance as defined in /etc/alexandria.xml | alexandria |
--solrdbname, --core | Base URL for indexing. If different from the default, it should have an index entry in /etc/alexandria.xml. | alexandria |
--tolerate | (v2.6-1) Tolerate a wide variety of errors and retry the failed indexing attempt | N/A |
--autooptimize | DO NOT USE | N/A |
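As an illustration of how these options combine, the sketch below prints a command line for a single foreground run with --nthreads set from the core count. The function name aidxd_once_cmd is hypothetical. It assumes nproc (GNU coreutils) is available; nproc reports the cores of the local host, whereas the rule of thumb refers to the PostgreSQL server, so pass the value explicitly if aidxd runs on a different machine.

```shell
# Hypothetical helper: print (not run) an aidxd command line for one load-id.
# Argument 1: number of extraction processes; defaults to the local core count.
aidxd_once_cmd() {
    local nthreads="${1:-$(nproc)}"
    # --batchsize and --tmp are shown with their documented defaults.
    echo "aidxd --once --nthreads=${nthreads} --batchsize=250 --tmp=/tmp"
}

# Example: review the command, then run it by hand.
#   aidxd_once_cmd 16
```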
Daemon Execution
Starting
Code Block |
---|
|
# v2.1: all defaults
$INSTALL_BASE/bin/aidxd --idxversion=21 --idxcls=Alexandria::DWH::Index::DocumentEx
# v2.1: Only process one load-id
$INSTALL_BASE/bin/aidxd --idxversion=21 --idxcls=Alexandria::DWH::Index::DocumentEx --once |
Pausing/Resuming/Stopping
Code Block |
---|
|
# pause the daemon
kill -s USR1 <pid>
# resume processing
kill -s USR2 <pid>
# stop daemon entirely
kill -s INT <pid> |
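The signals above can be wrapped so that the PID is read from the PID file instead of being looked up by hand. This is a sketch under stated assumptions: aidxd_signal_for and aidxd_ctl are hypothetical names, the PID file is assumed to sit at the default location given in the usage text, and it is assumed to contain only the process ID.

```shell
# Default PID file location taken from the usage text; override via PIDFILE.
PIDFILE="${PIDFILE:-/var/log/alexandria/aidxd.pid}"

# Hypothetical helper: translate an action into the signal documented above.
aidxd_signal_for() {
    case "$1" in
        pause)  echo USR1 ;;
        resume) echo USR2 ;;
        stop)   echo INT ;;
        *)      echo "usage: pause|resume|stop" >&2; return 1 ;;
    esac
}

# Hypothetical helper: send the signal to the PID stored in $PIDFILE.
# Assumes the file contains only the process ID.
aidxd_ctl() {
    local sig
    sig="$(aidxd_signal_for "$1")" || return 1
    [ -r "$PIDFILE" ] || { echo "no readable PID file at $PIDFILE" >&2; return 1; }
    kill -s "$sig" "$(cat "$PIDFILE")"
}

# Example:
#   aidxd_ctl pause
#   aidxd_ctl resume
```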