...

aext [ Options ... ]
  --pgdbname=s      Database configuration name (default=alexandria)
  --solrurl=s       Solr index url (default=http://solr.alexandria.com:8080/alexandria-index/alexandria)
  --loadid=i        modified-load-id to extract
  --table=s         extract from table
  --sqlq=s          extract from SQL statement
  --solrq=s         extract from Solr query
  --root=s          directory to deposit output file(s) or into which files will be archived
  --prefix=s        prefix for output files (default=batch)
  --archive         archive data into predictable path structure
  --nthreads=i      number of parallel processes (default=4)
  --batchsize=i     number of documents per process (default=500)
  --dbfunc=s        specific user-defined database function

...

Parameter    Description
pgdbname     The database entry, as configured in /etc/alexandria.xml, that points to the on-site CLAIMS Direct PostgreSQL instance. The default value is alexandria, which is pre-configured in /etc/alexandria.xml.
solrurl      Available only with the optional on-site Solr installation. This is the URL of the standalone CLAIMS Direct Solr instance or, if one is used, the URL of the load balancer. Although there is a default value, this parameter is mandatory if you specify --solrq.
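
The --solrurl rule above can be checked before aext is launched. The following is a minimal sketch in Python, not part of CLAIMS Direct: the helper name build_aext_args is hypothetical, it uses only flags listed in the usage text, and it assembles an argument list without running anything.

```python
from typing import List, Optional


def build_aext_args(pgdbname: str = "alexandria",
                    solrurl: Optional[str] = None,
                    solrq: Optional[str] = None) -> List[str]:
    """Assemble an aext argument list (hypothetical helper; nothing is executed)."""
    args = ["aext", f"--pgdbname={pgdbname}"]
    if solrq is not None:
        # Per the parameter description: when --solrq is used,
        # --solrurl must be supplied explicitly despite its default.
        if not solrurl:
            raise ValueError("--solrurl is required when --solrq is used")
        args += [f"--solrurl={solrurl}", f"--solrq={solrq}"]
    return args
```

On a host where aext is installed, the returned list could be passed to subprocess.run. Failing early in the wrapper avoids starting an extract that silently targets the default Solr URL.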

...

Parameter    Description
loadid       The modified_load_id in the table xml.t_patent_document_values. See the content updates documentation, which describes the various load-ids.
table        The name of a user-created table that contains, at minimum, a publication_id column.
sqlq         Any raw SQL that returns one or more publication_id values.
solrq        Any raw Solr query.
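
To illustrate how the four selection parameters map onto command-line arguments, here is a small Python sketch. It assumes the =i and =s suffixes in the usage text mean integer and string values, as in common Getopt-style option specs. The helper selector_arg and the table name my_schema.my_publications are made up for illustration.

```python
from typing import List

# Selector flags taken from the aext usage text. The expected value type is
# an assumption based on the usage suffix (=i integer, =s string).
SELECTORS = {"loadid": int, "table": str, "sqlq": str, "solrq": str}


def selector_arg(kind: str, value) -> str:
    """Return one '--flag=value' selector argument (hypothetical helper)."""
    if kind not in SELECTORS:
        raise ValueError(f"unknown selector: {kind}")
    expected = SELECTORS[kind]
    if not isinstance(value, expected):
        raise TypeError(f"--{kind} expects a value of type {expected.__name__}")
    return f"--{kind}={value}"


examples: List[str] = [
    selector_arg("loadid", 261358),
    selector_arg("sqlq", "SELECT t.publication_id FROM xml.t_patent_document_values AS t "
                         "WHERE t.modified_load_id=261358"),
    selector_arg("solrq", "loadid:261358"),
    # Made-up table name; the table must contain a publication_id column.
    selector_arg("table", "my_schema.my_publications"),
]
```

Each entry in examples is a single argument string that could be appended to an aext argument list. The sketch does not check how aext behaves when several selectors are combined, since the source does not describe that.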

Extract Naming and Destination

...

aext --sqlq="SELECT t.publication_id from xml.t_patent_document_values as t where t.modified_load_id=261358" \
     --archive \
     --root=/tmp

Extracting Using

...

Solr

If the optional CLAIMS Direct Solr instance is installed, the power of Solr can be used to search, filter, and extract documents. This example pulls the same set of documents as above using Solr query syntax.

aext --solrurl=http://SOLR-INSTANCE-URL/alexandria-v2.1/alexandria --archive --solrq='loadid:261358'

[aindex01] [2017/04/06 04:17:11] [DEBUG     ] [preparing extract ...]
[aindex01] [2017/04/06 04:17:11] [DEBUG     ] [creating t_tmp_000000000000_001491466631 ... ]
[aindex01] [2017/04/06 04:17:11] [DEBUG     ] [querying SOLR (http://SOLR-INSTANCE-URL/alexandria-v2.1/alexandria { loadid:261358 })]
[aindex01] [2017/04/06 04:17:12] [DEBUG     ] [running extract ...]
[aindex01] [2017/04/06 04:17:27] [DEBUG     ] [finalizing extract ...]
[aindex01] [2017/04/06 04:17:27] [INFO      ] [extract complete: { 4613 documents across 10 batches in 15.643s (294.894/s) }]
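
The summary line in the log above can be parsed and sanity-checked. The Python sketch below is illustrative only: the regular expression is written against this one sample line, and the relationship batches = ceil(documents / batchsize) is an inference that happens to match the sample (the command above does not set --batchsize, so the default of 500 applies). It is not documented aext behavior.

```python
import math
import re

# Summary line copied from the sample log output.
SUMMARY = ("[aindex01] [2017/04/06 04:17:27] [INFO      ] [extract complete: "
           "{ 4613 documents across 10 batches in 15.643s (294.894/s) }]")

# Pattern written for the sample line; other aext versions may log differently.
PATTERN = re.compile(
    r"extract complete: \{ (\d+) documents across (\d+) batches "
    r"in ([\d.]+)s \(([\d.]+)/s\) \}"
)


def parse_summary(line: str) -> dict:
    """Extract document count, batch count, duration, and rate from a summary line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not an extract summary line")
    return {
        "documents": int(m.group(1)),
        "batches": int(m.group(2)),
        "seconds": float(m.group(3)),
        "rate": float(m.group(4)),
    }


stats = parse_summary(SUMMARY)
# Assumed relationship: one batch per 500 documents (default --batchsize), rounded up.
expected_batches = math.ceil(stats["documents"] / 500)
# Throughput recomputed from the logged count and (rounded) duration.
computed_rate = stats["documents"] / stats["seconds"]
```

For the sample line, expected_batches agrees with the logged batch count, and computed_rate agrees with the logged rate to within rounding of the reported duration.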

...