
Processes: The main executable script used for indexing is aidx, delivered as part of Alexandria::Library. This script pulls source data, converts it into SOLR XML, and submits it via HTTP POST to SOLR for indexing. The conversion from CLAIMS Direct XML to SOLR XML is handled by the indexer class (default: Alexandria::DWH::Index::Document). Alexandria::Client::Tools also provides an indexing daemon, aidxd, which monitors an index process queue.…
Re-indexing Data from CLAIMS Direct Data Warehouse
Introduction: There are a number of reasons one might need to re-index data from the data warehouse. These range from simply repeating a load-id to a complete re-index of the entire contents of the data warehouse. In this blog, I'm going to go over the mechanisms that move data from the data warehouse to the index and the ways in which these mechanisms can be used to trigger partial or full re-indexing of data. Background: In all installations of CLAIMS Direct,…
Sorting Through Data Warehouse Updates
The CLAIMS Direct data warehouse is updated continuously. On average, 1-3% of the entire database is touched weekly. These 1-3 million updates fall into two categories: new documents (roughly 20%) and updated documents (roughly 80%). New documents are, of course, new publications from issuing authorities (US, EP, etc.). Updates are generally, but not limited to, CPC changes, family changes (adding family ID), IFI integrated content (patent status, standard names and claims summaries),…
XML Functionality Inside CLAIMS Direct Data Warehouse
Overview: The CLAIMS Direct Web Services (CDWS) offer a variety of entry points into both the data warehouse and SOLR index. These are mid-to-high-level entry points and can satisfy most requirements pertaining to searching and extracting data for a typical search/view application. There are, however, corner cases which may require more intricate extraction of particular information. On the other hand, there may also be situations where massive amounts of data need to be extracted for further,…
Understanding the SOLR Result Set - Sort Parameter
Paging results is cumbersome and inefficient. In this next segment, I'd like to talk about simple and complex sorting. Sorting, used effectively with the rows parameter, can push relevant documents into the first page of results. Generally, you can sort on any indexed field, but you can also utilize query boosting and functions to influence sort order. CLAIMS Direct is configured to return empty fields at the top when asc is the direction and at the bottom when desc is the direction.…
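As a rough illustration of combining sort with rows, the following minimal sketch builds a SOLR query string in Python. The helper name build_sorted_query and the field names ttl and pd are assumptions made for illustration only; substitute fields that are actually indexed in your installation.

```python
from urllib.parse import urlencode


def build_sorted_query(q, sort_field, direction="desc", rows=10):
    """Build a URL-encoded SOLR query string with sort and rows.

    Hypothetical helper: the field names passed in are assumptions,
    not confirmed CLAIMS Direct schema fields.
    """
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    params = {
        "q": q,
        # SOLR expects "<field> <direction>" in the sort parameter
        "sort": f"{sort_field} {direction}",
        # rows limits how many documents come back in the first page
        "rows": rows,
    }
    return urlencode(params)


# Assumed field names: ttl (title) and pd (publication date)
query_string = build_sorted_query("ttl:battery", "pd", "desc", rows=25)
print(query_string)
```

The resulting string can be appended to the select URL of whichever SOLR endpoint you query.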
Understanding the SOLR Result Set - fl parameter
In this first of a series of blogs about SOLR result sets, I'd like to talk about returned fields, both static and dynamic. Stored Fields: Any field that is stored in SOLR can be returned in the result set.…
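To make the idea of selecting returned fields concrete, here is a minimal sketch that assembles an fl parameter in Python. The helper name build_fl_query and the field names ucid and ttl are assumptions for illustration; score is SOLR's built-in relevance pseudo-field.

```python
from urllib.parse import urlencode


def build_fl_query(q, fields, include_score=False):
    """Build a URL-encoded SOLR query string with an fl field list.

    Hypothetical helper: the stored field names are assumed, not
    taken from the CLAIMS Direct schema.
    """
    field_list = list(fields)
    if include_score:
        # "score" is computed per query rather than stored
        field_list.append("score")
    # fl takes a comma-separated list of field names
    return urlencode({"q": q, "fl": ",".join(field_list)})


# Assumed stored fields: ucid and ttl
fl_query = build_fl_query("ttl:battery", ["ucid", "ttl"], include_score=True)
print(fl_query)
```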
