Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Reverted from v. 40

Table of Contents

Panel
titleRelated Pages

For step-by-step instructions, see the PostgreSQL Installation Instructions.

The PostgreSQL component is the heart of CLAIMS Direct. It contains the XML for the entire data warehouse collection, processes updates from the primary, and functions as data source for the optional

...

Solr index.

Table of Contents

Hardware Requirements

Requirement
Recommended
CPU4-cores
System Memory24GB
Storage Capacity

...

6TB (SSD

...

required)

Software Requirements

Requirement

...

Supported Versions
Notes
Operating SystemRHEL

...

/Rocky 8, Amazon Linux

...

 yum install \
    postgresql postgresql-contrib \
    postgresql-odbc postgresql-pl-perl \
    postgresql-server

...

2We do not support Ubuntu or any operating system not explicitly listed.
PostgreSQL

10 - 14

# RHEL/CentOS 7
sudo yum install -y \
https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

# RHEL/Rocky 8
sudo dnf -y install \
 https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
IFI CLAIMS Repository
# Amazon Linux 2
sudo yum -y install \
https://repo.ificlaims.com/ifi-claims-direct/

...

...

...

...

...

...


# RHEL / CentOS 7
sudo yum -y install \
 https://repo.ificlaims.com/ifi-claims-direct/centos/

...

...

...

...

...



# RHEL/Rocky 8
sudo dnf -y install \
https://repo.ificlaims.com/ifi-claims-direct/

...

...

...

...

...

Note
titlelibxml2

Some CLAIMS Direct loading and maintenance code utilizes the PostgreSQL perl extension (plperl) as well as a heavy reliance on the libxml2 XML parsing library

...

8.4.62.7.8works
9.1.22.7.8works
9.1.42.7.8works
9.1.72.7.8fails
9.2.42.9.1works
9.3.12.7.6fails
9.3.22.9.1works

...

.

...

IFI CLAIMS has produced a patched release of libxml2 as an RPM. It is highly recommended to update libxml2 from the IFI CLAIMS software repository. For additional distributions, please contact support@ificlaims.com.

PostgreSQL

...

Regardless of installation type, careful planning of disk resources is important for efficient data loading into and extraction out of PostgreSQL. There are 6 logical segments inside the CLAIMS Direct data warehouse.

SegmentDescriptionSize
work indexAll indices pertaining to loading30GB (variable)
work textAll raw table data queued for loading100GB (variable)
xml indexAll permanent indices for the data warehouse400GB
xml textAll permanent text for the data warehouse1TB
pg dataThe cluster metadata, reporting, and logging directory5GB (variable)
pg xlogLog shipping for replication50GB (variable)

Each of these segments can be allocated discrete disk space through the use of TABLESPACES. Although not required, the use of TABLESPACES will improve loading and extraction performance. The total logical size of the data warehouse is approximately 2TB after initial loading.

Suggested PostgreSQL Disk Layout

As mentioned above, the CLAIMS Direct PostgreSQL cluster can utilize TABLESPACES to separate text, index, and work table data. An optimal (but not mandatory) layout will have each of the following paths on separate disk groups where "disk group" is understood to be a discrete disk or set of disks exposed to the operating system as a device capable of supporting an ext4 file system.

Please note, these are only suggestions. Your environment and disk sub-system naming may be different, or you can choose not to use TABLESPACES at all. A PostgreSQL cluster running on a 2TB RAID0 sub-system exposed as one device, for example, wouldn't benefit as noticeably using TABLESPACES as a mixed RAID environment with multiple devices.

If you choose to implement TABLESPACES, you will need to edit alexandria-dwh.sql and insert your custom tablespace definitions. These are clearly marked per segment as:

-- TABLESPACE definition for work text

-- TABLESPACE definition for xml text
-- TABLESPACE definition for work index
-- TABLESPACE definition for xml index
 
-- insert your TABLESPACE name in the appropriate section
set default_tablespace = ''

 

 

tblspc_wrkidxwork index/wrkidx/_pg30GB
tblspc_wrktxtwork text/wrktxt/_pg100GB
tblspc_xmlidxxml index/xmlidx/_pg400GB
tblspc_xmltxtxml text/xmltxt/_pg2000GB

During cluster initialization (initdb), you can further define the location for the segments: pg data and pg xlog.

PostgreSQL Initial Configuration

CLAIMS Direct requires a working PostgreSQL cluster. If you do not have an initialized cluster, the following steps will initialize the cluster and give you rudimentary authentication and access levels needed to run CLAIMS Direct.

 

su - postgres

 

# -D, --pgdata: specify location of pg data segment
# -X, --xlogdir: specify location of pg xlog segment
initdb -A trust \
       -D /var/lib/pgsql/data \
       -E utf8 \
       -X /pgxlog

 

The following configuration files are to be edited:

/var/lib/pgsql/data/pg_hba.conf

# Allow trusted local access
local   all             all                                     trust
 
# IPv4 local connections
host    all             all             127.0.0.1/32            trust
 
# Other hosts on subnets that may require access, for example
# host    all             all             192.168.10.0/24         trust

 

/var/lib/pgsql/data/postgresql.conf

listen_addresses = '*'                  # what IP address(es) to listen on;

 

If you already have an initialized cluster, please be certain that local access is enabled for stand-alone installation. In either distributed install, if a separate services machine is created, its IP address needs access. This is imperative for the client update procedures.

 

Finally, enable and start the PostgreSQL cluster:

# on systemd based systems:
systemctl enable postgresql.service
 
systemctl start postgresql.service

 

CLAIMS Direct PostgreSQL Role

CLAIMS Direct requires the role alexandria:

psql -Upostgres postgres
=#> create role alexandria with superuser login;

 

Creating the Database

The PostgreSQL data warehouse portion of CLAIMS Direct is delivered in two parts:

  •  PostgreSQL database schema (alexandria-dwh.sql)
  • <table>.gz files located in the sub-directory data

 

# create the database
cat alexandria-dwh.sql | psql -Ualexandria postgres

Tuning the Data Warehouse

Although specifying exact postgresql.conf configuration parameters may seem beneficial, in reality, every installation is different. There are many factors in tuning your cluster including system memory, resource contention from other services running on the server, available disk space, disk types et al. The primary cluster supporting your updates was tuned using the tool pgtune. It is recommended to run the tool and apply the changes before bulk loading the data.

Loading the Tables

The process of loading CLAIMS Direct data into PostgreSQL tables will take 3-5 days on most hardware configurations.

The following code block outlines the loading of the CLAIMS Direct data into PostgreSQL tables:

 

for each table, load ...
gunzip -c data/<table>.gz | psql -Ualexandria alexandria
  
# sample batch script
for tbldata in $(ls data/*.gz); do
  gunzip -c $tbldata | psql -Ualexandria alexandria
done

Creating the Credentials Database for CDWS

...

, Database Schema and Tools

In an effort to streamline CLAIMS Direct PostgreSQL schema versioning, we offer a schema and tools package via the CLAIMS Direct yum repository which is used to create the CLAIMS Direct database and initializes the XML schema.

Amazon Linux 2

Code Block
languagebash
sudo amazon-linux-extras install epel
sudo amazon-linux-extras enable postgresql14
sudo yum clean all
sudo yum -y install \
  postgresql postgresql-contrib postgresql-plperl postgresql-server \
  alexandria-schema-tools \
  libxml2

RHEL / CentOS 7

Code Block
languagebash
sudo yum -y install epel-release

sudo yum clean all
# Note: this installs the default version 10. If you would like a higher version,
#       please see: https://www.postgresql.org/download/linux/redhat/
sudo dnf -y install \
  postgresql postgresql-contrib postgresql-plperl postgresql-server \
  alexandria-schema-tools \
  libxml2

RHEL / Rocky 8

Code Block
languagebash
sudo dnf -y install epel-release
# Rocky
sudo dnf config-manager --set-enabled powertools
# RHEL
sudo subscription-manager repos --enable codeready-builder-for-rhel-8-x86_64-rpms

sudo dnf clean all
# Note: this installs the default version 10. If you would like a higher version,
#       please see: https://www.postgresql.org/download/linux/redhat/
sudo dnf -y install \
  postgresql postgresql-contrib postgresql-plperl postgresql-server \
  alexandria-schema-tools \
  libxml2

The package contains the SQL needed to create the database supporting PostgreSQL versions 10 - 14, as well as tools to check installation, populate the tables, and perform extracts. To create the databases, simply load the SQL via psql into the instance.

Code Block
languagebash
echo "create role alexandria with superuser login;" \
  | psql -Upostgres postgres  

cat /usr/share/alexandria/alexandria-schema-xml/alexandria-schema-xml.sql \
  | psql -Ualexandria postgres


In addition to the schema, there is a collection of tools to help determine instance loading feasibility, populating, bulk extracting, counts and sizes of tables, and more. All tools accept the same parameters with reasonable defaults for a Type-1 CLAIMS Direct installation.

Code Block
languagetext
Usage: CD-TOOL OPTIONS

  Option        Description                     Default
  --------------------------------------------------------
  -H|--host     specify host                    localhost
  -P|--port     specify port                    5432
  -d|--database specify database                alexandria
  -u|--user     specify database user name      alexandria
  -h|--help     print this usage and exit

ToolDescription
cd-pre-flight-check.sh 

Use this tool to test the viability of populating the newly created database. It runs a few simple checks.

Code Block
languagebash
cd-pre-flight-check.sh
Testing localhost/alexandria ...
  OK    : procedual language sql
  OK    : procedual language plpgsql
  OK    : procedual language plperl
  OK    : procedual language plperlu
  OK    : XML capability (test 1/libxml): 
  OK    : XML capability (test 2/libxml): 
cd-load.sh 

You can use this tool to populate the initial data delivery. After unpacking or otherwise making the initial data available, cd into the top level directory and simply run:

Code Block
languagebash
cd-load.sh

Note: This tool expects there to be a ./data directory in the current working directory.

cd-count.sh 

This tool counts the rows in all tables in the main schema (cdws, xml).

Code Block
languagebash
cd-count.sh
683	cdws.t_applications
0	cdws.t_cited_documents
0	cdws.t_class_hierarchies
813	cdws.t_priority_documents
366	xml.t_abstract
0	xml.t_amended_claims
366	xml.t_application_reference
# etc.
cd-analyze-tables.sh Analyze all tables in the main schema.
cd-table-size.sh 

Calculate the on-disk size of each table in the main schema.

Code Block
languagebash
cd-table-size.sh
192 kB	cdws.t_applications
48 kB	cdws.t_cited_documents
40 kB	cdws.t_class_hierarchies
272 kB	cdws.t_priority_documents
416 kB	xml.t_abstract
32 kB	xml.t_amended_claims
232 kB	xml.t_application_reference
# etc ...
cd-extract.sh 

Extract all relevant table data to ./data directory.

Panel
titleNext Steps

Once the data has been loaded, proceed to: