The PostgreSQL component is the heart of CLAIMS Direct. It contains the XML for the entire data warehouse collection, processes updates from the primary, and functions as the data source for the optional SOLR index.

...

Requirement        Recommended
CPU                4 cores
System Memory      24GB
Storage Capacity   4TB (SSD preferred)

Software Requirements

Requirement: Operating System
Minimum Version: RHEL 6, Fedora 20, CentOS 6

Requirement: Development Tools
Minimum Version: Distribution version
Notes:
    yum|dnf groupinstall "Development Tools"

Requirement: PostgreSQL
Minimum Version: Distribution version
Notes:
    yum|dnf install \
        postgresql postgresql-contrib \
        postgresql-odbc postgresql-pl-perl \
        postgresql-server

Requirement: System Libraries
Minimum Version: Distribution version
Notes:
    yum|dnf install \
        libxml2 libxml2-devel \
        libxslt libxslt-devel
    # Please see note below regarding libxml2

Requirement: Perl and Modules
Minimum Version: Distribution version
Notes:
    yum|dnf install \
        perl-Module-Install \
        perl-DBD-Pg \
        perl-XML-LibXML \
        perl-XML-LibXSLT \
        perl-CPAN
Note: libxml2

Some CLAIMS Direct loading and maintenance code uses the PostgreSQL Perl extension (plperl) and relies heavily on the libxml2 XML parsing library. The following table lists inconsistent behavior observed with different combinations of PostgreSQL and libxml2 versions.

PostgreSQL   libxml2   Result
8.4.6        2.7.8     works
9.1.2        2.7.8     works
9.1.4        2.7.8     works
9.1.7        2.7.8     fails
9.2.4        2.9.1     yes
9.3.1        2.7.6     fails
9.3.2        2.9.1     works

No PostgreSQL version compiled with libxml2 < 2.7.8 works. Additionally, PostgreSQL 9.1.7 fails even with libxml2 2.7.8.
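The 2.7.8 threshold can be checked before installing anything else. The following is a minimal sketch, not part of the CLAIMS Direct distribution: the function names version_lt and check_libxml2 are hypothetical, and the comparison assumes GNU sort with the -V (version sort) option. Meeting the threshold is necessary but not sufficient, since PostgreSQL 9.1.7 fails even with libxml2 2.7.8.

```shell
# Hypothetical helper: succeed (exit 0) when version $1 is strictly lower
# than version $2. Relies on GNU sort -V for version ordering.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Hypothetical helper: classify a libxml2 version against the 2.7.8
# threshold stated in the note above.
check_libxml2() {
    if version_lt "$1" "2.7.8"; then
        echo "below-threshold"
    else
        echo "meets-threshold"
    fi
}
```

On a host with libxml2-devel installed, the installed version could be passed in with check_libxml2 "$(xml2-config --version)".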

IFI Claims has produced a patched release of libxml2-2.9.2 as an RPM. We recommend installing this package locally in place of the package shipped with the distribution. The RPM can be downloaded at the URL: http://alexandria.fairviewresearch.com/software/libxml2/f20/libxml2-2.9.2-1.fc20.x86_64.rpm. For additional versions, please contact support@ificlaims.com.

...

Regardless of installation type, careful planning of disk resources is important for efficient data loading into and extraction out of PostgreSQL. There are 6 logical segments inside the CLAIMS Direct data warehouse.

Segment      Contents                                                   Size
work index   All indices pertaining to loading                          30GB (variable)
work text    All raw table data queued for loading                      100GB (variable)
xml index    All permanent indices for the data warehouse               400GB
xml text     All permanent text for the data warehouse                  1TB
pg data      The cluster meta data, reporting, and logging directory    5GB (variable)
pg xlog      Log shipping for replication                               50GB (variable)

Each of these segments can be allocated discrete disk space through the use of TABLESPACES. Although not required, the use of TABLESPACES will improve loading and extraction performance. The total logical size of the data warehouse is approximately 2TB after initial loading.

...

As mentioned above, the CLAIMS Direct PostgreSQL cluster can utilize TABLESPACES to separate text, index, and work table data. An optimal (but not mandatory) layout will have each of the following paths on separate disk groups, where "disk group" is understood to be a discrete disk or set of disks exposed to the operating system as a device capable of supporting an ext4 file system.

Please note, these are only suggestions. Your environment and disk sub-system naming may be different, or you can choose not to use TABLESPACES at all. A PostgreSQL cluster running on a 2TB RAID0 sub-system exposed as one device, for example, would not benefit from TABLESPACES as noticeably as a mixed RAID environment with multiple devices.
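As an illustration of how such a layout maps onto PostgreSQL, the sketch below generates CREATE TABLESPACE statements from name:path pairs. The helper name tablespace_sql, the tablespace names, and the mount points in the usage note are all hypothetical and are not the names used by CLAIMS Direct; substitute whatever your installation requires.

```shell
# Hypothetical helper: print one CREATE TABLESPACE statement per
# "name:path" argument. Nothing is executed against a database here.
tablespace_sql() {
    for pair in "$@"; do
        name=${pair%%:*}   # text before the first colon
        path=${pair#*:}    # text after the first colon
        printf "CREATE TABLESPACE %s LOCATION '%s';\n" "$name" "$path"
    done
}
```

For example, tablespace_sql work_idx:/mnt/work_idx xml_txt:/mnt/xml_txt prints two statements that could be reviewed and then piped to psql. PostgreSQL requires each location to be an existing, empty directory owned by the operating system user that runs the server.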

...

Creating the Database

The PostgreSQL data warehouse portion of CLAIMS Direct is delivered in 2 parts:

  • PostgreSQL database schema (alexandria-dwh.sql)
  • <table>.gz files located in the sub-directory data
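Because each data file is named after its table, the set of tables in a delivery can be listed directly from the data sub-directory. This is a minimal sketch with a hypothetical helper name, list_tables; it only inspects file names and makes no assumption about how the files are loaded.

```shell
# Hypothetical helper: print the table name implied by each <table>.gz
# file found in the directory given as $1.
list_tables() {
    for f in "$1"/*.gz; do
        [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
        basename "$f" .gz
    done
}
```

Running list_tables data from the delivery directory prints one table name per line, which can be compared against the tables defined in alexandria-dwh.sql.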


...