...

The CLAIMS Direct Web Services (CDWS) offer a variety of entry points into both the data warehouse and the Solr index. These mid-to-high-level entry points satisfy most requirements for searching and extracting data in a typical search/view application. Some corner cases, however, require more intricate extraction of particular information, and other situations call for extracting massive amounts of data for further downstream processing. The following discussion touches on solutions for both cases.

...

Function
Example

string

xml.f_textContent(xpath::text,xml)

Extract text of node:

Code Block
select xml.f_textContent('./document-id/lang', content)
from xml.t_publication_reference
where publication_id=xml.f_ucid2id('US-5551212-A');
----
EN

string

xml.f_getAttribute(att::text,xml)

Extract single attribute value:

Code Block
select xml.f_getAttribute('ucid', content)
from xml.t_publication_reference
where publication_id=xml.f_ucid2id('US-5551212-A');
----
US-5551212-A

boolean

xml.f_nodeExists(xpath::text,xml)

Test presence of a particular node:

Code Block
select xml.f_nodeExists('//date', content)
from xml.t_publication_reference
where publication_id=xml.f_ucid2id('US-5551212-A');
----
t

XML

xml.f_findNode(xpath::text,xml)

Select a node (type XML) satisfying the given XPath expression:

Code Block
select xml.f_findNode('./main-classification', content)
from xml.t_classification_national
where publication_id=xml.f_ucid2id('US-5551212-A');
----
 <main-classification mxw-id="PCL893513162" load-source="mcf">053397</main-classification>

XML[]

xml.f_findNodes(xpath::text,xml)

Select an array of nodes satisfying the given XPath expression:

Code Block
select xml.f_findNodes('./classification-ipcr', content)
from xml.t_classifications_ipcr
where publication_id=xml.f_ucid2id('US-5551212-A');
----
{
  "<classification-ipcr ...",
  "<classification-ipcr ...",
  "<classification-ipcr ..."
}

XML[]

xml.f_getElementsByTagName(name::text,xml)

Select an array of nodes based on element name:

Code Block
select xml.f_getElementsByTagName('classification-ipcr', content)
from xml.t_classifications_ipcr
where publication_id=xml.f_ucid2id('US-5551212-A');
----
{
  "<classification-ipcr ...",
  "<classification-ipcr ...",
  "<classification-ipcr ..."
}

string

xml.f_nodeName(xml)

Select the name of the content node:

Code Block
select xml.f_nodeName( content )
from xml.t_invention_title
where publication_id=xml.f_ucid2id('US-5551212-A');
-----------------
 invention-title

XML[]

xml.f_childNodes(xml)

Select all child nodes:

Code Block
select xml.f_childNodes( content )
from xml.t_application_reference
where publication_id=xml.f_ucid2id('US-5551212-A');
-----------------------------------------------------------------------------------
 <document-id mxw-id="PAPP60364944" load-source="docdb" format="epo">
   <country>US</country>
   <doc-number>2582893</doc-number>
   <kind>A</kind>
   <date>19930303</date>
   <lang>EN</lang>
 </document-id>
 <document-id mxw-id="PAPP84571955" load-source="patent-office" format="original">
   <country>US</country>
   <doc-number>08025828</doc-number>
   <date>19930303</date>
   <lang>EN</lang>
 </document-id>

string

xml.f_node2string(xml,indent)

Concatenate all child text nodes:

Code Block
select xml.f_node2string( content::text, 1 )
from xml.t_abstract
where publication_id=xml.f_ucid2id('US-5551212-A');
-----------------------------------------------------------------------------------
 A package is formed by holding a nonextensible strip taut and wrapping it around the upper and/or lower periphery of the bundle and joining the ends of the strip segments e.g. by adhesive bonding. A projecting margin of the strip is folded to overlie the bundle or underlie the bundle so that an L-configuration is imparted to the frame formed around the bundle after the strip is applied thereto.
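
The array-returning functions listed above can be combined with PostgreSQL's built-in unnest() to get one row per node, which is often a more convenient shape for further processing. The following is a minimal sketch rather than a documented CDWS recipe. It assumes that xml.f_findNodes returns a native PostgreSQL array (as the XML[] return type suggests) and that xml.f_nodeName accepts each unnested element; the column aliases node_name and node are illustrative only.

```sql
-- Sketch: expand the array returned by xml.f_findNodes into one row per node.
-- Assumption: the XML[] return value is a native PostgreSQL array, so unnest() applies.
select xml.f_nodeName(node) as node_name,  -- element name of each unnested node
       node                                -- the node itself (type XML)
from (
  select unnest(xml.f_findNodes('./classification-ipcr', content)) as node
  from xml.t_classifications_ipcr
  where publication_id = xml.f_ucid2id('US-5551212-A')
) as nodes;
```

If the assumption holds, each classification-ipcr element appears on its own row instead of inside a single array value.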

...


Use Case: Creating a relational view of XML content

...