...
modified_load_id
– theload-id
that last modified the document, whether it be addition, correction or deletion of contentdeleted_load_id
– theload-id
that the entire document was marked deletedcreated_load_id
– theload-id
when the document was added to the data warehouse
Each satellite table also contains a modified_load_id
column that is specific to the content in that table and will always be <= xml.t_patent_document_values(modified_load_id)
. The following table should clarify the sequence of update events and the effect on load-id
values for three database tables.
date | load-source | Load-Id | table | created_load_id | modified_load_id | deleted_load_id |
---|---|---|---|---|---|---|
20161004 | US | 10000 | xml.t_patent_document_values | 10000 | 10000 | NULL |
xml.t_ifi_integrated_content assuming ifi integrated content is not delivered on date of publication | n/a | n/a | n/a |
xml.t_parties | n/a | 10000 | n/a | |||
20161006 | IFI-CONT | 10001 | xml.t_patent_document_values | 10000 | 10001 | NULL |
xml.t_ifi_integrated_content | n/a | 10001 | n/a |
xml.t_parties parties are not updated with IFI content loads | n/a | 10000 | n/a | |||
20161011 | DOCDB | 10002 | xml.t_patent_document_values | 10000 | 10002 | NULL |
xml.t_ifi_integrated_content | n/a | 10001 | n/a |
xml.t_parties | n/a | 10002 | n/a |
This granularity offers the ability to trace modifications to individual containers (tables) as well as the overall modification status of the document as a whole.
Info | |||||
---|---|---|---|---|---|
| |||||
How many US documents published on 20161004 have updated CPC classifications?
Out of 6680 documents published on 20161004, 6145 had CPC modifications after publication. |
Reporting
Knowing which tables and containers have changed is important, but the when and why are also critically important. The information gathered during data warehouse updates is stored in reporting.t_client_load_process
. For a detailed description of the contents of this table, see CLAIMS Direct Design Documentation. Of the many questions that can be posed, for our purposes in this blog, we are only concerned with answering the following questions:
...
Code Block | ||
---|---|---|
| ||
select count(*) as updated_docs from xml.t_patent_document_values where created_load_id < modified_load_id and modified_load_id in ( select load_id from reporting.t_client_load_process where date_trunc('day', completed_stamp) = '2016-10-04'::date ); updated_docs ---------- 362388 |
...