The CLAIMS Direct data warehouse is updated continuously. On average, 1-3% of the entire database is touched weekly. These 1-3 million updates fall into two categories: new documents (roughly 20%) and updated documents (roughly 80%). New documents are, of course, new publications from issuing authorities (US, EP, etc.). Updates generally include, but are not limited to, CPC changes, family changes (adding a family ID), IFI integrated content (patent status, standard names and claims summaries), reassignments, and legal status. I suggest reading about content updates for a better understanding of load sources and issuing authorities. This post describes how to determine what has been updated and, further, how to differentiate new content from updated content at both the document level and the container level.
...
How many US documents published on 20161004 have updated CPC classifications?

Out of 6680 documents published on 20161004, 6145 had CPC modifications after publication.
...
    select count(*) as updated_docs
    from xml.t_patent_document_values
    where created_load_id < modified_load_id
      and modified_load_id in (
        select load_id
        from reporting.t_client_load_process
        where date_trunc('day', completed_stamp) = '2016-10-04'::date
      );

     updated_docs
    --------------
           362388
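The same tables can also be used to separate new documents from updated ones within a set of loads. The following is only a sketch: it reuses the tables and columns from the query above, and it assumes that a document whose `created_load_id` equals its `modified_load_id` has not been modified since it was first loaded, which is an inference from the predicate above rather than documented behavior. The `filter` clause requires PostgreSQL 9.4 or later.

```sql
-- Sketch: split the documents touched by loads completed on 2016-10-04
-- into new and updated documents.
-- ASSUMPTION: created_load_id = modified_load_id means the document has
-- not been modified since its initial load, i.e. it counts as "new".
select
    count(*) filter (where created_load_id = modified_load_id) as new_docs,
    count(*) filter (where created_load_id < modified_load_id) as updated_docs
from xml.t_patent_document_values
where modified_load_id in (
    -- same load selection as in the query above
    select load_id
    from reporting.t_client_load_process
    where date_trunc('day', completed_stamp) = '2016-10-04'::date
);
```

If the assumption holds, `updated_docs` should match the count returned by the query above, and `new_docs` gives the number of documents first created by those loads.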
In part II of this series, I will detail methods for monitoring this table and reacting to changes, either at the load-id level or over an interval of time.