

2018 Notes

December 2018: Rekey of some IN apps

In March 2018 we solved most of the issues related to the numbering format of Indian records. Since then, records corresponding to Indian applications published before January 2016 have had a publication number format of YYYYOONNNNN in CLAIMS Direct.

We recently discovered that there were still a few records which contained the old office codes. This created some duplicates such as IN-1992BO00073-A/IN-1992MU00073-A.

In order to solve this issue we rekeyed those records with the old office codes, following the Indian patent office rules:

BO --> MU
CA --> KO
MA --> CH

Examples:
IN-1992BO00073-A is now IN-1992MU00073-A
IN-1992CA00749-A is now IN-1992KO00749-A
IN-1998MA01586-A is now IN-1998CH01586-A

As a result of this process 7998 IN records have been marked "deleted".
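
The office code substitution above can be sketched in a few lines of Python. This is only an illustration: the function name and the regular expression are assumptions made for the example, not part of CLAIMS Direct. The mapping itself comes from the list above.

```python
import re

# Old-to-new office codes, as listed in this note.
OFFICE_CODE_MAP = {"BO": "MU", "CA": "KO", "MA": "CH"}

# Pre-2016 IN ucid: IN-YYYYOONNNNN-K (year, two-letter office code, five digits, kind).
# The pattern is an assumption derived from the examples in this note.
_IN_UCID = re.compile(r"^IN-(\d{4})([A-Z]{2})(\d{5})-([A-Z]\d?)$")


def rekey_in_ucid(ucid: str) -> str:
    """Return the ucid with an old office code replaced; anything else is unchanged."""
    match = _IN_UCID.match(ucid)
    if match is None:
        return ucid
    year, office, serial, kind = match.groups()
    office = OFFICE_CODE_MAP.get(office, office)
    return f"IN-{year}{office}{serial}-{kind}"


print(rekey_in_ucid("IN-1992BO00073-A"))
```

A ucid that already uses a current office code, or that belongs to another authority, passes through untouched, so the function can be applied to a mixed list.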

September 2018: Coverage of EP A4 documents

EP kind code A4 documents cover the supplementary search reports as issued by the EPO following PCT applications entering the EP regional phase. Until now these documents were not fully integrated into the normal EPO publication flow of A and B publications.

A4 data is now available in CLAIMS Direct in both XML and PDF format. So far, all of 2017 and the first half of 2018 are covered.

The EPO plans to deliver future A4 documents in batches every half year.

August 2018: Release of CLAIMS Direct Web Services version 3.5

CLAIMS Direct Web Services version 3.5 was released on Saturday, August 25th. The following changes have been implemented in this release:

  • RFC: if no first inventive CPC is available, use first inventive from IPCR
    The reporting column cpc-first-inventive will use the first inventive IPCR if no CPC first inventive value is present.
  • RFC: create new endpoint that retrieves all associated attachments as a zip archive
    This new endpoint, /attachment/fetchall, retrieves all attachments available for a specific ucid and bundles them into a single zip archive. Please see /attachment/fetchall for service details.
  • RFC: (BETA) enable attachment manipulation of image type and size
    A new set of parameters is available on the /attachment/fetch endpoint that will convert image types to common web formats, including jpeg and png. Please see /attachment/fetch for the additional parameters.
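
The fallback described for cpc-first-inventive can be illustrated with a minimal sketch. The function name and the list inputs are hypothetical. The sketch shows only the selection rule, not how the reporting service is actually implemented.

```python
from typing import Optional, Sequence


def first_inventive(cpc_inventive: Sequence[str],
                    ipcr_inventive: Sequence[str]) -> Optional[str]:
    """Pick the first inventive CPC symbol, falling back to the first inventive IPCR.

    Both arguments are assumed to hold inventive classification symbols
    in document order (hypothetical inputs for this illustration).
    """
    if cpc_inventive:
        return cpc_inventive[0]
    if ipcr_inventive:
        return ipcr_inventive[0]
    return None
```

For example, `first_inventive([], ["H01R 13/62"])` returns the IPCR symbol because no inventive CPC value is present.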

July 2018: Extending PDF collection

In the last few weeks, we have added full document PDFs to the following countries:

  • CA - Documents published from January 2018 now have full document PDFs. In the next few weeks, we will also be loading PDFs for the Canadian backfile.
  • KR - We have added PDF files to the pre-2006 records which were loaded in January.
  • RU - Full document PDFs are now available for Russian documents published from 2005 up to the current date. We have also loaded PDFs for some pre-2005 records, although this collection is not complete.
  • WO - PCT documents now have PDF files.

We will soon expand our coverage even further by processing PDF files for India, focusing first on applications and later adding granted patents.

April 2018: WO records in Asian languages

We have processed a WIPO backfile of records in Asian languages. The total number of records included in this backfile with new or updated Asian language text is 211915 and the load-id is 304475. We are currently processing machine translations to English for these records.

April 2018: Translations for EP and WO records

As we announced in March, we have added English translations to EP and WO records that were filed in languages other than English.

For EP records these are the translated fields that we added:

abstracts = 332878
descriptions = 1389918
claims = 859589

For WO records:

abstracts = 4262
descriptions = 1070417
claims = 1070539


In the next few days, we plan to load a backfile of descriptions and claims for WO applications published in Chinese, Japanese, and Korean, going back to 1978. This data will include both the original language and English translations.

April 2018: Correction of issues in the ifi-integrated-data section

  • Unexpected expiration dates in applications

CLAIMS Direct provides calculated anticipated and adjusted expiration dates for granted patents (publication-type=G). We recently discovered that some applications (publication-type=A) contained expiry dates in the ifi-integrated-content section. To maintain data consistency, we have removed those unexpected expiry dates as of load-id 302070.

  • Incorrect status in some Japanese records with kind code U

The IFI calculated status of some JP applications with kind code U was showing as expired when the correct status should have been granted. The origin of the problem was that JP laws changed around 1994 and therefore kind code U has different meanings before and after the changes. Though this was considered in IFI's calculation rules, some JP records were still being calculated incorrectly as publication-type=G when they should have been publication-type=A. This error in the publication type caused the status to be calculated incorrectly. This has now been fixed in all affected records and they have been reloaded as of load-id 302163.

March 2018: Reload of some Australian and Indian records

In the last few days we have solved a couple of issues caused by problems in the raw data:

  • Future AU publication dates

When processing data from the Australian national register, we found several records with OPI dates well into the future. According to the patent office, there was an issue with the optical character recognition for these dates. Although some of these dates have already been corrected, it's taking longer than expected for the patent office to fix the problem. To prevent this from happening again, we have changed these future dates to our default entry for dates that we know to be incorrect (00010101).
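
As a rough illustration of the rule above, the following sketch replaces any YYYYMMDD date that lies in the future with the default entry 00010101. The function name and the choice to leave unparseable values untouched are assumptions made for this example.

```python
from datetime import date, datetime

# Default entry used for dates known to be incorrect (from this note).
DEFAULT_INVALID_DATE = "00010101"


def normalize_opi_date(value: str, today: date) -> str:
    """Return value unchanged unless it is a YYYYMMDD date later than `today`."""
    try:
        parsed = datetime.strptime(value, "%Y%m%d").date()
    except ValueError:
        # Unparseable values are left alone in this sketch.
        return value
    return DEFAULT_INVALID_DATE if parsed > today else value
```

Passing the reference date explicitly keeps the check reproducible, for example `normalize_opi_date("20991231", date(2018, 3, 1))`.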

  • IN numbering formats

For most Indian patent authorities, we receive data from the Indian patent office without a publication number, so we use the application number to fill the publication reference element and to build a UCID. In general, the application reference keeps the original application number format from the patent office and the publication reference (as well as UCID) is calculated by IFI rules.

From 2016 the Indian patent office changed the format of their application numbers. The EPO also recently started to deliver Indian applications in DocDB, but using a different number format. This created some inconsistencies in our data.

To solve this problem, we did a rekey of around 27192 records (load-id 300781).

Now, records corresponding to Indian applications published before January 2016 have a publication number format of YYYYOONNNNN:

YYYY: four-digit year
OO: two-character office code
NNNNN: sequence number zero padded to five digits

From 2016 onward, publication numbers use the format YYYYOTNNNNNN:

YYYY: four-digit year
O: one-character office code (1 for Delhi, 2 for Mumbai, 3 for Kolkata, and 4 for Chennai)
T: type of application *
NNNNNN: sequence number

* type of application:
1 = Ordinary Application
2 = Ordinary-Divisional Application
3 = Ordinary-Patent of Addition Application
4 = Convention Application
5 = Convention-Divisional Application
6 = Convention-Patent of Addition Application
7 = PCT National Phase Application
8 = PCT National Phase-Divisional Application
9 = PCT National Phase Patent of Addition Application
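
The two formats can be told apart with a pair of regular expressions. The sketch below is an illustration only: the function name is hypothetical, the patterns assume a six-digit sequence number in the newer format (as the six N characters suggest), and the sample number in the usage note is made up.

```python
import re
from typing import Dict, Optional

# YYYYOONNNNN: year, two-character office code, five-digit sequence.
OLD_FORMAT = re.compile(r"^(?P<year>\d{4})(?P<office>[A-Z]{2})(?P<seq>\d{5})$")
# YYYYOTNNNNNN: year, one-digit office, one-digit type, six-digit sequence (assumed width).
NEW_FORMAT = re.compile(r"^(?P<year>\d{4})(?P<office>[1-4])(?P<type>[1-9])(?P<seq>\d{6})$")

OFFICES = {"1": "Delhi", "2": "Mumbai", "3": "Kolkata", "4": "Chennai"}
APPLICATION_TYPES = {
    "1": "Ordinary Application",
    "2": "Ordinary-Divisional Application",
    "3": "Ordinary-Patent of Addition Application",
    "4": "Convention Application",
    "5": "Convention-Divisional Application",
    "6": "Convention-Patent of Addition Application",
    "7": "PCT National Phase Application",
    "8": "PCT National Phase-Divisional Application",
    "9": "PCT National Phase Patent of Addition Application",
}


def parse_in_number(number: str) -> Optional[Dict[str, str]]:
    """Split an IN publication number into its parts, or return None if neither format fits."""
    match = OLD_FORMAT.match(number)
    if match:
        return {"format": "YYYYOONNNNN", **match.groupdict()}
    match = NEW_FORMAT.match(number)
    if match:
        parts = match.groupdict()
        return {
            "format": "YYYYOTNNNNNN",
            "year": parts["year"],
            "office": OFFICES[parts["office"]],
            "type": APPLICATION_TYPES[parts["type"]],
            "seq": parts["seq"],
        }
    return None
```

With these assumptions, `parse_in_number("1992MU00073")` reports the older format with office code MU, while a made-up number such as `201617000123` would be read as a 2016 PCT National Phase Application filed in Delhi.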

March 2018: Updates to Brazil, Taiwan, EP, and WIPO

In the first week of March, we completed an update of Brazilian records which resolves a major delay in BR front file deliveries. This included around 10,000 Brazilian full text records, which contain both the original language and translations. These records include full document PDFs, but are still missing bibliographic data from DocDB since the EPO has not published them yet. We will update any information delivered by DocDB as soon as it is available. BR full text is now up-to-date as of 20180206. Going forward, we expect to load front file records within two weeks of publication.

Similarly, we are loading missing TW full text records published in the last quarter of 2017 and forward.

In parallel, we will be adding English translations to EP and WO records that were filed in languages other than English. We plan to start with EP documents filed in French or German.

January 2018: Korean backfile and Australian pharmaceutical data loaded

We have finished the Korean backfile load announced in December, which added full text to more than 2.2 million KR documents published between 1979 and 2005. English translations will be processed soon.

Pharmaceutical names from the AU register data are now available in the new ifi-annotated-data container. See ifi-annotated-data for more information.

January 2018: Translations for Spanish patents

We are starting a reload of the ES collection to integrate machine translations. This will add English descriptions and claims to more than 370,000 records that currently contain Spanish text only. We will also be completing English translations for titles and abstracts.

Below are translation totals per field after this reload:

invention-title = 1513825
abstract = 818939
description = 370798
claims = 371645

We plan to finish this reload before January 21st.

2017 Notes

December 2017: Korean full text backfile

We will soon be loading a backfile of KR full text going back to 1983 for applications and to 1979 for granted patents and utility models.

November 2017: Australian data update

As announced in September, we have loaded AU full text and register data. More than 800,000 AU records have descriptions and claims as of today.

We also added ifi-integrated-data to AU records, as well as office-specific-data from the AU national register.

There are a few issues related to incorrect or missing paragraphs in the description text. We are in contact with our provider to solve this problem as quickly as possible.

We will soon begin loading the pharmaceutical names from the AU register data to our new ifi-annotated-data container. A patch will be required, for which you will receive instructions in advance.

September 2017: ifi-container for Japanese records

The load of ifi-integrated-content for JP records has been completed with the exception of ifi-standardized-names, which will be added at a later time.

August 2017: DTD revised from v2.2 to v2.3

We have released version 2.3 of the CLAIMS Direct XML DTD. For details, see XML DTD and Schemas.

This DTD includes new elements that will shortly contain data from the Australian (AU) national register and other similar data sources in the future. We plan to begin loading the AU full text and register data at the end of September and to complete it before the end of October.

August 2017: Recalculated ifi-patent-status for US "abandoned" patents

Based on recommendations from the USPTO, we recently processed patent status data from a new service called Patent Examination Data System (PEDS) and used it to recalculate the ifi-patent-status for abandoned patents. According to the USPTO, the PEDS beta version replaces the PAIR Bulk Data beta product and corrects one of the major problems with the PAIR Bulk Data beta system related to abandoned patents.

After processing this data, we realized that the PEDS status does not always match the status in the public PAIR portal. Therefore, we have contacted the USPTO regarding these discrepancies and are currently waiting for their reply. As a result, further data loads from PEDS (PAIR) to CLAIMS Direct are on hold and will depend on the stability and reliability of the data source.

July 2017: Austrian full text data available

Austrian applications and granted patents are now available for Premium and Premium+ subscribers as described in the data coverage table.

We would like to update you on other content improvements:

  • The reload of the ifi-integrated-content of EP records that we announced some weeks ago is now complete.
  • PDFs for US granted patents published before the year 2000 are now available.

June 2017: Filling gaps in KR translations

We have improved the production of Korean to English translations and were able to update 126,339 records as detailed below:

62850 titles
58002 abstracts
21650 descriptions
21842 claims

June 2017: Adding translations to CA records published in French

The Canadian Patent Office accepts English as well as French when applying for a patent. Machine translations to English for descriptions and claims filed in French are now available in CLAIMS Direct.

June 2017: Delivering priority linkage-type codes

Since February of this year the attribute list for the element priority-claim contains an optional attribute linkage-type, which is related to divisions, additions, and continuations, among others. For some authorities, this is especially important when we calculate expiry dates.

From now on we are delivering the priority linkage-type codes when available.

Example: AU-2017202677-A1

<priority-claim mxw-id="PPC175612506" ucid="AU-2015050665-W" linkage-type="A" load-source="docdb">
  <document-id format="epo">
    <country>AU</country>
    <doc-number>2015050665</doc-number>
    <kind>W</kind>
    <date>20151026</date>
  </document-id>
</priority-claim>

Regarding the backfile, we will reload records for countries where this data affects expiry dates. The first of these reloads, the NL backfile, is complete as of load-id 268293.

For more information, see XML Content Description.
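
A consumer might read the optional attribute as in the following minimal sketch, which uses Python's standard xml.etree.ElementTree module on the AU-2017202677-A1 example above. The function name is hypothetical. Because linkage-type is optional, the sketch returns None when it is absent.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# The priority-claim example from this note.
SAMPLE = """
<priority-claim mxw-id="PPC175612506" ucid="AU-2015050665-W" linkage-type="A" load-source="docdb">
  <document-id format="epo">
    <country>AU</country>
    <doc-number>2015050665</doc-number>
    <kind>W</kind>
    <date>20151026</date>
  </document-id>
</priority-claim>
"""


def priority_linkage(xml_text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (ucid, linkage-type) for a single priority-claim element."""
    element = ET.fromstring(xml_text.strip())
    # linkage-type is optional, so .get() may return None.
    return element.get("ucid"), element.get("linkage-type")


print(priority_linkage(SAMPLE))
```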

May 2017: BR and TW data loaded

Premium+ subscribers now have access to full text of BR and TW documents as described in the data coverage table.

We loaded close to 1.2 million records covering TW applications, utility models and granted patents, and more than 100,000 Brazilian records.

April 2017: Planning a reload of ifi-integrated-content of EP records

We would like to announce an improvement in the way we calculate the patent status for EP Designated States. Our upcoming reload of EP records will incorporate the following changes:

  • Due to customer requests for a simple live/dead status, we are now offering an In-force/Not-in-force status for Designated States. This will result in a more reliable and easy-to-understand calculation of status, and will make this field more usable for things like queries and reports. The new rules are described in more detail here. Our prior calculation rules can be seen here.
  • As a result of this reload and the previous reload of badly formatted claims, we will add the Claims Summary information to EP records.
  • We plan to start reloading EP granted patents on May 22, 2017. This will take a few weeks to complete.

April 2017: CH data available

Premium and Premium+ subscribers now have access to full text of CH applications and granted patents both in the original language and in English translation. We have updated 98,224 records published from 1980 to February of this year.

March 2017: dnum-type added to citations

dnum-type is an attribute added to citations in order to distinguish between publication and application numbers.
In order to avoid a complete reload of patent citations, we only add the dnum-type attribute to cited applications. If no dnum-type attribute is present on the patcit element, then the citation is referencing a publication.

As a result, 77,786 records having cited applications have been reloaded. Below you can see one example:

<patcit mxw-id="PCIT377701692" load-source="docdb" ucid="CN-201420631033-U" dnum-type="application">
    <document-id format="epo">
       <country>CN</country>
       <doc-number>201420631033</doc-number>
       <kind>U</kind>
       <date>20141028</date>
    </document-id>
    <sources>
       <source name="APP" created-by-npl="N"/>
    </sources>
</patcit>
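
The rule that a missing dnum-type means the citation references a publication can be applied when parsing, as in this minimal sketch. The function name is hypothetical, and the string "publication" used as the default label is an assumption for this example; the notes only state that absence of the attribute implies a publication.

```python
import xml.etree.ElementTree as ET

# The patcit example from this note.
SAMPLE = """
<patcit mxw-id="PCIT377701692" load-source="docdb" ucid="CN-201420631033-U" dnum-type="application">
    <document-id format="epo">
       <country>CN</country>
       <doc-number>201420631033</doc-number>
       <kind>U</kind>
       <date>20141028</date>
    </document-id>
    <sources>
       <source name="APP" created-by-npl="N"/>
    </sources>
</patcit>
"""


def citation_number_type(patcit: ET.Element) -> str:
    """Return the dnum-type of a patcit element, defaulting to "publication"."""
    return patcit.get("dnum-type", "publication")


print(citation_number_type(ET.fromstring(SAMPLE.strip())))
```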

March 2017: We start loading TW data

We started loading the backfile of TW data in early March. Premium+ subscribers will have access to this data in the next few weeks. We plan to cover applications, granted patents, and utility models published from 2000.

March 2017: Reload of Euro-PCT claims finished

Euro-PCT records reload is now complete. We solved format issues found in the original data, so there are no longer any claims with an empty num attribute in our EP collection.

This reload fixed:

  • Approximately 520,000 EP records published between 1978 and 2012 with missing claim numbers (e.g. EP-0000252-B1)
  • Approximately 1 million EP records with format issues in the claims element (e.g. EP-0006849-A1)

February 2017: Upcoming reloads

Claims format: In 2016 we fixed some content-format issues in the claims section that originated in the source data. For example, JP national data is delivered with all claims squashed into claim 1, and some records arrive without claim numbers. We completed two large claims reloads: 5,763,744 JP original-language claims and 1,773,963 WO claims, for a total of more than 7.5 million claims fixed.

Euro-PCT records, which have the same issues as the "parent" PCT records, are going to be reloaded to fix the claims issue.
We plan to complete this reload before the end of February 2017.

Citations: DOCDB used to consolidate all citations at the earliest publication, A1 or A2 kind codes. This changed at some point last year and now citations are no longer consolidated. As a result of this change we are going to reload citations for about 14.5 million records in the upcoming weeks. This reload will fix a related issue, a former limit of 99 in the number of citations. This limit doesn't exist any more and after the reload, approximately 173,000 records will have more than 99 patent or non-patent citations.
We plan to complete this reload before the end of March 2017.

2016 Notes

April 2016: Details for duplicate citation fix

This fix affects approximately 1.1 million records from 17 authorities where we have identified duplicate citation records. This reload will eliminate these duplicate records.

We plan to complete this reload between April 1-7, 2016.

March 2016: Details for WO reload

We are reloading WO records published between 1978 and 2009 in order to add original parties data including agent and correspondent which is not covered via DOCDB. In addition we will add more complete priority data and improve claims structures.

We plan to complete this load between March 7-31, 2016.

The following provides an example of the priority-claims and parties changes.

WO-2009022636-A1 before reload:

<priority-claims>
   <priority-claim mxw-id="PPC87648930" ucid="JP-2007210505-A" load-source="docdb">
      <document-id format="epo">
        <country>JP</country>
        <doc-number>2007210505</doc-number>
        <kind>A</kind>
        <date>20070810</date>
     </document-id>
   </priority-claim>
</priority-claims>

...
<parties>
   <applicants>
     <applicant mxw-id="PPAR364824575" load-source="docdb" sequence="1" format="epo">
        <addressbook>
          <last-name>ICHIKAWA CO LTD</last-name>
          <address>
             <country>JP</country>
           </address>
        </addressbook>
     </applicant>
     <applicant mxw-id="PPAR364813372" load-source="docdb" sequence="1" format="intermediate">
        <addressbook>
          <last-name>ICHIKAWA CO., LTD.</last-name>
        </addressbook>
     </applicant>

 ...

WO-2009022636-A1 after reload:

<priority-claims>
   <priority-claim mxw-id="PPC159257708" load-source="patent-office">
     <document-id format="original">
       <country>JP</country>
       <doc-number>2007-210505</doc-number>
       <date>20070810</date>
     </document-id>
   </priority-claim>
   <priority-claim mxw-id="PPC87648930" ucid="JP-2007210505-A" load-source="docdb">
     <document-id format="epo">
       <country>JP</country>
       <doc-number>2007210505</doc-number>
       <kind>A</kind>
       <date>20070810</date>
     </document-id>
   </priority-claim>

...

<parties>
   <applicants>
     <applicant mxw-id="PPAR364824575" load-source="docdb" sequence="1" format="epo">
       <addressbook>
         <last-name>ICHIKAWA CO LTD</last-name>
         <address>
           <country>JP</country>
         </address>
       </addressbook>
     </applicant>
     <applicant mxw-id="PPAR364813372" load-source="docdb" sequence="1" format="intermediate">
       <addressbook>
         <last-name>ICHIKAWA CO., LTD.</last-name>
       </addressbook>
     </applicant>
     <applicant mxw-id="PPAR1081843961" load-source="patent-office" sequence="1" format="original">
       <addressbook>
         <name>ICHIKAWA CO., LTD.</name>
         <address>
           <address-1>14-15, Hongo 2-chome, Bunkyo-ku, Tokyo 1130033</address-1>
           <country>JP</country>
         </address>
       </addressbook>
     </applicant>

---

February 2016: Details for US grants reload

We will reload the <application-reference> and <publication-reference> containers for approximately 3.6 million US grants. We will be adding entity-status at the publication-reference level as well as original format application-reference information. We will also fill in missing art unit (see details below).

We plan to complete this load between February 22 and March 7, 2016.

The reload will include the missing us-art-unit attribute for US grants from 2001-2004. Additionally, we will add the national application number in its original format. For example:

US-6500018-B1 before update:

<application-reference mxw-id="PAPP61299422" ucid="US-94451301-A" us-series-code="09" load-source="docdb">
  <document-id format="epo">
    <country>US</country>
    <doc-number>94451301</doc-number>
    <kind>A</kind>
    <date>20010831</date>
    <lang>EN</lang>
  </document-id>
</application-reference>

US-6500018-B1 after update:

<application-reference ucid="US-94451301-A" us-series-code="09" us-art-unit="2833">
    <document-id mxw-id="PAPP61299422" load-source="docdb" format="epo">
       <country>US</country>
       <doc-number>94451301</doc-number>
       <kind>A</kind>
       <date>20010831</date>
       <lang>EN</lang>
    </document-id>
   <document-id mxw-id="PAPP95329224" load-source="patent-office" format="original">
       <country>US</country>
       <doc-number>09944513</doc-number>
       <date>20010831</date>
       <lang>EN</lang>
   </document-id>
</application-reference>
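
After the update, an application-reference can carry one document-id per format. The following minimal sketch, using Python's standard xml.etree.ElementTree module on the US-6500018-B1 example above, collects the doc-number for each format. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# The updated application-reference from this note.
SAMPLE = """
<application-reference ucid="US-94451301-A" us-series-code="09" us-art-unit="2833">
    <document-id mxw-id="PAPP61299422" load-source="docdb" format="epo">
       <country>US</country>
       <doc-number>94451301</doc-number>
       <kind>A</kind>
       <date>20010831</date>
       <lang>EN</lang>
    </document-id>
   <document-id mxw-id="PAPP95329224" load-source="patent-office" format="original">
       <country>US</country>
       <doc-number>09944513</doc-number>
       <date>20010831</date>
       <lang>EN</lang>
   </document-id>
</application-reference>
"""


def doc_numbers_by_format(xml_text: str) -> Dict[Optional[str], Optional[str]]:
    """Map each document-id format attribute to its doc-number text."""
    root = ET.fromstring(xml_text.strip())
    return {
        doc.get("format"): doc.findtext("doc-number")
        for doc in root.findall("document-id")
    }


print(doc_numbers_by_format(SAMPLE))
```

Keying on the format attribute avoids relying on the order of the document-id elements.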

In the case of entity-status, this will be added to the publication reference.

Example of new entity-status attribute in US-7318729-B2:

<publication-reference fvid="75564347" ucid="US-7318729-B2" entity-status="small">
   <document-id>
       <country>US</country>
       <doc-number>7318729</doc-number>
       <kind>B2</kind>
       <date>20080115</date>
       <lang>EN</lang>
   </document-id>
</publication-reference>

Feb 2016: Details for JP reload – rekey corrections

We have identified about 2.5 million JP records that were "re-keyed" to fix their publication numbers. The new records were loaded; however, the old records were never marked as deleted. These effectively duplicate records will be reloaded and marked as deleted between February 13-17, 2016.

DTD Revised from v2.0 to v2.1

See XML DTD and Schemas (released: Oct 19, 2015). Changes to the actual data will not begin publishing until November 30, 2015.

New CLAIMS Global content coverage page

See Claims Global Data Coverage (released: Oct 19, 2015)

Custom Text Web Service (TWS)

See Custom Service Providing Application-Centric Integrated View TWS

Reporting service

See Reporting

Citation service

See Citations

Family service

See Family

Main differences between CD1.5 and CD2.0

  • The web service authentication method has changed from httpauth to using the http headers to pass x-user and x-password to the service
  • With 2.0, JSON responses are now wrapped in a response container: a CD1.5 response of {responseHeader:...} becomes {status: success, time:1 ...content:{responseHeader:....}} in CD2.0
  • New reporting service allows authorized users to produce CSV formatted data sets from lists of patent numbers or search results
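
Client code written for CD1.5 can be adapted to the first two differences with a small helper, sketched below. The header names x-user and x-password and the status/content wrapper come from the list above. The function names are hypothetical, and the sketch makes no request, since the service URL is not given here.

```python
import json
from typing import Any, Dict


def auth_headers(user: str, password: str) -> Dict[str, str]:
    """Build the CD2.0 authentication headers."""
    return {"x-user": user, "x-password": password}


def unwrap(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return the CD1.5-style body from either response shape.

    A CD2.0 response is assumed to carry both "status" and "content" keys;
    anything else is treated as an unwrapped CD1.5 body.
    """
    if "status" in payload and "content" in payload:
        return payload["content"]
    return payload


cd20 = json.loads('{"status": "success", "time": 1, "content": {"responseHeader": {}}}')
print(unwrap(cd20))
```

Passing responses through `unwrap` lets the rest of a client keep working with the CD1.5 structure.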