2018 Notes

December 2018: Rekey of some IN apps

In March 2018 we solved most of the issues related to the numbering format of Indian records. Since then, records corresponding to Indian applications published before January 2016 have had a publication number format of YYYYOONNNNN in CLAIMS Direct.

We recently discovered that there were still a few records which contained the old office codes. This created some duplicates such as IN-1992BO00073-A/IN-1992MU00073-A.

In order to solve this issue we rekeyed those records with the old office codes, following the Indian patent office rules:

BO --> MU
CA --> KO
MA --> CH

Examples:
IN-1992BO00073-A is now IN-1992MU00073-A
IN-1992CA00749-A is now IN-1992KO00749-A
IN-1998MA01586-A is now IN-1998CH01586-A

As a result of this process 7998 IN records have been marked "deleted".
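The office-code mapping above can be sketched in a few lines of Python. This is only an illustration, not the process IFI actually ran: the function name rekey_ucid and the regular expression for the ucid layout are assumptions inferred from the three examples.

```python
import re

# Old-to-new office codes, taken from the mapping listed in this note.
OFFICE_CODE_MAP = {"BO": "MU", "CA": "KO", "MA": "CH"}

# Assumed ucid layout for pre-2016 IN records: IN-YYYYOONNNNN-K
# (inferred from the examples; not an official specification).
UCID_RE = re.compile(r"^IN-(\d{4})([A-Z]{2})(\d{5})-([A-Z]\d?)$")


def rekey_ucid(ucid: str) -> str:
    """Return the ucid with an old office code replaced by its new code.

    Anything that does not match the assumed layout, or that already
    carries a current office code, is returned unchanged.
    """
    match = UCID_RE.match(ucid)
    if match is None:
        return ucid
    year, office, sequence, kind = match.groups()
    office = OFFICE_CODE_MAP.get(office, office)
    return f"IN-{year}{office}{sequence}-{kind}"
```

For instance, rekey_ucid("IN-1992BO00073-A") returns "IN-1992MU00073-A", while a ucid that already uses MU, KO, or CH passes through untouched.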

September 2018: Coverage of EP A4 documents

EP kind code A4 documents cover the supplementary search reports as issued by the EPO following PCT applications entering the EP regional phase. Until now these documents were not fully integrated into the normal EPO publication flow of A and B publications.

A4 data is now available in CLAIMS Direct in both XML and PDF format. So far, the entire year 2017 and the first half of 2018 are covered.

The EPO plans to deliver future A4 documents in batches every half year.

August 2018: Release of CLAIMS Direct Web Services version 3.5

CLAIMS Direct Web Services version 3.5 was released on Saturday, August 25th. The following changes have been implemented in this release:

  • RFC: if no first inventive CPC is available, use first inventive from IPCR
    The reporting column cpc-first-inventive will use the first inventive IPCR if no CPC first inventive value is present.
  • RFC: create new endpoint that retrieves all associated attachments as a zip archive
    This new endpoint, /attachment/fetchall, retrieves all attachments available for a specific ucid and bundles them into a single zip archive. Please see /attachment/fetchall for service details.
  • RFC: (BETA) enable attachment manipulation of image type and size
    A new set of parameters is available on the /attachment/fetch endpoint that will convert image types to common web formats, including jpeg and png. Please see /attachment/fetch for the additional parameters.
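The fallback described in the first item of this list (use the first inventive IPCR when no first inventive CPC exists) amounts to a simple "first non-empty value" rule. The sketch below is illustrative only: the function name and its arguments are hypothetical and do not reflect the actual web service implementation.

```python
from typing import Optional


def cpc_first_inventive(first_inventive_cpc: Optional[str],
                        first_inventive_ipcr: Optional[str]) -> Optional[str]:
    """Pick the value for the cpc-first-inventive reporting column.

    Prefer the first inventive CPC; if it is missing or empty, fall back
    to the first inventive IPCR. Returns None when neither is present.
    """
    if first_inventive_cpc:
        return first_inventive_cpc
    return first_inventive_ipcr or None
```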

July 2018: Extending PDF collection

In the last few weeks, we have added full document PDFs for the following countries:

  • CA - Documents published from January 2018 now have full document PDFs. In the next few weeks, we will also be loading PDFs for the Canadian backfile.
  • KR - We have added PDF files to the pre-2006 records which were loaded in January.
  • RU - Full document PDFs are now available for Russian documents published from 2005 up to the current date. We have also loaded PDFs for some pre-2005 records, although this collection is not complete.
  • WO - PCT documents now have PDF files.

We will soon expand our coverage even further by processing PDF files for India, focusing first on applications and later adding granted patents.

April 2018: WO records in Asian languages

We have processed a WIPO backfile of records in Asian languages. The total number of records included in this backfile with new or updated Asian language text is 211915 and the load-id is 304475. We are currently processing machine translations to English for these records.

April 2018: Translations for EP and WO records

As we announced in March, we have added English translations to EP and WO records that were filed in languages other than English.

For EP records these are the translated fields that we added:

abstracts = 332878
descriptions = 1389918
claims = 859589

For WO records:

abstracts = 4262
descriptions = 1070417
claims = 1070539


In the next few days, we plan to load a backfile of descriptions and claims for WO applications published in Chinese, Japanese, and Korean, going back to 1978. This data will include both the original language and English translations.

April 2018: Correction of issues in the ifi-integrated-data section

  • Unexpected expiration dates in applications

CLAIMS Direct provides calculated anticipated and adjusted expiration dates for granted patents (publication-type=G). We recently discovered that some applications (publication-type=A) contained expiry dates in the ifi-integrated-content section. To maintain data consistency, we have removed those unexpected expiry dates as of load-id 302070.

  • Incorrect status in some Japanese records with kind code U

The IFI calculated status of some JP applications with kind code U was showing as expired when the correct status should have been granted. The origin of the problem was that JP laws changed around 1994 and therefore kind code U has different meanings before and after the changes. Though this was considered in IFI's calculation rules, some JP records were still being calculated incorrectly as publication-type=G when they should have been publication-type=A. This error in the publication type caused the status to be calculated incorrectly. This has now been fixed in all affected records and they have been reloaded as of load-id 302163.

March 2018: Reload of some Australian and Indian records

In the last few days we have solved a couple of issues caused by problems in the raw data:

  • Future AU publication dates

When processing data from the Australian national register, we found several records with OPI dates well into the future. According to the patent office, there was an issue with the optical character recognition for these dates. Although some of these dates have already been corrected, it's taking longer than expected for the patent office to fix the problem. To prevent this from happening again, we have changed these future dates to our default entry for dates that we know to be incorrect (00010101).
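A minimal sketch of this kind of date clean-up is shown below, assuming dates are handled as YYYYMMDD strings. The function name sanitize_opi_date is hypothetical, and treating unparseable dates the same way as future dates is an added assumption rather than something stated in this note.

```python
from datetime import date, datetime

# Default entry used for dates known to be incorrect (from the note above).
DEFAULT_INVALID_DATE = "00010101"


def sanitize_opi_date(raw: str, today: date) -> str:
    """Return the YYYYMMDD date unchanged unless it lies in the future.

    Future dates are replaced with the default entry. Unparseable values
    are also replaced, which is an assumption made for this sketch.
    """
    try:
        parsed = datetime.strptime(raw, "%Y%m%d").date()
    except ValueError:
        return DEFAULT_INVALID_DATE
    if parsed > today:
        return DEFAULT_INVALID_DATE
    return raw
```

Passing today as an argument, rather than calling date.today() inside the function, keeps the check reproducible when a load is reprocessed later.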

  • IN numbering formats

For most Indian patent authorities, we receive data from the Indian patent office without a publication number, so we use the application number to fill the publication reference element and to build a UCID. In general, the application reference keeps the original application number format from the patent office and the publication reference (as well as UCID) is calculated by IFI rules.

Starting in 2016, the Indian patent office changed the format of their application numbers. The EPO also recently started to deliver Indian applications in DocDB, but using a different number format. This created some inconsistencies in our data.

To solve this problem, we did a rekey of around 27192 records (load-id 300781).

Now, records corresponding to Indian applications published before January 2016 have a publication number format of YYYYOONNNNN:

YYYY: four-digit year
OO: two-character office code
NNNNN: sequence number zero padded to five digits

From 2016 onward, publication numbers use the format YYYYOTNNNNNN:

YYYY: four-digit year
O: one-character office code (1 for Delhi, 2 for Mumbai, 3 for Kolkata, and 4 for Chennai)
T: type of application *
NNNNNN: sequence number

* type of application:
1 = Ordinary Application
2 = Ordinary-Divisional Application
3 = Ordinary-Patent of Addition Application
4 = Convention Application
5 = Convention-Divisional Application
6 = Convention-Patent of Addition Application
7 = PCT National Phase Application
8 = PCT National Phase-Divisional Application
9 = PCT National Phase Patent of Addition Application
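The two formats can be told apart mechanically, as the following sketch shows. It assumes the number is passed without the IN- prefix and kind code, and that the post-2016 sequence number is always six digits (the note shows six N characters but does not state the padding). The function name parse_in_number and the returned dictionary keys are hypothetical.

```python
import re
from typing import Dict

# Pre-2016 format: YYYY + two-character office code + five-digit sequence.
OLD_FORMAT = re.compile(r"^(\d{4})([A-Z]{2})(\d{5})$")
# 2016 onward: YYYY + one-digit office + one-digit type + six-digit sequence
# (six digits is an assumption based on the NNNNNN placeholder).
NEW_FORMAT = re.compile(r"^(\d{4})([1-4])([1-9])(\d{6})$")

OFFICES = {"1": "Delhi", "2": "Mumbai", "3": "Kolkata", "4": "Chennai"}

APPLICATION_TYPES = {
    "1": "Ordinary Application",
    "2": "Ordinary-Divisional Application",
    "3": "Ordinary-Patent of Addition Application",
    "4": "Convention Application",
    "5": "Convention-Divisional Application",
    "6": "Convention-Patent of Addition Application",
    "7": "PCT National Phase Application",
    "8": "PCT National Phase-Divisional Application",
    "9": "PCT National Phase Patent of Addition Application",
}


def parse_in_number(number: str) -> Dict[str, str]:
    """Split an IN publication number into its labeled components."""
    match = OLD_FORMAT.match(number)
    if match:
        year, office, sequence = match.groups()
        return {"format": "YYYYOONNNNN", "year": year,
                "office": office, "sequence": sequence}
    match = NEW_FORMAT.match(number)
    if match:
        year, office, app_type, sequence = match.groups()
        return {"format": "YYYYOTNNNNNN", "year": year,
                "office": OFFICES[office],
                "type": APPLICATION_TYPES[app_type],
                "sequence": sequence}
    raise ValueError(f"Unrecognized IN number format: {number!r}")
```

Because the older format contains letters and the newer one is purely numeric, the two patterns cannot match the same input, so the order of the checks does not matter.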

March 2018: Updates to Brazil, Taiwan, EP, and WIPO

In the first week of March, we completed an update of Brazilian records which resolves a major delay in BR front file deliveries. This included around 10,000 Brazilian full text records, which contain both the original language and translations. These records include full document PDFs, but are still missing bibliographic data from DocDB since the EPO has not published them yet. We will update any information delivered by DocDB as soon as it is available. BR full text is now up-to-date as of 20180206. Going forward, we expect to maintain front file publications within two weeks of publication.

Similarly, we are loading missing TW full text records published in the last quarter of 2017 and forward.

In parallel, we will be adding English translations to EP and WO records that were filed in languages other than English. We plan to start with EP documents filed in French or German.

January 2018: Korean backfile and Australian pharmaceutical data loaded

We have finished the Korean backfile load announced in December, which added full text to more than 2.2 million KR documents published between 1979 and 2005. English translations will be processed soon.

Pharmaceutical names from the AU register data are now available in the new ifi-annotated-data container. See ifi-annotated-data for more information.

January 2018: Translations for Spanish patents

We are starting a reload of the ES collection to integrate machine translations. This will add English descriptions and claims to more than 370,000 records that currently contain Spanish text only. We will also be completing English translations for titles and abstracts.

Below are translation totals per field after this reload:

invention-title = 1513825
abstract = 818939
description = 370798
claims = 371645

We plan to finish this reload before January 21st.

2017 Notes

December 2017: Korean full text backfile

...