
September 2019: DTD v2.5 released, CPCI added, and bulk attachments service released

1) We have released v2.5 of the DTD. This release includes the following changes (see the XML DTD and Schemas for more details):

  • added CPCI value to schema attribute on classification-cpc
  • made classification-cpc unbounded (*) as child of classifications-cpc

2) The EPO has announced that CPC-International will be rolled out starting with the next DocDB data delivery. From week 2019/36 onward all CPC classifications will have classification-scheme=“CPCI”. Classification-schemes “CPC” and “CPCNO” will be discontinued from that week onward. More information about this change can be found in Content Notifications.

3) We are releasing a new bulk attachments web service on September 1. This service is an interface to download attachments in bulk format. It is modeled on the existing XML update service which is used by on-site customers to manage updates to their CLAIMS Direct instance. Please see Bulk Attachments for full details.

July 2019: Soviet Union data added

In July we processed full text data from the former Soviet Union. IFI's collection now includes Author Certificates and Patents from 1924 to 1993.

June 2019: Six new full text countries added

Over the past few weeks we have added full text and PDF files for ARIPO, Bulgaria, Eurasia, OAPI and Romania to our Premium+ collection. Subscribers to either Premium or Premium+ also now have access to full text for Slovakia. Details can be seen in our data coverage table.

May 2019: Full text added for Czech Republic/Czechoslovakia

Full text and PDF files for Czech Republic (CZ) and the former Czechoslovakia (CS) are now available for Premium and Premium+ subscribers, respectively.

May 2019: Kind code changes in Norway and the Netherlands

Following EPO recommendations, we have changed the kind codes of some NL and NO records as follows:

| Country Code | Old Kind Code | New Kind Code |
|---|---|---|
| NL | C | C1 or C2 |
| NO | A | L |

May 2019: Change in CS and CZ document numbers

As announced in April, CS and CZ records changed the format of their patent numbers following DocDB recommendations. Some of the changes included moving the year to the end for CZ publication numbers dating prior to 2000 and suppressing embedded zeroes for CZ publication numbers, applications, and priority numbers dating from 2000 onward. The following table illustrates the changes.

| Country Code | Kind/Year | Old Format | New Format |
|---|---|---|---|
| CS | A1, A2, A3 | YYnnnnn | (nnnnn)nYY |
| CZ | publications before 2000 | YYnnnnn | (nnnn)nYY |
| CZ | publications from 2000 onward | CCYYnnnn | CCYY(nnn)n |
| CZ | applications and priorities from 2000 onward | CCYYnnnn | CCYY(nnn)n |

The rekey, as initially described by the EPO, would have resulted in collisions. For example:

  • CZ-9102003-A3 (published 19930113) would have been rekeyed as CZ-200391-A3
  • CZ-20030091-A3 (published 20030618) would have been rekeyed as CZ-200391-A3

To avoid these collisions, the rekeyed ucids of 104 CS records delivered by the EPO match the application number and don't follow the rules above.
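
The collision can be reproduced with a minimal Python sketch. The two helper functions below are hypothetical and encode only one reading of the table above (year moved to the end for pre-2000 CZ publications, embedded zeroes suppressed in both cases); they are not the EPO's or IFI's actual rekey code.

```python
def rekey_cz_pre_2000(number: str) -> str:
    """YYnnnnn -> (nnnn)nYY: move the two-digit year to the end
    and drop the leading zeroes of the serial part."""
    year, serial = number[:2], number[2:]
    return (serial.lstrip("0") or "0") + year


def rekey_cz_from_2000(number: str) -> str:
    """CCYYnnnn -> CCYY(nnn)n: keep the four-digit year
    and drop the leading zeroes of the serial part."""
    year, serial = number[:4], number[4:]
    return year + (serial.lstrip("0") or "0")


# The two publication numbers from the example above.
published_1993 = rekey_cz_pre_2000("9102003")    # CZ-9102003-A3
published_2003 = rekey_cz_from_2000("20030091")  # CZ-20030091-A3

# Both rules yield the same rekeyed number, hence the collision.
print(published_1993, published_2003, published_1993 == published_2003)
```

Because two different rules map different inputs onto the same string, the year alone cannot be recovered from the rekeyed number, which is why an exception list is needed.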

April 2019: Referenced images now available in Spanish front file

Beginning in January 2019, image files in our attachment server will include referenced images in TIFF format for ES patents and utility models.

April 2019: Five new full text countries added

Over the past week we have added full text and PDF files for Hungary, Lithuania, Latvia, Portugal, and Slovenia to our Premium+ collection. Details can be seen in our data coverage table.

April 2019: East German patents added

We have added description, claims and full document PDFs of DD granted patents published by the former East German patent office between 1980 and 2003. This content is available to Premium and Premium+ subscription levels.

March 2019: Canadian PDF backfile extended

Over the past few weeks we have expanded our collection of Canadian PDFs. Although there are still a few gaps, the addition extends our Canadian PDF coverage back to publication year 2000.

December 2018: Rekey of some IN apps

In March 2018 we solved most of the issues related to the numbering format of Indian records. Since then, records corresponding to Indian applications published before January 2016 have had a publication number format of YYYYOONNNNN in CLAIMS Direct.

We recently discovered that there were still a few records which contained the old office codes. This created some duplicates such as IN-1992BO00073-A/IN-1992MU00073-A.

In order to solve this issue we rekeyed those records with the old office codes, following the Indian patent office rules:

BO --> MU
CA --> KO
MA --> CH

Examples:
IN-1992BO00073-A is now IN-1992MU00073-A
IN-1992CA00749-A is now IN-1992KO00749-A
IN-1998MA01586-A is now IN-1998CH01586-A

As a result of this process 7998 IN records have been marked "deleted".
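
For readers who hold their own lists of Indian ucids, the office code mapping above can be applied with a short Python sketch. The regular expression and the function name `rekey_in_ucid` are assumptions made for illustration; they only cover ucids shaped like the examples in this note (IN-YYYYOONNNNN-kind).

```python
import re

# Old office codes and their replacements, as listed above.
OLD_TO_NEW_OFFICE = {"BO": "MU", "CA": "KO", "MA": "CH"}

# Assumed ucid shape: IN-<4-digit year><2-letter office><5-digit serial>-<kind>
UCID_PATTERN = re.compile(r"^IN-(\d{4})([A-Z]{2})(\d{5})-([A-Z]\d?)$")


def rekey_in_ucid(ucid: str) -> str:
    """Return the ucid with an old office code replaced by the new one.

    Ucids that do not match the assumed shape, or that already use a
    current office code, are returned unchanged.
    """
    match = UCID_PATTERN.match(ucid)
    if match is None:
        return ucid
    year, office, serial, kind = match.groups()
    office = OLD_TO_NEW_OFFICE.get(office, office)
    return f"IN-{year}{office}{serial}-{kind}"


for old in ("IN-1992BO00073-A", "IN-1992CA00749-A", "IN-1998MA01586-A"):
    print(old, "is now", rekey_in_ucid(old))
```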

September 2018: Coverage of EP A4 documents

EP kind code A4 documents cover the supplementary search reports as issued by the EPO following PCT applications entering the EP regional phase. Until now these documents were not fully integrated into the normal EPO publication flow of A and B publications.

A4 data is now available in CLAIMS Direct in both XML and PDF formats. So far, the entire year 2017 and the first half of 2018 are covered.

The EPO plans to deliver future A4 documents in batches every half year.

August 2018: Release of CLAIMS Direct Web Services version 3.5

CLAIMS Direct Web Services version 3.5 was released Saturday August 25th. The following changes have been implemented in this release:

  • RFC: if no first inventive CPC is available, use first inventive from IPCR
    The reporting column cpc-first-inventive will use the first inventive IPCR if no CPC first inventive value is present.
  • RFC: create new endpoint that retrieves all associated attachments as a zip archive
    This new endpoint, /attachment/fetchall, retrieves all attachments available for a specific ucid and bundles them into a single zip archive. Please see /attachment/fetchall for service details.
  • RFC: (BETA) enable attachment manipulation of image type and size
    A new set of parameters is available on the /attachment/fetch endpoint that will convert image types to common web formats, including jpeg and png. Please see /attachment/fetch for the additional parameters.
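
The fallback described in the first item above amounts to a simple "first available value" rule. The Python sketch below illustrates that rule only; the function and argument names are hypothetical and do not reflect how the reporting column is computed inside the web services.

```python
from typing import Optional


def cpc_first_inventive(first_cpc: Optional[str],
                        first_ipcr: Optional[str]) -> Optional[str]:
    """Return the first inventive CPC if present, otherwise fall back
    to the first inventive IPCR (which may itself be missing)."""
    return first_cpc if first_cpc else first_ipcr


# Placeholder classification symbols, for illustration only.
print(cpc_first_inventive("H04L 9/32", "H04L 9/00"))  # CPC present: CPC is used
print(cpc_first_inventive(None, "H04L 9/00"))         # no CPC: IPCR is used
```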

July 2018: Extending PDF collection

In the last few weeks, we have added full document PDFs to the following countries:

  • CA - Documents published from January 2018 now have full document PDFs. In the next few weeks, we will also be loading PDFs for the Canadian backfile.
  • KR - We have added PDF files to the pre-2006 records which were loaded in January.
  • RU - Full document PDFs are now available for Russian documents published from 2005 up to the current date. We have also loaded PDFs for some pre-2005 records, although this collection is not complete.
  • WO - PCT documents now have PDF files.

We will soon expand our coverage even further by processing PDF files for India, focusing first on applications and later adding granted patents.

April 2018: WO records in Asian languages

We have processed a WIPO backfile of records in Asian languages. The total number of records included in this backfile with new or updated Asian language text is 211915 and the load-id is 304475. We are currently processing machine translations to English for these records.

April 2018: Translations for EP and WO records

As we announced in March, we have added English translations to EP and WO records that were filed in languages other than English.

For EP records these are the translated fields that we added:

abstracts = 332878
descriptions = 1389918
claims = 859589

For WO records:

abstracts = 4262
descriptions = 1070417
claims = 1070539


In the next few days, we plan to load a backfile of descriptions and claims for WO applications published in Chinese, Japanese, and Korean, going back to 1978. This data will include both the original language and English translations.

April 2018: Correction of issues in the ifi-integrated-data section

  • Unexpected expiration dates in applications

CLAIMS Direct provides calculated anticipated and adjusted expiration dates for granted patents (publication-type=G). We recently discovered that some applications (publication-type=A) contained expiry dates in the ifi-integrated-content section. To maintain data consistency, we have removed those unexpected expiry dates as of load-id 302070.

  • Incorrect status in some Japanese records with kind code U

The IFI calculated status of some JP applications with kind code U was showing as expired when the correct status should have been granted. The origin of the problem was that JP laws changed around 1994 and therefore kind code U has different meanings before and after the changes. Though this was considered in IFI's calculation rules, some JP records were still being calculated incorrectly as publication-type=G when they should have been publication-type=A. This error in the publication type caused the status to be calculated incorrectly. This has now been fixed in all affected records and they have been reloaded as of load-id 302163.

March 2018: Reload of some Australian and Indian records

In the last few days we have solved a couple of issues caused by problems in the raw data:

  • Future AU publication dates

When processing data from the Australian national register, we found several records with OPI dates well into the future. According to the patent office, there was an issue with the optical character recognition for these dates. Although some of these dates have already been corrected, it's taking longer than expected for the patent office to fix the problem. To prevent this from happening again, we have changed these future dates to our default entry for dates that we know to be incorrect (00010101).

  • IN numbering formats

For most Indian patent authorities, we receive data from the Indian patent office without a publication number, so we use the application number to fill the publication reference element and to build a UCID. In general, the application reference keeps the original application number format from the patent office and the publication reference (as well as UCID) is calculated by IFI rules.

From 2016 the Indian patent office changed the format of their application numbers. The EPO also recently started to deliver Indian applications in DocDB, but using a different number format. This created some inconsistencies in our data.

To solve this problem, we rekeyed approximately 27192 records (load-id 300781).

Now, records corresponding to Indian applications published before January 2016 have a publication number format of YYYYOONNNNN:

YYYY: four-digit year
OO: two-character office code
NNNNN: sequence number zero padded to five digits

From 2016 onward, patent numbers use the format YYYYOTNNNNNN:

YYYY: four-digit year
O: one-character office code (1 for Delhi, 2 for Mumbai, 3 for Kolkata, and 4 for Chennai)
T: type of application *
NNNNNN: sequence number

* type of application:
1 = Ordinary Application
2 = Ordinary-Divisional Application
3 = Ordinary-Patent of Addition Application
4 = Convention Application
5 = Convention-Divisional Application
6 = Convention-Patent of Addition Application
7 = PCT National Phase Application
8 = PCT National Phase-Divisional Application
9 = PCT National Phase Patent of Addition Application
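
The two formats above can be told apart and decoded mechanically. The following Python sketch is one possible parser, assuming the pre-2016 office code is two uppercase letters and the sequence lengths are exactly as written (five and six digits). The function name and the second sample number are made up for illustration.

```python
import re
from typing import Dict

OFFICES_FROM_2016 = {"1": "Delhi", "2": "Mumbai", "3": "Kolkata", "4": "Chennai"}

APPLICATION_TYPES = {
    "1": "Ordinary Application",
    "2": "Ordinary-Divisional Application",
    "3": "Ordinary-Patent of Addition Application",
    "4": "Convention Application",
    "5": "Convention-Divisional Application",
    "6": "Convention-Patent of Addition Application",
    "7": "PCT National Phase Application",
    "8": "PCT National Phase-Divisional Application",
    "9": "PCT National Phase Patent of Addition Application",
}

# YYYYOONNNNN: year, two-character office code, five-digit sequence
OLD_FORMAT = re.compile(r"^(\d{4})([A-Z]{2})(\d{5})$")
# YYYYOTNNNNNN: year, office digit, type digit, six-digit sequence
NEW_FORMAT = re.compile(r"^(\d{4})([1-4])([1-9])(\d{6})$")


def parse_in_number(number: str) -> Dict[str, str]:
    """Split an Indian publication number into its documented parts."""
    match = NEW_FORMAT.match(number)
    if match:
        year, office, app_type, sequence = match.groups()
        return {
            "format": "YYYYOTNNNNNN",
            "year": year,
            "office": OFFICES_FROM_2016[office],
            "type": APPLICATION_TYPES[app_type],
            "sequence": sequence,
        }
    match = OLD_FORMAT.match(number)
    if match:
        year, office, sequence = match.groups()
        return {"format": "YYYYOONNNNN", "year": year,
                "office": office, "sequence": sequence}
    raise ValueError(f"unrecognized IN number format: {number!r}")


print(parse_in_number("1992MU00073"))   # pre-2016 style
print(parse_in_number("201617012345"))  # invented 2016-style number
```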

March 2018: Updates to Brazil, Taiwan, EP, and WIPO

In the first week of March, we completed an update of Brazilian records which resolves a major delay in BR front file deliveries. This included around 10,000 Brazilian full text records, which contain both the original language and translations. These records include full document PDFs, but are still missing bibliographic data from DocDB since the EPO has not published them yet. We will update any information delivered by DocDB as soon as it is available. BR full text is now up-to-date as of 20180206. Going forward, we expect to maintain front file publications within two weeks of publication.

Similarly, we are loading missing TW full text records published in the last quarter of 2017 and forward.

In parallel, we will be adding English translations to EP and WO records that were filed in languages other than English. We plan to start with EP documents filed in French or German.

January 2018: Korean backfile and Australian pharmaceutical data loaded

We have finished the Korean backfile load announced in December, which added full text to more than 2.2 million KR documents published between 1979 and 2005. English translations will be processed soon.

Pharmaceutical names from the AU register data are now available in the new ifi-annotated-data container. See ifi-annotated-data for more information.

January 2018: Translations for Spanish patents

We are starting a reload of the ES collection to integrate machine translations. This will add English descriptions and claims to more than 370,000 records that currently contain Spanish text only. We will also be completing English translations for titles and abstracts.

Below are translation totals per field after this reload:

invention-title = 1513825
abstract = 818939
description = 370798
claims = 371645

We plan to finish this reload before January 21st.

Earlier Release Notes

Release notes prior to 2018 can be seen in the Archived Release Notes section.
