The Attachment Server contains clipped images, drawings, chemistry files, full page images, and other non-textual files delivered by patent offices. Most of the links to the files in the Attachment Server are embedded in CLAIMS Direct XML files. There is a simple web service available to access this content (see Attachments for more information). This document describes the content of the Attachment Server, explains how image files are embedded in the XML files, and discusses how to interface with the server.
Raw Data
Image content from patents is available from data sources in different formats, which include the following:
- Full page images
These images are facsimiles of the original documents delivered as TIFF files or PDF documents, which can be a single page (one file per page) or multi-page (one file per document).
- Embedded images
Embedded images are images that are part of the document. Unlike full page images, they are not the entire document.
- ST.35, mixed mode
Mixed-mode format is an old format delivered by some patent offices. It combines text and binary data in a single file, with the text blocks and binary blocks stored one after another. There is no separate image file; the image data is just another "in-line" block in the file.
- Referenced images
Images are delivered as separate files in TIFF or JPEG format. They are named systematically, and correspond to references within the XML or SGML documents. This is the most common current practice for delivering images. Multi-page TIFFs are indicated when multiple id attributes point to the same file. These are usually only found in JP records. A special tool or viewer is required for proper handling of multi-page TIFFs. A list of viewers can be found here.
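The multi-page TIFF rule for referenced images can be checked mechanically. The following is a minimal sketch, assuming the references appear as img elements carrying id and file attributes; the sample fragment and the function name multi_page_candidates are hypothetical and only illustrate the grouping.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical fragment: two img ids point at the same TIFF file.
SAMPLE = """<description>
  <p><img id="i0001" file="2005100000.tif" img-format="tif"/></p>
  <p><img id="i0002" file="2005100000.tif" img-format="tif"/></p>
  <p><img id="i0003" file="00000001.TIF" img-format="tif"/></p>
</description>"""


def multi_page_candidates(xml_text):
    """Map each file referenced by more than one img id to those ids."""
    ids_by_file = defaultdict(list)
    for img in ET.fromstring(xml_text).iter("img"):
        file_name = img.get("file")
        if file_name:
            ids_by_file[file_name].append(img.get("id"))
    # Keep only files that several ids point to.
    return {name: ids for name, ids in ids_by_file.items() if len(ids) > 1}


candidates = multi_page_candidates(SAMPLE)
print(candidates)
```

Files reported this way are candidates only; whether a given file really contains several pages still has to be confirmed with a TIFF-aware tool.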
Like images, other non-textual content, such as chemical structures, gene sequences, or mathematical formulae, can also be provided in separate files and stored in special formats.
Attachments in the CLAIMS Direct Patent Database
Images and other files containing non-textual content (collectively "the attachments") are stored in the Attachment Server. Files containing embedded images, as well as mixed-mode files, are pre-processed before being loaded into the CLAIMS Direct Patent Database. Links to the attachments are generally referenced in the CLAIMS Direct XML files.
IMG Element
The most common way to find embedded/referenced images in the full text of Alexandria XML files is by using the img element.
Depending on the data source, different parameters can be found in this element, including the following most relevant attributes:
Attribute | Description |
---|---|
id | Unique image identifier |
file | Name of the attachment file |
he | Height of the image described in pixels or percentage |
wi | Width of the image described in pixels or percentage |
img-format | Format of the image file, which includes the following options: |
img-content | Describes the kind of content in the image file. Through the years, different values have been allowed in the Note: The current default value is |
The img element may be found in the abstract, description, or claims sections of the XML document:
Abstract Section
    <abstract mxw-id="PA87583406" lang="EN" load-source="patent-office">
      <p num="pa01">This invention relates to conduits (12) that allow communication of fluids from one portion of a patient's body to another; and, more particularly, to a blood flow conduit to allow communication from a heart chamber to a vessel or vice versa, and/or vessel to vessel
        <img id="iaf01" file="imgaf001.tif" wi="78" he="89" img-content="drawing" img-format="tif"/>
      </p>
    </abstract>
Description Section
    <p num="p0078">For screening purposes a GST and 6 x His fusion of the LBD (from amino acids 155 of hLXRalpha to 447) of human LXRalpha is constructed by first cloning a Gateway cassette (Invitrogen) in frame into the Sma I site of the pAGGHLT Polylinker (Pharmingen) [...] Primers used for Amplification are:
      <img id="ib0004" file="imgb0004.tif" wi="165" he="15" img-content="dna" img-format="tif"/>
      <!-- [...] -->
    </p>
Claims Section
    <claim-text>A phosphoramidite reagent of the formula I
      <img file="00270001.tif" id="img-00270001" he="32" wi="91" img-format="tif" img-content="cf"/>
      wherein Y and Y' each independently are selected from optionally substituted C<sub>1-6</sub>-alkyl or Y and Y' together with the nitrogen to which they are bonded form a non-aromatic [...]
    </claim-text>
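Because the img element looks the same in the abstract, description, and claims, one routine can collect image references from any of these sections. The sketch below uses Python's standard xml.etree.ElementTree on an abridged copy of the claims example; the function name list_images is an illustrative choice, not part of any CLAIMS Direct tooling.

```python
import xml.etree.ElementTree as ET

# Claim text abridged from the Claims Section example.
CLAIM = (
    '<claim-text>A phosphoramidite reagent of the formula I '
    '<img file="00270001.tif" id="img-00270001" he="32" wi="91" '
    'img-format="tif" img-content="cf"/> wherein Y and Y\' each '
    'independently are selected [...] </claim-text>'
)


def list_images(xml_text):
    """Return one dict of attributes per img element, in document order."""
    root = ET.fromstring(xml_text)
    return [dict(img.attrib) for img in root.iter("img")]


images = list_images(CLAIM)
for image in images:
    # img-content is read with .get() because not every source supplies it.
    print(image["id"], image["file"], image.get("img-content"))
```

Returning plain dictionaries keeps whatever attributes a given data source supplies, which matters because the attribute set differs between sources.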
Drawings Section
Attachment references can be embedded in the img element as described above, but they can also be in a drawings section of the XML document. Pages published at the end of the original patent document and containing drawings are also referenced in this section.
    <drawings mxw-id="PDW3055834" load-source="patent-office">
      <figure num="1">
        <img file="00380001.tif" id="img-00380001" he="228" wi="180" img-format="tif" img-content="dr"/>
      </figure>
    </drawings>
A figure id can be used to reference a drawing in the full text as follows:
    <p num="p0031">Referring now to <figref idrefs="f0001">FIGURES 1A and 1B</figref>, a coronary artery bypass is accomplished by disposing a conduit 12 (<figref idrefs="f0001">Fig. 1B</figref>) in a heart wall or myocardium MYO of a patient's heart PH (<figref idrefs="f0001">Fig. 1A</figref>). [...] </p>
    <drawings mxw-id="PDW10967064" load-source="patent-office">
      <figure id="f0001" num="1">
        <img id="if0001" file="imgf0001.tif" wi="140" he="230" img-content="drawing" img-format="tif"/>
      </figure>
    </drawings>
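The link from a figref in the text to the image file of the cited figure can be followed in two steps: index the figure elements by id, then look up each figref. The sketch below is one way to do this, under two assumptions that the source does not state: the fragment is wrapped in a hypothetical doc root element so that it parses as one document, and idrefs may hold several space-separated ids.

```python
import xml.etree.ElementTree as ET

# Abridged from the figref example, wrapped in a hypothetical <doc> root.
SAMPLE = """<doc>
  <p num="p0031">Referring now to <figref idrefs="f0001">FIGURES 1A and 1B</figref>, [...]</p>
  <drawings mxw-id="PDW10967064" load-source="patent-office">
    <figure id="f0001" num="1">
      <img id="if0001" file="imgf0001.tif" wi="140" he="230"
           img-content="drawing" img-format="tif"/>
    </figure>
  </drawings>
</doc>"""


def files_for_figrefs(xml_text):
    """Map each figref's text to the image files of the figures it cites."""
    root = ET.fromstring(xml_text)
    # Index: figure id -> file names of the img elements inside that figure.
    files_by_figure = {
        fig.get("id"): [img.get("file") for img in fig.iter("img")]
        for fig in root.iter("figure")
        if fig.get("id")
    }
    resolved = {}
    for ref in root.iter("figref"):
        files = []
        # Assumption: idrefs may list more than one id, separated by spaces.
        for fig_id in (ref.get("idrefs") or "").split():
            files.extend(files_by_figure.get(fig_id, []))
        resolved[ref.text] = files
    return resolved


resolved = files_for_figrefs(SAMPLE)
print(resolved)
```

A figref whose id has no matching figure resolves to an empty list rather than raising an error, which is convenient when figure elements lack an id attribute.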
Chemistry Section
Chemical formulas can be found in img elements with "chem" or "cf", as well as "drawing", values in the img-content attribute. Another way to encode a chemistry-specific embedded image is through the use of the chemistry element:
    <claim-text>Use of a compound according to formula (I), or pharmaceutically acceptable salts or solvates thereof,
      <chemistry id="chem0011" num="0011">
        <img id="ib0071" file="imgb0071.tif" wi="74" he="50" img-content="chem" img-format="tif"/>
      </chemistry>
      wherein [...]
    </claim-text>
The USPTO provides chemical structures in ChemDraw (CDX) and MDL (MOL) formats. References to these special files can be found in the attachment element in the chemistry section:
    <chemistry id="CHEM-US-00001" num="00001">
      <img id="EMI-C00001" he="19.64mm" wi="54.36mm" file="US07307149-20071211-C00001.TIF" alt="embedded image" img-content="table" img-format="tif"/>
      <attachments>
        <attachment idref="CHEM-US-00001" attachment-type="cdx" file="US07307149-20071211-C00001.CDX"/>
        <attachment idref="CHEM-US-00001" attachment-type="mol" file="US07307149-20071211-C00001.MOL"/>
      </attachments>
    </chemistry>
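To pick the CDX or MOL file for a given structure, the attachment elements can be grouped by their attachment-type attribute. This is a minimal sketch based on the example above; the function name structure_files is hypothetical.

```python
import xml.etree.ElementTree as ET

# Copied from the USPTO chemistry example.
CHEMISTRY = """<chemistry id="CHEM-US-00001" num="00001">
  <img id="EMI-C00001" he="19.64mm" wi="54.36mm"
       file="US07307149-20071211-C00001.TIF" alt="embedded image"
       img-content="table" img-format="tif"/>
  <attachments>
    <attachment idref="CHEM-US-00001" attachment-type="cdx"
                file="US07307149-20071211-C00001.CDX"/>
    <attachment idref="CHEM-US-00001" attachment-type="mol"
                file="US07307149-20071211-C00001.MOL"/>
  </attachments>
</chemistry>"""


def structure_files(xml_text):
    """Map each chemistry id to a dict of attachment-type -> file name."""
    root = ET.fromstring(xml_text)
    result = {}
    # iter() also yields the root itself when it is a chemistry element.
    for chem in root.iter("chemistry"):
        result[chem.get("id")] = {
            att.get("attachment-type"): att.get("file")
            for att in chem.iter("attachment")
        }
    return result


files = structure_files(CHEMISTRY)
print(files["CHEM-US-00001"]["mol"])
```

A chemistry element without an attachments child, as in the earlier claim-text example, simply yields an empty dictionary.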
Other Sections
In the same way that chemical structures are referenced in the special "chemistry" section, other attachments also have specific elements in the XML files: sequence-list, math, megatable-doc, and table-external-doc.
Search Report Pages
Search report pages published by some patent authorities are frequently distributed as full page images. References to the image files can be found in the search-report-data container under the doc-page element:
    <search-report-data>
      <doc-page id="srep0001" file="srep0001.tif" wi="154" he="233" type="tif"/>
    </search-report-data>
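In the examples shown in this document, the img, attachment, and doc-page elements all carry the file name in a file attribute. A generic way to list every attachment reference is therefore to collect that attribute regardless of element name. The sketch below assumes this holds for the document at hand; whether other elements such as sequence-list or math use the same attribute is not confirmed here, and the doc root element is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document combining an abstract image and a search report page.
SAMPLE = """<doc>
  <abstract><p><img id="iaf01" file="imgaf001.tif" img-format="tif"/></p></abstract>
  <search-report-data>
    <doc-page id="srep0001" file="srep0001.tif" wi="154" he="233" type="tif"/>
  </search-report-data>
</doc>"""


def referenced_files(xml_text):
    """Return (element name, file name) pairs for elements with a file attribute."""
    root = ET.fromstring(xml_text)
    return [(el.tag, el.get("file")) for el in root.iter() if el.get("file")]


refs = referenced_files(SAMPLE)
for tag, file_name in refs:
    print(tag, file_name)
```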
File Naming
There is no consensus on how referenced files are named. Every publishing authority follows its own rules, which also change over the years. The following tables show representative attachment listings for patents from different publishing authorities.
Attachment list for: EP-2207108-A1-20100714
File | Size |
---|---|
DOCUMENT.PDF | 698793 |
imgaf001.tif | 6725 |
imgb0001.tif | 321 |
imgb0002.tif | 315 |
imgf0001.tif | 4219 |
imgf0002.tif | 6892 |
srep0001.tif | 33861 |
Attachment list for: WO-2010072727-A1-20100701
File | Size |
---|---|
imgf000004_0001.tif | 1464 |
imgf000010_0001.tif | 435 |
imgf000010_0002.tif | 435 |
imgf000037_0001.tif | 7738 |
imgf000040_0001.tif | 9871 |
imgf000041_0001.tif | 6357 |
imgf000042_0001.tif | 1935 |
imgf000043_0001.tif | 1464 |
imgf000046_0001.tif | 1372 |
Attachment list for: WO-2009081462-A1-20090702
File | Size |
---|---|
JPOXMLDOC01-appb-D000003.tif | 3974 |
JPOXMLDOC01-appb-D000004.tif | 7764 |
JPOXMLDOC01-appb-D000005.tif | 6896 |
JPOXMLDOC01-appb-D000006.tif | 4642 |
JPOXMLDOC01-appb-D000007.tif | 4252 |
JPOXMLDOC01-appb-D000008.tif | 4270 |
JPOXMLDOC01-appb-D000009.tif | 8998 |
JPOXMLDOC01-appb-T000001.jpg | 92907 |
JPOXMLDOC01-appb-T000002.jpg | 65901 |
Attachment list for: US-7307149-B2-20071211
File | Size |
---|---|
US07307149-20071211-C00001.CDX | 3431 |
US07307149-20071211-C00001.MOL | 681 |
US07307149-20071211-C00001.TIF | 3056 |
US07307149-20071211-D00001.TIF | 10470 |
US07307149-20071211-D00002.TIF | 21196 |
US07307149-20071211-D00003.TIF | 13830 |
US07307149-20071211-D00004.TIF | 14215 |
US07307149-20071211-D00005.TIF | 93857 |
US07307149-20071211-D00006.TIF | 12986 |
US07307149-20071211-D00007.TIF | 16709 |
US07307149-20071211-D00008.TIF | 4793 |
US07307149-20071211-D00009.TIF | 4896 |
US07307149-20071211-S00001.XML | 159831 |
Attachment list for: JP-2005100000-A-20050414
File | Size |
---|---|
00000001.TIF | 3759 |
2005100000.pdf | 84862 |
2005100000.pos | 1496 |
2005100000.tif | 81180 |
Attachment list for: KR-100920729-B1-20091007
File | Size |
---|---|
1020070098852.pdf | 2406235 |
112007070719375-sdosl.app | 2032 |
112007081317753-pat00001.tif | 1950 |
112007081317753-pat00002.tif | 1950 |
112007081317753-pat00003.tif | 1950 |
112007081317753-pat00005.jpg | 204870 |
112007081317753-pat00006.jpg | 180538 |
112007081317753-pat00007.jpg | 279496 |
112007081317753-pat00008.jpg | 166127 |
112007081317753-pat00009.jpg | 170515 |
112007081317753-pat00010.jpg | 244507 |
112007081317753-pat00011.jpg | 156090 |
112007081317753-pat00012.jpg | 156553 |
112008080483320-pat00013.jpg | 37395 |
R1020070098852.jpg | 252603 |
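Since file names follow no common scheme, the file extension is often the most dependable hint about what a listing contains. As an illustration, the sketch below summarizes an abridged copy of the US-7307149-B2-20071211 listing by extension; the list-of-tuples input shape and the function name summarize_by_extension are assumptions made for this example, not the output format of any service.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# (file name, size) pairs abridged from the US-7307149-B2-20071211 listing.
LISTING = [
    ("US07307149-20071211-C00001.CDX", 3431),
    ("US07307149-20071211-C00001.MOL", 681),
    ("US07307149-20071211-C00001.TIF", 3056),
    ("US07307149-20071211-D00001.TIF", 10470),
    ("US07307149-20071211-S00001.XML", 159831),
]


def summarize_by_extension(listing):
    """Count files and total their sizes per lower-cased extension."""
    summary = defaultdict(lambda: {"count": 0, "size": 0})
    for name, size in listing:
        # Lower-case the suffix because extension case varies between sources.
        ext = PurePosixPath(name).suffix.lower().lstrip(".")
        summary[ext]["count"] += 1
        summary[ext]["size"] += size
    return dict(summary)


summary = summarize_by_extension(LISTING)
for ext in sorted(summary):
    print(ext, summary[ext]["count"], summary[ext]["size"])
```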
Data Coverage
Note that the availability of attachments will differ depending on your subscription level. The attachment collection can be queried through our shared service API to get exact file counts for your level.
Country | Type | Format | Years |
---|---|---|---|
AP grants | Full document | multi-page PDF | 1985-2005 |
AT apps | Full document | multi-page PDF | 2005 to present |
AT grants | Full document | multi-page PDF | 1990 to present |
AT utility models | Full document | multi-page PDF | 1994 to present |
AU apps and grants | Full document | multi-page PDF | 1990 to present |
BE apps | Referenced images | TIFF | 1980 to present |
BE apps | Full document | multi-page PDF | 1980 to present |
BE grants | Full document | multi-page PDF | 2015 to present |
BG apps, grants, and utility models | Full document | multi-page PDF | 1994 to present |
BR apps | Full document | multi-page PDF | 2010 to present |
BR grants | Full document | multi-page PDF | 2014 to present |
BR utility models | Full document | multi-page PDF | 2009 to present |
CA apps and grants | Full document | multi-page PDF | 2000 to present |
CH apps and grants | Full document | multi-page PDF | 1980 to present |
CN apps | Referenced images | TIFF | 2011 to present |
CN apps | Full document | multi-page PDF | 1985 to present |
CN grants | Referenced images | TIFF | 2011 to present |
CN grants | Full document | multi-page PDF | 1990 to present |
CN utility models | Referenced images | TIFF | 2011 to present |
CN utility models | Full document | multi-page PDF | 1985 to present |
CS grants | Full document | multi-page PDF | 1980-1993 |
CZ apps, grants, and utility models | Full document | multi-page PDF | 1993 to present |
DD grants | Full document | multi-page PDF | 1980-2003 |
DE apps, grants, | Referenced images | TIFF | 2004 to present |
DK apps and grants | Referenced images | TIFF | 1980 to present |
DK apps and grants | Full document | multi-page PDF | 1980 to present |
EA grants | Full document | multi-page PDF | 2000 to present |
EP apps | Referenced images | TIFF | 1978 to present |
EP apps | Full document | multi-page PDF | 1978 to present |
EP grants | Referenced images | TIFF | 1980 to present |
EP grants | Full document | multi-page PDF | 1980 to present |
ES apps, grants, | Referenced images | TIFF | 2019 to present |
ES apps, grants, | Full document | multi-page PDF | 2007 to present |
FI apps and grants | Referenced images | TIFF | 1980 to present |
FI apps and grants | Full document | multi-page PDF | 1980 to present |
FR apps | Referenced images | TIFF | 1981 to present |
FR apps | Full document | multi-page PDF | 1981 to present |
GB apps | Full document | multi-page PDF | 1980 to present |
HU apps and grants | Full document | multi-page PDF | 1980 to present |
JP apps | Front-page drawings | TIFF | 1980 to present |
JP apps, grants, | Referenced images | multi-page TIFF + POS file | 2004 to present |
JP apps, grants, | Full document | multi-page PDF | 2004 to present |
KR apps | Full document | multi-page PDF | 1983 to present |
KR grants and utility models | Full document | multi-page PDF | 1979 to present |
KR apps, grants, | Referenced images | TIFF, JPEG, APP | 2006 to present |
KR apps, grants, | Full document | multi-page PDF | 2006 to present |
LT grants | Full document | multi-page PDF | 1994 to present |
LU apps | Referenced images | TIFF | 1980 to present |
LU apps | Full document | multi-page PDF | 1980 to present |
LV grants | Full document | multi-page PDF | 1994 to present |
NL apps | Referenced images | TIFF | 1990 to present |
NL apps | Full document | multi-page PDF | 1990 to present |
NL grants | Referenced images | TIFF | 1997 to present |
NL grants | Full document | multi-page PDF | 1997 to present |
PT apps and OA grants | Full document | multi-page PDF | 1986 to present | RU apps1980-2007 |
PT apps and grants | Full document | multi-page PDF | 1986 to present |
RO apps | Full document | multi-page PDF | 2011 to present |
RO grants | Full document | multi-page PDF | 1993 to present |
RU apps, grants, and utility models | Full document | multi-page PDF | 2005 to present |
RU grants and utility models | Referenced images | TIFF and JPEG | 1994 to present |
SI grants | Full document | multi-page PDF | 1992 to present |
SK apps and grants | Full document | multi-page PDF | 1993 to present |
SK utility models | Full document | multi-page PDF | 2008 to present |
TW apps | Full document | multi-page PDF | 2003 to present |
TW grants | Full document | multi-page PDF | 2000 to present |
TW utility models | Full document | multi-page PDF | 2004 to present |
US apps | Full document | multi-page PDF | 2001 to present |
US grants | Full document | multi-page PDF | 1920 to present |
US apps and grants | Referenced images | TIFF | 2001 to present |
US apps and grants | Complex work units | CDX, MOL, NB, XML | 2001 to present |
WO | Full document | multi-page PDF | 1978 to present |
WO | Referenced images | TIFF, JPEG | 1978 to present |