...
Parameter | Description |
---|---|
nthreads | The number of parallel processes used for extraction. Data is extracted in parallel by default to increase speed. As a rule of thumb, set this parameter to the number of CPU cores on the machine. |
batchsize | The number of documents to extract per thread. If you know what kind of content you are extracting, you can tune this parameter to increase speed: for example, bibliographic-only content benefits from a larger value, while full-text content benefits from a smaller value. |
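
To make the interaction between `nthreads` and `batchsize` concrete, here is a minimal Python sketch of the general idea: documents are split into batches of `batchsize`, and the batches are handed to `nthreads` parallel workers. It is not the tool's actual implementation. The function names, the `extract_batch` stand-in, and the default `batchsize=100` are assumptions made for illustration, and a thread pool stands in for the tool's parallel processes.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def default_nthreads():
    """Rule of thumb from the table: one worker per CPU core."""
    # os.cpu_count() can return None, so fall back to a single worker.
    return os.cpu_count() or 1


def make_batches(doc_ids, batchsize):
    """Split document ids into consecutive batches of at most `batchsize`."""
    return [doc_ids[i:i + batchsize] for i in range(0, len(doc_ids), batchsize)]


def extract_batch(batch):
    """Hypothetical stand-in for the real extraction of one batch."""
    return [f"extracted:{doc_id}" for doc_id in batch]


def run_extraction(doc_ids, nthreads=None, batchsize=100):
    """Extract all documents, one batch at a time per worker.

    batchsize=100 is an arbitrary placeholder, not a recommended value.
    """
    nthreads = nthreads or default_nthreads()
    batches = make_batches(doc_ids, batchsize)
    # A thread pool stands in for the tool's parallel workers in this sketch.
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        # pool.map yields results in the same order as the input batches.
        results = list(pool.map(extract_batch, batches))
    return [item for batch_result in results for item in batch_result]
```

A larger `batchsize` means fewer, bigger units of work per worker, which suits small records such as bibliographic entries. A smaller `batchsize` keeps each unit of work manageable when individual documents are large, as with full text.
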
...