Package: orderanalyzer 1.0.1

orderanalyzer: Extracting Order Position Tables from PDF-Based Order Documents

Functions for extracting text and tables from PDF-based order documents. The package provides an n-gram-based approach for identifying the language of an order document and uses the R package 'pdftools' to extract the document's text. If the PDF contains only an image (because it is a scanned document), the R package 'tesseract' is used for OCR. The package also identifies and extracts order position tables from order documents using a clustering approach.
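To make the n-gram idea concrete, here is a minimal base-R sketch of character-trigram language identification. It is not the package's implementation: the helper names (`char_ngrams`, `ngram_profile`, `profile_similarity`, `guess_language`), the tiny reference texts, and the use of cosine similarity are all assumptions chosen for illustration.

```r
# Illustrative sketch only; not the code used by 'orderanalyzer'.

# Split a string into overlapping character n-grams (lower-cased,
# runs of whitespace collapsed to a single space).
char_ngrams <- function(text, n = 3) {
  text <- tolower(gsub("\\s+", " ", text))
  if (nchar(text) < n) return(character(0))
  starts <- seq_len(nchar(text) - n + 1)
  substring(text, starts, starts + n - 1)
}

# Relative n-gram frequencies as a named numeric vector.
ngram_profile <- function(text, n = 3) {
  counts <- table(char_ngrams(text, n))
  setNames(as.numeric(counts) / sum(counts), names(counts))
}

# Cosine similarity between two n-gram profiles.
profile_similarity <- function(p, q) {
  shared <- intersect(names(p), names(q))
  sum(p[shared] * q[shared]) / (sqrt(sum(p^2)) * sqrt(sum(q^2)))
}

# Return the name of the reference text whose profile is most similar.
guess_language <- function(text, references, n = 3) {
  target <- ngram_profile(text, n)
  scores <- vapply(
    references,
    function(ref) profile_similarity(target, ngram_profile(ref, n)),
    numeric(1)
  )
  names(scores)[which.max(scores)]
}

# Hypothetical, deliberately tiny reference corpus (one sentence per language).
references <- c(
  english = "please deliver the following items to the address stated in this order",
  german  = "bitte liefern sie die folgenden artikel an die in dieser bestellung genannte adresse"
)

guess_language("we would like to order the items listed below", references)
guess_language("wir bestellen die unten genannten artikel", references)
```

In practice the reference profiles would be built from much larger corpora per language; single sentences are used here only to keep the sketch self-contained.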

Authors: Michael Scholz [cre, aut], Joerg Bauer [aut]

orderanalyzer_1.0.1.tar.gz
orderanalyzer_1.0.1.zip (r-4.7), orderanalyzer_1.0.1.zip (r-4.6), orderanalyzer_1.0.1.zip (r-4.5)
orderanalyzer_1.0.1.tgz (r-4.6-any), orderanalyzer_1.0.1.tgz (r-4.5-any)
orderanalyzer_1.0.1.tar.gz (r-4.7-any), orderanalyzer_1.0.1.tar.gz (r-4.6-any)
orderanalyzer_1.0.1.tgz (r-4.6-emscripten)

# Install 'orderanalyzer' in R:
install.packages('orderanalyzer', repos = c('https://michael-scholz-dev.r-universe.dev', 'https://cloud.r-project.org'))

This package does not link to any GitHub/GitLab/R-Forge repository. No issue tracker or development information is available.

Score 1.00, 486 downloads, 3 exports, 38 dependencies

Last updated from: 1080393bb8. Checks: 9 OK. Indexed: yes.

Target                Result  Time
linux-devel-x86_64    OK      135
source / vignettes    OK      160
linux-release-x86_64  OK      154
macos-release-arm64   OK      146
macos-oldrel-arm64    OK      165
windows-devel         OK      123
windows-release       OK      150
windows-oldrel        OK      108
wasm-release          OK      110

Exports: extractTables, extractText, identifyLanguage

Dependencies: cli, cpp11, data.table, digest, dplyr, fastmatch, generics, glue, ISOcodes, jsonlite, lattice, lifecycle, lubridate, magrittr, Matrix, matrixcalc, pillar, pkgconfig, purrr, quanteda, R6, Rcpp, rlang, rlist, SnowballC, stopwords, stringi, stringr, tibble, tidyr, tidyselect, timechange, utf8, vctrs, withr, XML, xml2, yaml