Extract text, glyphs, words and metrics from PDF documents with PHP

SetaPDF-Extractor

Get Word Groups 

This demo uses the word group strategy, which lets you extract groups of related words, such as the words in a column or paragraph. The strategy also includes logic to reassemble words that are hyphenated across line breaks.

This demo shows the extracted text. The next demo visually marks the found word groups in the original PDF document.
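
To make the idea of reassembling hyphenated words concrete, here is a minimal sketch in plain PHP. It is not SetaPDF-Extractor's implementation and does not use its API: the function name `joinHyphenatedLines()` is hypothetical, and the rule it applies (a line ending in a letter followed by a hyphen continues on the next line) is an assumption made for illustration only.

```php
<?php
/**
 * Join the text lines of one word group into a single string and merge
 * words that were split by a trailing hyphen at a line break.
 *
 * Hypothetical helper for illustration; not part of SetaPDF-Extractor.
 *
 * @param string[] $lines Lines of a word group, in reading order.
 */
function joinHyphenatedLines(array $lines): string
{
    $text = '';
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // ignore empty lines
        }

        if ($text === '') {
            $text = $line;
        } elseif (preg_match('/\p{L}-$/u', $text) === 1) {
            // The text so far ends with "letter + hyphen": drop the
            // hyphen and append the next line without a space.
            $text = substr($text, 0, -1) . $line;
        } else {
            // Ordinary line break: separate the lines with one space.
            $text .= ' ' . $line;
        }
    }

    return $text;
}

echo joinHyphenatedLines(['The extractor reas-', 'sembles hyphenated', 'words.']), PHP_EOL;
// prints "The extractor reassembles hyphenated words."
```

This simple rule also merges genuine compounds that happen to break at their hyphen (for example "well-" followed by "known" becomes "wellknown"). A strategy that works on a PDF has glyph positions and metrics available and can make a better-informed decision than this text-only sketch.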
