Auto-Count Symbols in Portable Document Format (PDF)

Date

2021-04-22

Authors

Florian, Andrew

Abstract

Estimating electrical costs often involves counting symbols in a PDF document. Existing software is faster than manual counting, but there is room for further improvement. The proposed solution builds on open source components to efficiently search a PDF document for the outlines of all symbols, including the letters and numbers that electrical engineers use to differentiate between otherwise similar symbols. It then sorts these outlines into groups and counts the occurrences in each group. Symbol for symbol, it takes less than half the time required by two leading competitors. Unfortunately, the current settings often produce numerous sub-groups that must be combined to provide meaningful totals. K-means and other improved clustering methods are being explored to address this. The proposed concept could also be helpful in similar applications that identify symbols or text in images.
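
The group-and-count step described above can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not say how outlines are represented or compared, so the point-list representation, the function names `signature` and `count_symbols`, and the `grid` parameter are all assumptions made for illustration. Each outline is shifted and scaled into a small grid so that position and size do not matter, and outlines that occupy the same grid cells are counted as one group.

```python
from collections import Counter


def signature(outline, grid=4):
    """Reduce an outline (list of (x, y) points) to a position- and
    scale-independent set of grid cells. Hypothetical helper."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    min_x, min_y = min(xs), min(ys)
    # Scale by the longer side of the bounding box; avoid dividing by zero.
    size = max(max(xs) - min_x, max(ys) - min_y) or 1
    cells = set()
    for x, y in outline:
        col = min(grid - 1, int((x - min_x) / size * grid))
        row = min(grid - 1, int((y - min_y) / size * grid))
        cells.add((col, row))
    return frozenset(cells)


def count_symbols(outlines, grid=4):
    """Group outlines by exact signature match and count each group."""
    return Counter(signature(outline, grid) for outline in outlines)


# Two squares of different size and position, plus one triangle.
outlines = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(10, 10), (14, 10), (14, 14), (10, 14)],
    [(0, 0), (4, 0), (2, 4)],
]
for cells, total in count_symbols(outlines).items():
    print(total, sorted(cells))
```

Exact matching of this kind is brittle: a small difference in an outline changes its signature and splits one symbol into several groups, which is one plausible way sub-groups like those mentioned above could arise. A clustering method such as k-means would instead merge signatures that are merely close to each other.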
