    Big Data Analytics for Historical Document Processing

    Poster (2.313Mb)

    Author
    Philips, James
    Abstract
    Historical Document Processing (HDP) is the process of digitizing written material from the past for future use by historians and other scholars. It incorporates algorithms and software tools from various subfields of computer science, including computer vision, document analysis and recognition, natural language processing, and machine learning, to automatically convert images of ancient manuscripts, letters, diaries, and early printed texts into a digital format usable in information retrieval systems. Within the past twenty years, as libraries, museums, and other cultural heritage institutions have scanned an increasing volume of their historical document archives, the need to transcribe the full text of these collections has become acute. Big Data Analytics and infrastructure will be essential tools in this field. This study compares the performance of two OCR systems, discusses an HDP workflow, and highlights the role of OCR software in a RESTful API for an HDPaaS (HDP as a Service) system.
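    The abstract does not say which metric the study uses to compare the two OCR systems. As an illustration only, the sketch below computes character error rate (CER), a commonly used OCR accuracy measure: the edit distance between a ground-truth transcription and the OCR output, divided by the length of the ground truth. The function names, the sample strings, and the labels "system_a" and "system_b" are hypothetical and do not come from the poster.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute cost 1 each)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical ground truth and OCR outputs, invented for illustration.
    ground_truth = "the quick brown fox"
    ocr_outputs = {
        "system_a": "the quick brovvn fox",
        "system_b": "tne quick brown f0x",
    }
    for name, text in ocr_outputs.items():
        print(f"{name}: CER = {character_error_rate(ground_truth, text):.3f}")
```

    A lower CER means the OCR output is closer to the ground truth. A real comparison would average the metric over many pages and might also report word error rate.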
    URI
    http://hdl.handle.net/10342/7389
    Date
    2019-04-01
    Collections
    • 13th Annual RCAW (2019)


    East Carolina University has created ScholarShip, a digital archive for the scholarly output of the ECU community.
