Bibliographic Reference Analysis in Archival Data Using Supervised Machine Learning and Grammatical Features

dc.access.option: Open Access
dc.contributor.advisor: Tabrizi, M. H. N
dc.contributor.author: Philips, James Patrick
dc.contributor.department: Computer Science
dc.date.accessioned: 2022-02-10T15:13:56Z
dc.date.available: 2022-12-01T09:02:00Z
dc.date.created: 2021-12
dc.date.issued: 2021-11-19
dc.date.submitted: December 2021
dc.date.updated: 2022-02-08T15:32:25Z
dc.degree.department: Computer Science
dc.degree.discipline: MS-Software Engineering
dc.degree.grantor: East Carolina University
dc.degree.level: Masters
dc.degree.name: M.S.
dc.description.abstract: Bibliographic references are integral to scholarly discourse in humanities disciplines. While prior work has focused on reference extraction and parsing, little research has investigated the classification of footnotes containing bibliographic citations and author commentary using supervised machine learning methodologies. For this thesis, we contextualize bibliographic reference analysis within the broader domain of archival document processing through an original literature survey of current techniques, tools, and trends in the field of historical document processing. Next, we review related work on bibliographic citation identification and reference parsing. Finally, using a historiographic dataset drawn from the JSTOR humanities archive, we train and compare the performance of a suite of single and hybrid machine learning classifiers on a novel, previously unexplored bibliographic reference classification task. Moreover, as a part of this analysis, we compare the performance of traditional features and novel, grammatical features drawn from natural language processing. Our work demonstrates the superiority of hybrid models for classification of scholarly footnotes containing historiographic bibliographic references, the transferability of features from reference extraction to this research problem, and the viability of training machine learning models for this task utilizing novel, grammatical features.
dc.embargo.lift: 2022-12-01
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10342/9733
dc.language.iso: en
dc.publisher: East Carolina University
dc.subject: Bibliographic references
dc.subject: supervised machine learning
dc.subject: grammar
dc.subject.lcsh: Bibliographical citations
dc.subject.lcsh: Machine-readable bibliographic data
dc.title: Bibliographic Reference Analysis in Archival Data Using Supervised Machine Learning and Grammatical Features
dc.type: Master's Thesis
dc.type.material: text

Files

Original bundle

Name: PHILIPS-MASTERSTHESIS-2021.pdf
Size: 2.8 MB
Format: Adobe Portable Document Format
