METAMORPHIC TESTING FOR FAIRNESS EVALUATION IN LARGE LANGUAGE MODELS
dc.contributor.advisor | Dr.Srinivasan Madhusudan | |
dc.contributor.author | Anthamola, Harishwar Reddy | |
dc.contributor.committeeMember | Dr.Tabrizi Nasseh | |
dc.contributor.committeeMember | Dr.Hart David Marvin | |
dc.contributor.department | Computer Science | |
dc.date.accessioned | 2025-06-05T17:33:01Z | |
dc.date.available | 2025-06-05T17:33:01Z | |
dc.date.created | 2025-05 | |
dc.date.issued | May 2025 | |
dc.date.submitted | May 2025 | |
dc.date.updated | 2025-05-22T21:15:17Z | |
dc.degree.college | College of Engineering and Technology | |
dc.degree.grantor | East Carolina University | |
dc.degree.major | MS-Software Engineering | |
dc.degree.name | M.S. | |
dc.degree.program | MS-Software Engineering | |
dc.description.abstract | Large Language Models (LLMs) have made significant progress in Natural Language Processing, yet they remain susceptible to fairness-related issues, often reflecting biases from their training data. These biases present risks, mainly when LLMs are used in sensitive domains such as healthcare, finance, and law. This research proposes a metamorphic testing approach to uncover fairness bugs in LLMs systematically. We define and apply fairness-oriented metamorphic relations (MRs) to evaluate state-of-the-art models like LLaMA and GPT across diverse demographic inputs. By generating and analyzing source and follow-up test cases, we identify patterns of bias, particularly in tone and sentiment. Results show that tone-based MRs detected up to 2,200 fairness violations, while sentiment-based MRs detected fewer than 500, highlighting the strength of this method. This study presents a structured strategy for enhancing fairness in LLMs and improving their robustness in critical applications. | |
dc.format.mimetype | application/pdf | |
dc.identifier.uri | http://hdl.handle.net/10342/14051 | |
dc.language.iso | English | |
dc.publisher | East Carolina University | |
dc.subject | Computer Science | |
dc.title | METAMORPHIC TESTING FOR FAIRNESS EVALUATION IN LARGE LANGUAGE MODELS | |
dc.type | Master's Thesis | |
dc.type.material | text |
Files
Original bundle
1 - 1 of 1