Risk-Based Test Case Prioritization Using Large Language Models in Regression Testing

Guzman-Sanchez, Jose

dc.contributor.advisor	Madhusudan Srinivasan
dc.contributor.author	Guzman-Sanchez, Jose
dc.contributor.committeeMember	Nic Herndon
dc.contributor.committeeMember	Nasseh Nassehzadeh-Tabrizi
dc.contributor.department	Computer Science
dc.date.accessioned	2025-06-05T17:32:06Z
dc.date.available	2025-06-05T17:32:06Z
dc.date.created	2025-05
dc.date.issued	May 2025
dc.date.submitted	May 2025
dc.date.updated	2025-05-22T21:15:15Z
dc.degree.college	College of Engineering and Technology
dc.degree.grantor	East Carolina University
dc.degree.major	MS-Software Engineering
dc.degree.name	M.S.
dc.degree.program	MS-Software Engineering
dc.description.abstract	Regression testing is critical to ensuring software quality after code modifications. However, executing a large, complex test suite in full can be infeasible under time and resource constraints. Test case prioritization (TCP) strategies therefore order test cases so that faults are detected as early as possible during test execution. This study proposes a risk-based test case prioritization approach that leverages large language models (LLMs) to estimate the fault-proneness of individual methods and uses those estimates to guide the prioritization process. An LLM is fine-tuned to predict a risk score for each function based on several software metrics, and the predicted scores are used in a static analysis of the test cases to determine an overall risk ranking. The prioritized test suites are evaluated using established metrics, including Fault Detection Rate (FDR) and Average Percentage of Faults Detected (APFD), and the approach is compared against baseline techniques such as coverage-based and randomized prioritization. The results of this experiment, conducted on open-source Java projects, show that the risk-based LLM prioritization approach outperforms traditional TCP methods in early fault detection, highlighting the potential of including LLMs in regression testing workflows.
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/10342/14048
dc.language.iso	English
dc.publisher	East Carolina University
dc.subject	Computer Science
dc.title	Risk-Based Test Case Prioritization Using Large Language Models in Regression Testing
dc.type	Master's Thesis
dc.type.material	text
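
The abstract describes ranking test cases by the predicted risk of the methods they exercise and scoring the resulting order with APFD. The sketch below illustrates that idea under stated assumptions and is not the thesis's implementation: the risk scores are hard-coded placeholders standing in for LLM predictions, a test's risk is assumed to be the sum of the risks of the methods it covers (the record does not state the aggregation rule), and all names (prioritize_by_risk, apfd, the test, method, and fault labels) are hypothetical. APFD is computed with the standard formula 1 - (TF1 + ... + TFm) / (n * m) + 1 / (2n), where n is the number of tests, m the number of faults, and TFi the position of the first test that reveals fault i.

```python
from typing import Dict, List, Set


def prioritize_by_risk(
    covered: Dict[str, Set[str]], method_risk: Dict[str, float]
) -> List[str]:
    """Order tests by descending risk.

    Assumption: a test's risk is the sum of the risk scores of the
    methods it covers; unknown methods contribute 0.0.
    """
    def test_risk(test: str) -> float:
        return sum(method_risk.get(method, 0.0) for method in covered[test])

    # sorted() is stable, so ties keep their original relative order.
    return sorted(covered, key=test_risk, reverse=True)


def apfd(order: List[str], faults_detected: Dict[str, Set[str]]) -> float:
    """Average Percentage of Faults Detected for a given test order.

    Only faults revealed by at least one test in `order` are counted.
    """
    n = len(order)
    first_position: Dict[str, int] = {}
    for position, test in enumerate(order, start=1):
        for fault in faults_detected.get(test, set()):
            # Record only the first (earliest) test that reveals the fault.
            first_position.setdefault(fault, position)
    m = len(first_position)
    if n == 0 or m == 0:
        raise ValueError("APFD needs at least one test and one detected fault")
    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)


# Placeholder risk scores (stand-ins for model predictions) and toy data.
method_risk = {"parse": 0.9, "format": 0.2, "save": 0.6}
covered = {"t1": {"format"}, "t2": {"parse", "save"}, "t3": {"save"}}
faults = {"t1": set(), "t2": {"f1"}, "t3": {"f1", "f2"}}

risk_order = prioritize_by_risk(covered, method_risk)
print("risk-based order:", risk_order)
print("APFD, risk-based:", apfd(risk_order, faults))
print("APFD, original order:", apfd(list(covered), faults))
```

A higher APFD means faults are revealed earlier in the run, which is the sense in which one ordering can be said to outperform another.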

Files

Original bundle

Name: GUZMAN-SANCHEZ-PRIMARY-2025.pdf
Size: 793.72 KB
Format: Adobe Portable Document Format

Collections

Master's Theses
Computer Science