Improving JavaScript Test Quality with Large Language Models: Lessons from Test Smell Refactoring
| dc.contributor.author | Gabriel Amaral | |
| dc.contributor.author | Henrique L. Gomes | |
| dc.contributor.author | Eduardo Figueiredo | |
| dc.contributor.author | Carla Bezerra | |
| dc.contributor.author | Larissa Rocha | |
| dc.coverage.spatial | Bolivia | |
| dc.date.accessioned | 2026-03-22T19:46:21Z | |
| dc.date.available | 2026-03-22T19:46:21Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Test smells, poor design choices in test code, can hinder test maintainability, clarity, and reliability. Prior studies have proposed rule-based detection tools and manual refactoring strategies, but most focus on statically typed languages such as Java. In this paper, we investigate the potential of Large Language Models (LLMs) to automatically refactor test smells in JavaScript, a dynamically typed and widely used language with limited prior research in this area. We conducted an empirical study using GitHub Copilot Chat and Amazon CodeWhisperer to refactor 148 test smell instances across 10 real-world JavaScript projects. Our evaluation assessed smell removal effectiveness, behavioral preservation, introduction of new smells, and structural code quality based on six software metrics. Results show that Copilot successfully removed 58.78% of the smells, outperforming CodeWhisperer's 47.30%, while both tools preserved test behavior in most cases. However, both also introduced new smells, highlighting current limitations. Our findings reveal the strengths and trade-offs of LLM-based refactoring and provide insights for building more reliable and smell-aware testing tools for JavaScript. | |
| dc.identifier.doi | 10.5753/sbes.2025.11568 | |
| dc.identifier.uri | https://doi.org/10.5753/sbes.2025.11568 | |
| dc.identifier.uri | https://andeanlibrary.org/handle/123456789/78025 | |
| dc.language.iso | en | |
| dc.source | United Food and Commercial Workers | |
| dc.subject | Code refactoring | |
| dc.subject | Computer science | |
| dc.subject | JavaScript | |
| dc.subject | Software engineering | |
| dc.subject | Test (biology) | |
| dc.subject | Programming language | |
| dc.subject | Code smell | |
| dc.subject | Quality (philosophy) | |
| dc.subject | Unit testing | |
| dc.subject | Software | |
| dc.title | Improving JavaScript Test Quality with Large Language Models: Lessons from Test Smell Refactoring | |
| dc.type | article | |