Improving JavaScript Test Quality with Large Language Models: Lessons from Test Smell Refactoring

Gabriel Amaral; Henrique L. Gomes; Eduardo Figueiredo; Carla Bezerra; Larissa Rocha

doi:10.5753/sbes.2025.11568

dc.contributor.author	Gabriel Amaral
dc.contributor.author	Henrique L. Gomes
dc.contributor.author	Eduardo Figueiredo
dc.contributor.author	Carla Bezerra
dc.contributor.author	Larissa Rocha
dc.coverage.spatial	Bolivia
dc.date.accessioned	2026-03-22T19:46:21Z
dc.date.available	2026-03-22T19:46:21Z
dc.date.issued	2025
dc.description.abstract	Test smells, poor design choices in test code, can hinder test maintainability, clarity, and reliability. Prior studies have proposed rule-based detection tools and manual refactoring strategies, but most focus on statically typed languages such as Java. In this paper, we investigate the potential of Large Language Models (LLMs) to automatically refactor test smells in JavaScript, a dynamically typed and widely used language with limited prior research in this area. We conducted an empirical study using GitHub Copilot Chat and Amazon CodeWhisperer to refactor 148 test smell instances across 10 real-world JavaScript projects. Our evaluation assessed smell removal effectiveness, behavioral preservation, introduction of new smells, and structural code quality based on six software metrics. Results show that Copilot successfully removed 58.78% of the smells, outperforming CodeWhisperer’s 47.30%, while both tools preserved test behavior in most cases. However, both also introduced new smells, highlighting current limitations. Our findings reveal the strengths and trade-offs of LLM-based refactoring and provide insights for building more reliable and smell-aware testing tools for JavaScript.
dc.identifier.doi	10.5753/sbes.2025.11568
dc.identifier.uri	https://doi.org/10.5753/sbes.2025.11568
dc.identifier.uri	https://andeanlibrary.org/handle/123456789/78025
dc.language.iso	en
dc.source	United Food and Commercial Workers
dc.subject	Code refactoring
dc.subject	Computer science
dc.subject	JavaScript
dc.subject	Software engineering
dc.subject	Test (biology)
dc.subject	Programming language
dc.subject	Code smell
dc.subject	Quality (philosophy)
dc.subject	Unit testing
dc.subject	Software
dc.title	Improving JavaScript Test Quality with Large Language Models: Lessons from Test Smell Refactoring
dc.type	article

Collections

Published Scientific Article
