StackSpot AI is transforming how developers and companies approach software development, accelerating complex processes through the power of Artificial Intelligence. In this article, we explore how the platform enabled implementing automated testing in a legacy .NET Framework application.
In this particular case, the team was challenged to raise test coverage from 0% to 80% in a highly complex legacy system. By adopting StackSpot AI and Jupyter Notebook, the team achieved:
- A reduction in unit test creation time from two hours to just three minutes.
- Test coverage of up to 85%, exceeding the original target.
This synergy not only accelerated development but also improved the application’s overall quality and reliability. Read on to see how it was done.
How StackSpot AI Gives Development Teams Their Time Back
As artificial intelligence continues to evolve, StackSpot AI stands out as an essential tool for optimizing workflows, enhancing efficiency, and reducing development time.
In recent years, AI has emerged as one of the most impactful technology trends, reshaping everything from automation to the analysis of massive data sets. Within that landscape, StackSpot AI delivers intelligent, personalized support that fits seamlessly into the developer workflow.
The platform offers extensions for all major IDEs, providing contextual suggestions, code examples, and solutions for common development challenges. As a result, developers not only work faster but also produce higher-quality code. Additionally, StackSpot AI is designed to learn and improve continuously, becoming increasingly effective with use.
Quick Command: Streamlining Development Through Automation
Creating a Quick Command (QC) in StackSpot AI is a strategic way to automate repetitive tasks and simplify workflows. Before initiating development, clearly defining the purpose of your Quick Command is paramount.
Quick Commands offer several key advantages. Not only do they minimize manual effort and reduce human error, but they also bring consistency to processes, ensuring uniform execution across teams.
Efficiency is another significant benefit, seeing as developers can complete complex tasks with a single command, freeing up time for higher-value work.
However, before a QC is released for general use, it is essential to validate its performance. This is a crucial step to avoid potential issues with the command, ensuring it is both accurate and reliable.
Case Study: Automated Testing in Legacy .NET with StackSpot AI
Modernizing legacy systems with more than 200,000 lines of code presents a formidable challenge, especially when it comes to implementing unit tests.
To address this issue, the team turned to a powerful toolset that combined StackSpot AI, especially its Remote Quick Command (RQC) feature, with Jupyter Notebook. Together, these tools streamlined prompt curation and automated testing.
Let’s dive into the details of how automated tests in legacy .NET systems became a reality with StackSpot AI.
The Legacy Challenge
The primary challenge was to achieve 80% test coverage in a short timeframe on an application that previously had no coverage whatsoever.
This highly complex application comprises more than 450 services, including controllers, diverse transactions, and integrations with mainframes (via connectors such as CICS and IMS), telephony, and media bar systems. Over more than a decade, the codebase had grown to exceed 200,000 lines.
Given its scale and complexity, initial estimates suggested that the testing effort would take over 11 months of continuous work.
Phase One: Adopting the AI-Driven Solution
To overcome this challenge, the team implemented a suite of StackSpot AI features—including Remote Quick Command, Quick Command, and Knowledge Sources—alongside Jupyter Notebook.
Using Jupyter Notebook scripts, the team extracted source files from the legacy system via StackSpot AI’s Quick Command and Knowledge Sources. Unit tests were then automatically generated based on these files.
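While the team’s actual scripts are internal, a minimal sketch of this step might look like the following. The endpoint URL, Quick Command slug, token, and response shape are placeholders for illustration, not the documented StackSpot AI API:

```python
# A minimal sketch, assuming a hypothetical endpoint, slug, and response
# shape; check the StackSpot AI docs for the real Quick Command API.
from pathlib import Path

import requests

API_URL = "https://your-stackspot-endpoint/quick-commands"  # placeholder
QC_SLUG = "generate-unit-tests"  # hypothetical Quick Command slug
TOKEN = "your-access-token"      # placeholder credential

def submit_source_file(path: Path) -> dict:
    """Send one legacy C# file to the Quick Command for test generation."""
    response = requests.post(
        f"{API_URL}/{QC_SLUG}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"input_data": path.read_text(encoding="utf-8")},
    )
    response.raise_for_status()
    return response.json()  # assumed to carry an execution identifier

# Walk the legacy repository and queue every service for test generation.
executions = [submit_source_file(p) for p in Path("legacy-app").rglob("*.cs")]
```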
StackSpot AI: Analyzing and Standardizing Legacy Code
StackSpot AI was critical in analyzing and standardizing the legacy .NET codebase, enabling the generation of unit tests from predefined patterns stored in the Knowledge Source.
This approach addressed the repetitive nature of the code, reduced manual workload, and ensured consistency. Prompt curation played a central role in this process, while continuous refinement helped make the tests increasingly accurate.
Remote Quick Command (RQC): Scaling Test Execution
StackSpot AI’s Remote Quick Command feature enhanced the workflow by enabling the automated submission of code files to execute the generated tests remotely.
This allowed the team to run a large volume of tests efficiently, with results being returned quickly for immediate refinement of both the scripts and the prompts.
Sending code batches via RQC also ensured tests ran in controlled environments, significantly improving the results’ reliability.
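A polling loop for those remote executions could be sketched as follows; again, the URL, status values, and response fields are assumptions rather than the documented contract:

```python
# Sketch of polling a batch of remote executions; the status value and
# response fields here are assumptions, not the documented API contract.
import time

import requests

RESULT_URL = "https://your-stackspot-endpoint/executions"  # placeholder
TOKEN = "your-access-token"                                # placeholder

def wait_for_tests(execution_id: str, poll_seconds: int = 10) -> str:
    """Block until one remote execution finishes, then return its output."""
    while True:
        response = requests.get(
            f"{RESULT_URL}/{execution_id}",
            headers={"Authorization": f"Bearer {TOKEN}"},
        )
        response.raise_for_status()
        body = response.json()
        if body.get("status") == "COMPLETED":  # assumed terminal status
            return body["result"]              # assumed field with the tests
        time.sleep(poll_seconds)

# execution_ids would come from the submission step sketched earlier.
execution_ids = ["example-execution-id"]
generated_tests = [wait_for_tests(eid) for eid in execution_ids]
```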
Jupyter Notebook: Orchestrating Prompt Curation
Jupyter Notebook served as the orchestration layer for prompt curation. Its interactive interface allowed developers to continuously adjust prompts and execute code in modular cells, promoting a clean and organized workflow.
This modularity made it possible to break the legacy code into more manageable parts, optimizing the test-generation process.
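As an illustration, a helper like the one below could split a large service into prompt-sized windows in its own notebook cell; the 200-line window and the file path are arbitrary values chosen for the example, not values from the project:

```python
# Sketch: split an oversized service class into prompt-sized chunks so
# each notebook cell curates one piece at a time. The window is arbitrary.
from pathlib import Path

def chunk_source(path: Path, max_lines: int = 200) -> list[str]:
    """Break a source file into windows small enough for one prompt."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [
        "\n".join(lines[start:start + max_lines])
        for start in range(0, len(lines), max_lines)
    ]

chunks = chunk_source(Path("legacy-app/Services/OrderService.cs"))
print(f"{len(chunks)} chunks ready for prompt curation")
```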
Setting Up Your Environment
To replicate this solution, a few steps are essential:
Installing Python
Before starting, make sure Python is installed on your system. Then verify that it is working correctly by running the python --version command in the terminal.
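For example:

```bash
# Any recent Python 3 release should print its version here;
# on some systems the interpreter is invoked as python3 instead.
python --version
```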
Setting Up your Virtual Environment
Create a virtual environment to ensure an isolated and organized workspace. This will allow you to manage different dependencies and package versions without affecting the global setup.
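On macOS or Linux, for instance:

```bash
# Create and activate an isolated environment (macOS/Linux shown;
# on Windows, run .venv\Scripts\activate instead of source).
python -m venv .venv
source .venv/bin/activate
```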
Once the environment is activated, its name will be displayed in the terminal prompt, indicating it is active. You’ll also be able to switch between different environments as needed.
Installing Jupyter Notebook
Once you’ve set up your virtual environment, it’s time to install Jupyter Notebook, which you can do within the new environment using pip. For more information about Jupyter Notebook, check out their website.
This will work as an orchestration tool to help you execute cells and organize your code.
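For example:

```bash
# Install Jupyter Notebook inside the activated environment.
pip install notebook
```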
Integrating with VSCode
To use Jupyter Notebook in VSCode, you’ll need to install specific extensions like Jupyter, which allows you to execute cells directly from the editor, and Python, which offers support for the Python kernel and integration with virtual environments.
Having set up the kernel to use your virtual environment, you will be able to run and debug cells straight from the editor, leveraging the extensions’ integration and automation features.
Delivering Results: A Powerful Workflow for Legacy Testing
Combining Jupyter Notebooks with StackSpot AI’s RQC feature brought multiple benefits to automating testing in legacy .NET systems, including interactivity, integrated documentation, and data visualization.
Interactivity
- Benefit: Developers were able to experiment and iterate on code in real time. This is especially useful when adjusting prompts or testing different approaches to interact with StackSpot AI.
- How It Supports Prompt Curation: By executing code cells individually, developers can adapt and test prompts on the go, with immediate LLM feedback. This enables more effective curation of Quick Command prompts, meaning you can tailor them to your project’s specific needs, as in the sketch after this list.
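As a toy illustration, a single cell might hold the prompt template so it can be tweaked and re-run in seconds. The template wording and file path here are invented for the example, and the xUnit-style target framework is an assumption:

```python
# Illustrative only: edit the template, re-run the cell, and resubmit to
# see the LLM's new output immediately. Path and wording are hypothetical.
from pathlib import Path

PROMPT_TEMPLATE = (
    "Generate xUnit tests for the following C# service. "
    "Mock every external dependency and cover the error paths.\n\n{source}"
)

source = Path("legacy-app/Services/OrderService.cs").read_text(encoding="utf-8")
prompt = PROMPT_TEMPLATE.format(source=source)
# The prompt would then be sent through the Quick Command, as sketched earlier.
```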
Integrated Documentation
- Benefit: The ability to blend code, text, and visuals in a single document makes Jupyter Notebooks a powerful documentation tool.
- How It Supports Prompt Curation: By documenting their prompt engineering process within the notebook, developers create a clear record of their chain of decisions. This helps not only with prompt curation but also with transferring knowledge to other team members and creating a source of reference for future project iterations.
Data Visualization
- Benefit: Visual tools are crucial to understanding how LLMs process and respond to prompts.
- How It Supports Prompt Curation: Developers can use notebook visualization libraries to track prompt performance, spot patterns, and identify areas for improvement, as shown in the sketch after this list. This enables more informed, data-driven prompt refinement and better results with the LLM.
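As a small illustration (the coverage numbers below are invented placeholders, not project data), a notebook cell could chart coverage across prompt revisions:

```python
# Plot coverage per prompt revision to spot which refinements helped.
# The numbers are made-up placeholders for illustration.
import matplotlib.pyplot as plt

prompt_versions = ["v1", "v2", "v3", "v4"]
coverage = [42, 61, 78, 85]  # hypothetical coverage (%) per revision

plt.plot(prompt_versions, coverage, marker="o")
plt.xlabel("Prompt revision")
plt.ylabel("Test coverage (%)")
plt.title("Prompt curation progress")
plt.show()
```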
Phase One Results
The synergy between these tools led to a significantly higher unit test coverage for a legacy application. Prompt refinement in Jupyter Notebook, paired with the automation enabled by RQC and the analysis capabilities of StackSpot AI, produced a streamlined, high-impact workflow.
This technical approach proved that the thoughtful integration of tools can not only accelerate development, but also enhance test quality and reduce the time required to modernize legacy systems.
This case study in implementing automated testing in legacy .NET Framework environments yielded both quantitative and qualitative results.
Quantitative Results
- Unit test creation time dropped from two hours to three minutes.
- Test coverage reached up to 85%.
Qualitative Results
- Reduced cognitive load (developers completed tasks with less effort).
- Faster development of unit tests in legacy systems.
- Improved application quality and reliability due to broader test coverage.
Phase Two: Enhanced Observability and Data-Driven Testing
In the second phase, the project focused on boosting coverage by leveraging observability data from Splunk. Logs from development and testing environments were used to generate realistic mocks and structure representative datasets.
This phase aimed to increase intelligence and autonomy but also introduced complexity, especially around tasks requiring data manipulation.
Extending Automation Beyond the LLM
While LLMs excel at generating and curating code, they fall short when handling advanced conditional logic, data transformation, or mathematical calculations.
To fill this gap, the team used Remote Quick Command to run external Python scripts in controlled environments. These scripts organized, transformed, and prepared the data for tests, supplementing the LLM. On top of that, libraries like Pandas and NumPy enabled advanced data processing, supporting everything from cleansing to transformation and mock generation. As a result, even intricate transactional data could be extracted from Splunk logs for high-fidelity test simulations.
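As a hedged sketch of that idea, assuming the Splunk results were exported to a CSV with service and payload columns (a hypothetical schema and file name), pandas can clean the records and group them into per-service mock fixtures:

```python
# Sketch under assumptions: Splunk results exported to CSV with
# "service" and "payload" columns (hypothetical schema and file names).
import json
from pathlib import Path

import pandas as pd

logs = pd.read_csv("splunk_export.csv")            # hypothetical export file
logs = logs.dropna(subset=["service", "payload"])  # drop incomplete records

# Write one JSON fixture per service, ready to feed test-generation prompts.
Path("mocks").mkdir(exist_ok=True)
for service, group in logs.groupby("service"):
    fixtures = group["payload"].tolist()
    Path(f"mocks/{service}.json").write_text(
        json.dumps(fixtures, indent=2), encoding="utf-8"
    )
```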
Phase Two Results
The outcome of this more sophisticated approach was remarkable, with unit test coverage reaching up to 94%. However, both the complexity of the approach and the time required to execute the tests increased.
In contrast, the average coverage achieved using prompt-curated tests alone was around 75%, with significantly less cognitive load and post-processing.
Key Learnings from the Process
Both approaches drew on observability data, and the prompt-based tests required only minor corrections.
The main issues included non-existent code references and occasional hallucinations. There were also records of unfinished test structures, such as missing brackets or improperly closed classes, and merged test cases that broke formatting expectations.
Still, the solution delivered not only exceptional test coverage and quality, but also invaluable hands-on experience with generative AI and prompt engineering.
StackSpot AI proved intuitive and powerful, accelerating the process of creating Remote Quick Commands and integrating them into modernization workflows.
Modernizing Legacy Systems with AI: More Quality in Less Time
Through the innovative combination of StackSpot AI and Jupyter Notebook, the team successfully addressed the inherent complexity of modernizing legacy .NET systems.
This approach did more than reduce development time: it raised test coverage dramatically, ensuring greater confidence in both the process and the end product.
The results validate the effectiveness of advanced technologies in driving digital transformation, especially in systems once considered too complex to modernize efficiently.
You too can build custom solutions using StackSpot AI. Start right now by signing in with your Google or GitHub account.
Now, we want to hear from you. What did you think of this case study on automated testing in legacy .NET systems with StackSpot AI? Share your thoughts and questions in the comments below.
Content produced by Estevan Louzada Souza and Edson Massao