posted on Apr 02, 2025 by Gaurav Parab
The QE/testing industry has quickly adopted GenAI, and the nature of GenAI use cases has significantly changed in the last 18 months. Initially, the industry went through a discovery phase, identifying how to use LLMs for QE. The first GenAI use cases were test authoring (generating test scenarios from user stories/requirements, test cases, and test scripts), test data management, and knowledge management.
GenAI is still a new tool in the QE industry, but we are starting to see the industry ask the right questions; for instance:
- How will GenAI co-exist with other AI techniques, whether ML/DL, NLP, or even machine vision, and does Agentic AI have a role to play?
- Will GenAI replace the many tools used in software testing, reducing the tool fragmentation that inhibits automation?
- How can we bridge the QE and software development worlds with GenAI?
We recently talked to Infosys QE (IQE), Infosys’ testing unit, about how it applies GenAI to QE/testing. IQE is the second-largest practice within the firm, and its word carries weight in the industry. The discussion showed how much the firm has advanced in a very short time. Below, we discuss some of our takeaways.
Where to replace Machine Learning with GenAI
Infosys QE has started assessing when to use LLMs rather than ML or NLP. This assessment is essential: LLMs bring immediate benefits at limited cost, while ML takes longer to deploy (it requires three to six months of data training) but delivers higher accuracy than LLMs. Infosys’ experience is that GenAI can replace ML in analytics (for instance, classifying test defects to identify duplicates) and in knowledge management (replacing bots with LLMs).
Infosys points out that other AI use cases, such as log analytics and test suite optimization (reducing the number of test cases), will continue to rely on ML rather than on LLMs. Log analytics, for instance, requires processing large volumes of data, and commercial LLMs impose token limits that constrain such processing. Eventually, these token limitations will fade away.
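To make the token constraint concrete: a day of application logs can run to millions of lines, far beyond any commercial LLM's context window. A common workaround (our own sketch, not a description of Infosys' implementation) is to split the log into token-budgeted chunks before any LLM pass:

```python
def chunk_log(lines, max_tokens=4000, tokens_per_line=None):
    """Split log lines into chunks that each fit a model's token budget.

    tokens_per_line: optional callable estimating the token cost of a line;
    defaults to a rough 1-token-per-4-characters heuristic.
    """
    estimate = tokens_per_line or (lambda line: max(1, len(line) // 4))
    chunks, current, used = [], [], 0
    for line in lines:
        cost = estimate(line)
        # Start a new chunk when this line would overflow the budget
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be summarized independently, with a final pass aggregating the summaries; the need for this kind of scaffolding is one reason ML pipelines still win for log analytics today.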
Infosys rolls out Agentic AI concept for testing
IQE has already deployed Agentic AI to QE (its ‘Agentic AI Defect Engineer’), decomposing activities into tasks handled by separate agents. Its first scenario is:
- One agent identifies similar test defects (rather than using NLP) and determines whether an existing test case covers the defect
- If required, a second agent creates a test case/script
- A third agent conducts defect root cause analysis (RCA) through defect classification.
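As a rough illustration of this decomposition (our sketch, not Infosys' implementation: the agent functions are hypothetical, and a simple string-similarity heuristic stands in for the LLM comparison):

```python
from difflib import SequenceMatcher

def similarity_agent(new_defect, known_defects, threshold=0.6):
    """Agent 1: find known defects similar to the incoming one."""
    return [d for d in known_defects
            if SequenceMatcher(None, new_defect, d["summary"]).ratio() >= threshold]

def authoring_agent(defect):
    """Agent 2: draft a test case when no existing one covers the defect."""
    return {"title": f"Verify fix for: {defect}",
            "steps": ["Reproduce the reported behaviour",
                      "Apply the fix", "Re-run the scenario"]}

def rca_agent(defect):
    """Agent 3: classify the defect as a first step toward root cause analysis."""
    categories = {"timeout": "environment", "null": "code", "login": "configuration"}
    for keyword, cause in categories.items():
        if keyword in defect.lower():
            return cause
    return "unclassified"

def defect_engineer(new_defect, known_defects):
    """Orchestrator: chain the three agents for one incoming defect."""
    matches = similarity_agent(new_defect, known_defects)
    test_case = matches[0]["test_case"] if matches else authoring_agent(new_defect)
    return {"matches": matches, "test_case": test_case, "rca": rca_agent(new_defect)}
```

The point of the pattern is the orchestration: each agent has one narrow task, and the orchestrator decides which agents run based on the previous agent's output.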
Three requirements will favor the emergence of Agentic AI:
- Access to tools; for instance, for executing functional testing
- Knowledge, using RAG
- Memory, thanks to the recent reasoning capability found in several LLMs.
Next in line are unattended functional test execution and solving the issues (e.g., lack of synchronization between the application under test and the test execution engine) that make overnight testing batches fail.
Alongside Agentic AI, Infosys is also using its repository of one million test cases to build knowledge management in the form of RAG (retrieval-augmented generation), in support of LLM-based activities such as test case generation, targeting industry-specific applications.
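The retrieval side of such a RAG setup can be pictured as follows. This is a toy sketch: keyword overlap stands in for the vector embeddings a production system would use over a million-case repository, and the retrieved cases would be injected into the LLM prompt as context for generating new, similar test cases:

```python
def build_index(test_cases):
    """Index test cases by the words in their descriptions (toy stand-in
    for an embedding index over a large test case repository)."""
    index = {}
    for case in test_cases:
        for word in set(case["description"].lower().split()):
            index.setdefault(word, set()).add(case["id"])
    return index

def retrieve(query, index, test_cases, top_k=2):
    """Return the test cases sharing the most words with the query."""
    scores = {}
    for word in query.lower().split():
        for case_id in index.get(word, ()):
            scores[case_id] = scores.get(case_id, 0) + 1
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]
    by_id = {c["id"]: c for c in test_cases}
    return [by_id[i] for i in ranked]
```

Grounding generation in retrieved, domain-specific test cases is what lets the output reflect industry-specific applications rather than generic LLM knowledge.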
Testing tool fragmentation: where GenAI plays
One of the challenges of QE has been tooling. Testing requires many different tools, some COTS, others open source. Even in functional test execution, many engines co-exist, such as OpenText, Selenium, and, increasingly, Playwright. The challenge is not only tool fragmentation but also that each tool has its own script format, and script migration has historically been difficult.
Infosys does not necessarily think that GenAI will replace existing tools. For now, it has positioned GenAI to augment the capabilities of the main tools in the market. An example is test data management. GenAI is now used to generate synthetic test data at little cost. However, the industry still needs specific tools to address complicated scenarios; e.g., creating tables in SAP.
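The synthetic test data pattern is typically a vetted prompt template plus a model call, with validation of the returned records. A minimal sketch, in which `call_llm` is a hypothetical stand-in returning canned JSON so the example runs offline:

```python
import json

PROMPT_TEMPLATE = (
    "Generate {n} synthetic customer records as a JSON list. "
    "Each record needs fields: name, email, country. "
    "Data must be realistic but entirely fictitious."
)

def call_llm(prompt):
    """Hypothetical stand-in for a real LLM API call; a real system would
    send the prompt to a model endpoint and return its text response."""
    return json.dumps([{"name": "Ana Torres",
                        "email": "ana.torres@example.com",
                        "country": "ES"}])

def generate_test_data(n):
    """Build the prompt, call the model, and keep only well-formed records."""
    records = json.loads(call_llm(PROMPT_TEMPLATE.format(n=n)))
    required = {"name", "email", "country"}
    return [r for r in records if required <= set(r)]
```

Note the validation step: because model output is not guaranteed to be well-formed, malformed records are filtered out rather than trusted; scenarios such as creating linked SAP tables still need the referential integrity that dedicated test data tools enforce.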
Another example is static code analysis to evaluate the quality of an application under development, where Infosys is finding that LLMs bring better results than current tools. If anything, GenAI will force testing ISVs to specialize their tools further, while LLMs take over mainstream usage. NelsonHall believes that GenAI will lead to a streamlining of the tool ecosystem.
Deploying GenAI for testing on development tools
While closely related, application development and testing remain very different activities, relying on different skills, tools, and processes. To bring GenAI for QE to application development teams, Infosys launched its ‘Pair Programming’ offering (by analogy with pair programming in application development, where one engineer writes code while another reviews it in real time).
Rather than deploying the ChatGPTs of this world in testing, Infosys decided to use development tools already common among developers. Infosys has worked with several of them, mostly GitHub Copilot, but also GitLab Duo, Amazon Q, and Google Duet AI.
The initial use cases Infosys has developed mirror the first use cases it built for testing, and include BDD feature file generation, test script generation, test script completion, and test script conversion. The approach relies on a repository of prompts developed by Infosys to standardize output generation (LLMs are non-deterministic models, and outputs vary even more when prompts differ slightly).
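A prompt repository of this kind can be pictured as a central store of vetted templates, so every engineer sends the model the same wording for a given task. An illustrative sketch (our own, not Infosys' actual asset; the task names and template text are hypothetical):

```python
# Vetted prompt templates keyed by task. Centralizing them keeps outputs
# more consistent, since LLM results drift with even small wording changes.
PROMPTS = {
    "feature_file": (
        "Write a Gherkin feature file for this requirement. "
        "Use Given/When/Then steps only.\nRequirement: {requirement}"
    ),
    "script_conversion": (
        "Convert this {source} test script to {target}. "
        "Preserve assertions and test names.\nScript:\n{script}"
    ),
}

def render_prompt(task, **fields):
    """Fetch the standard template for a task and fill in its fields;
    raises KeyError if the task is unknown or a required field is missing."""
    return PROMPTS[task].format(**fields)
```

Treating prompts as shared, versioned assets rather than ad hoc text is the simplest lever teams have against the output variance the non-determinism causes.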
Still the start of the journey for GenAI in testing
We think the QE industry’s holy grail will be moving from a greenfield approach to a brownfield one. The industry currently uses GenAI to generate new test artifacts. However, the reality is that GenAI must accommodate the investments organizations have already made: some clients have tens of thousands of existing test cases and scripts.
Infosys has started to address brownfield requirements. It has developed an LLM use case for migrating test scripts from one language to another. Such migration is not as simple as swapping one syntax for another: each testing tool has specific functions and components that otherwise require heavy human intervention. Infosys QE believes it has achieved good results in test script migration.
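To see why migration goes beyond syntax swapping, consider converting Selenium (Python) calls to Playwright (Python): a literal mapping handles common API calls, but tool-specific constructs with no direct equivalent must be flagged for human or LLM attention. An illustrative sketch under those assumptions:

```python
import re

# Direct API mappings from Selenium to Playwright; anything not listed
# here has no mechanical equivalent and needs review.
API_MAP = {
    r'driver\.find_element\(By\.ID, "([^"]+)"\)\.click\(\)':
        r'page.click("#\1")',
    r'driver\.find_element\(By\.ID, "([^"]+)"\)\.send_keys\("([^"]+)"\)':
        r'page.fill("#\1", "\2")',
    r'driver\.get\("([^"]+)"\)': r'page.goto("\1")',
}

def migrate_line(line):
    """Return (converted_line, needs_review): mechanical rules first,
    then flag any remaining Selenium call for manual/LLM attention."""
    for pattern, replacement in API_MAP.items():
        new, count = re.subn(pattern, replacement, line)
        if count:
            return new, False
    if "driver." in line:
        return line, True  # tool-specific call with no direct mapping
    return line, False
```

The flagged remainder, waits, alerts, custom framework functions, is where the heavy human intervention historically sat, and where an LLM pass can now propose candidate conversions.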
Other brownfield use cases are emerging, such as creating test scenarios from RPA tools and documenting processes from test scripts through reverse engineering. We expect brownfield work to drive most GenAI activity in QE in the short term, and Infosys will be part of this momentum.