12 November 2012

Reproducing intermittent defects

A common question in interviews for Software Test Engineer, QA or QC roles is: how will you reproduce an intermittent defect?


Have you experienced any intermittent issues in the project you worked on?

Let's go through the sections below to understand how we can reproduce an intermittent defect:

  1. Introduction
  2. Why should intermittent defects be tracked down rather than ignored?
  3. Collecting more information regarding the intermittent defects
  4. Applying the information collected into action
  5. An example

    Introduction

    In the software development process, defects are sometimes raised without steps to reproduce them, because the tester cannot reproduce the defect a second time. Why is this?

    It is because the issue is inconsistent in nature and shows up only intermittently. So we should know how to track down such defects.

    Over the last few years I have successfully root-caused several intermittent issues, so I want to share my experience and some practices that may help you as well.

    Why should intermittent defects be tracked down rather than ignored?

    If the stakeholders (Product Owners, Clients or the Business) don't know why and when a defect occurs, it is very difficult for them to prioritize it. If the defect is blocking in nature, it can lower the stakeholders' confidence in the software product. So intermittent issues cannot be ignored by postponing the root-cause analysis and the fix.

    Collecting more information regarding the intermittent defects

    When you are asked to root-cause and narrow down an intermittent issue, take it as a challenge. This attitude really helps.

    Go and talk to the tester/QA who took the effort to raise the defect, even knowing that people would bug him/her about it later. Find out what he/she was trying to test at the time, or which test case he/she was executing.

    Get firm information on the build, release and environment in which the defect showed up. The intention is to narrow these down so that you know where to start when you sit down to reproduce the defect.

    Ask the tester/QA whether, around the time the defect surfaced, he/she had recently changed the system configuration, either through the application UI or in the app.config or web.config file.
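
    If copies of the config file from before and after the change are available, comparing them by hand is error-prone. Below is a minimal sketch of such a comparison, assuming the settings live in the usual `<appSettings>` section as `<add key="..." value="..."/>` entries. The keys `CacheEnabled` and `TimeoutSeconds` and both file contents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def app_settings(config_xml):
    """Read the <appSettings> key/value pairs from a config document."""
    root = ET.fromstring(config_xml)
    return {e.get("key"): e.get("value") for e in root.findall("./appSettings/add")}


def changed_settings(before_xml, after_xml):
    """Return {key: (old_value, new_value)} for every setting that differs."""
    before = app_settings(before_xml)
    after = app_settings(after_xml)
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(set(before) | set(after))
        if before.get(key) != after.get(key)
    }


# Hypothetical config contents, not taken from any real project.
BEFORE = """<configuration>
  <appSettings>
    <add key="CacheEnabled" value="true" />
    <add key="TimeoutSeconds" value="30" />
  </appSettings>
</configuration>"""

AFTER = """<configuration>
  <appSettings>
    <add key="CacheEnabled" value="true" />
    <add key="TimeoutSeconds" value="5" />
  </appSettings>
</configuration>"""

changed = changed_settings(BEFORE, AFTER)
for key, (old, new) in changed.items():
    print(f"{key}: {old} -> {new}")
```

    A setting that appears on only one side shows up with `None` as its other value, so added and removed keys are reported as well.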

    Ask for the date and time at which the issue occurred. With that information, review all the related logs from around that time. Depending on the module of the application in which the defect was seen, you can look into the client application logs, web server logs, app server logs or third-party software logs.

    Look at the exception logs. From the exception stack trace, figure out which class and which method threw the exception. Once you know which method threw it, think about all the scenarios that could possibly cause that method to throw that exception.

    Another exercise that yields good information is to talk to all the testers/QA who test the same module or software layer. Ask them whether they have faced such issues before or whether it is something new.

    Ask for the same user credentials (username/password) that the tester used to log in to the system, and try to reproduce the issue with them. This alone might be enough to reproduce the issue, because sometimes a particular user profile has become corrupted.
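
    Scanning large logs for one moment in time is tedious, so a small script can help. The sketch below keeps only the lines within a few minutes of the reported time. It assumes every log entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp, which will differ from product to product, and the sample log lines are invented for illustration.

```python
from datetime import datetime, timedelta

# Assumed layout: each entry starts with "YYYY-MM-DD HH:MM:SS" (19 characters).
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def lines_near(log_lines, reported_at, window_minutes=5):
    """Return the log lines stamped within window_minutes of reported_at."""
    window = timedelta(minutes=window_minutes)
    matches = []
    for line in log_lines:
        try:
            stamp = datetime.strptime(line[:19], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # no leading timestamp, e.g. a stack trace continuation line
        if abs(stamp - reported_at) <= window:
            matches.append(line)
    return matches


# Invented sample data.
sample_log = [
    "2012-11-12 10:01:07 INFO  User logged in",
    "2012-11-12 10:14:52 ERROR Failed to save markup",
    "    at Editor.Save()",
    "2012-11-12 11:30:00 INFO  Nightly job started",
]

reported_at = datetime(2012, 11, 12, 10, 15)
nearby = lines_near(sample_log, reported_at)
for line in nearby:
    print(line)
```

    The same filter can be run over each log in turn (client, web server, app server) so that the entries from around the incident can be read side by side.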
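
    When there are many logged exceptions, pulling out the throwing class and method can be scripted. This is a rough sketch that assumes .NET-style frames of the form `at Namespace.Class.Method(arguments)`, where the first frame is the one that threw. The trace and all names in it are hypothetical.

```python
import re

# Matches a frame such as "   at Example.Orders.OrderService.Submit(Order order)".
FRAME = re.compile(r"^\s*at\s+(?P<type>[\w.`<>+]+)\.(?P<method>[\w`<>]+)\(")


def throwing_frame(stack_trace):
    """Return (class_name, method_name) of the topmost frame, or None."""
    for line in stack_trace.splitlines():
        match = FRAME.match(line)
        if match:
            return match.group("type"), match.group("method")
    return None


# Hypothetical trace, for illustration only.
trace = """System.InvalidOperationException: Order is not in a submittable state
   at Example.Orders.OrderService.Submit(Order order)
   at Example.Web.OrderController.Post(OrderDto dto)
"""

origin = throwing_frame(trace)
print(origin)
```

    Counting how often each class and method pair appears across all logged traces is one way to see whether the intermittent failures share a single origin.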

    Applying the information collected into action

    Now the real action begins!

    By now we should have all the information required to reproduce the defect. Analyse it. Think of the permutations and combinations of scenarios that could have led to the defect. Here we need to be careful! There could be a daunting number of permutations and combinations. Narrow them down according to their relevance to the defect being seen, and then try to execute the scenarios that remain.

    Enough theory; let's get into an example.
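
    Listing the combinations explicitly makes it easier to see how many there are and which ones to try first. The following is a small sketch of that idea. The factors, their values and the `relevant` rule are assumptions chosen for illustration; in practice they come from the information collected earlier.

```python
from itertools import product

# Hypothetical factors that might influence the defect.
factors = {
    "shape": ["rectangle", "circle", "line"],
    "orientation": ["horizontal", "vertical"],
    "action": ["save", "export"],
}


def all_scenarios(factors):
    """Build every combination of factor values as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def relevant(scenario):
    """Assumed narrowing rule: the defect was only ever reported on save."""
    return scenario["action"] == "save"


scenarios = all_scenarios(factors)
shortlist = [s for s in scenarios if relevant(s)]
print(len(scenarios), "possible scenarios,", len(shortlist), "to try first")
```

    Each new fact learned from testers or logs can become another condition in `relevant`, which shrinks the shortlist further.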

    Some time back I worked on a project that had a markup editor for drawing files. Occasionally, after drawing markups and trying to save them, the application failed with an error. I got the opportunity to narrow down the defect and come up with exact steps to reproduce it, so that we would know why and when the issue happens.

    Since I was already aware of the issue, I started root-causing it directly. We were using legacy code for the markups, so no logging information was available. That made matters worse!

    The markup editor offered a choice of shapes for making markups on the files. I started playing with all the shapes available in the markup tool, drawing each of them on the drawing. I drew all the shapes and tried to save; it worked without an issue, as expected. So the problem was not with normal usage of the shapes.

    Then I started modifying the shapes while drawing them on the files. For example, I drew the rectangle vertically instead of the usual horizontal way and saved the file. It worked fine :(

    Later I tried the Circle shape. Instead of the usual vertical oval, I drew it as a horizontal oval and tried to save. Bang! Bang! Bang! It failed and threw an error. I reopened the application and tried again and again. The application failed every time I did the same thing. So I was finally able to produce the steps to reproduce a defect that had been happening intermittently! This exercise took almost 3-4 hours.

    Narrowing down a defect in order to come up with solid steps to reproduce it is a lengthy exercise, so be patient and think logically.

    All the best the next time you sit down to reproduce an intermittent defect!