How to Do a Case Study in Software Engineering

Why would one want to do case study research? One reason might be to examine the field to learn where the problems are so that one could do “real” research. Unfortunately, in social science and in software engineering we often must rely on case studies because statistical generalization is not practical:

  • we cannot replicate the conditions (every project is different with many uncontrolled factors)
  • we cannot find a suitable control, let alone do double blind
  • we cannot get enough samples to apply statistics (for example the small company problem)
  • we have enough samples, but we violate the assumptions the statistics rely on.

We might then perform a “case study”. Sadly, these are often not performed with rigor. This problem has been noticed, for example by Host and Runeson, who produced a set of checklists for case study research in software engineering. They draw on sources from social science and software engineering, and their checklists address design, preparation for data collection, data collection, analysis, and reporting.

Their reviewer’s checklist, in summary, is as follows:

  1. Are the research questions, objects of study and case study context well defined?
  2. Is it motivated that the case is suitable to address the research questions?
  3. Are the hypotheses and propositions clear and relevant?
  4. Are the data collection procedures sufficient for the purpose (data sources, collection, storage, validation)?
  5. Are sufficient raw data presented to provide understanding of the case?
  6. Are the analysis procedures sufficient for the purpose (repeatable, transparent)?
  7. Is the case study based on theory and linked to existing literature?
  8. Is a clear chain of evidence established from observations to conclusions?
  9. Are threats to validity analyzed and addressed in a systematic way?
  10. Are different views taken on the case (multiple collection and analysis methods, multiple authors)?
  11. Are ethical issues addressed properly (personal intentions, integrity issues, consent, review board approval)?
  12. Are conclusions, implications for practice and future research, reported suitably for its audience?
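
One way to keep track of the twelve questions while reviewing a study is a simple tally. The sketch below is only an illustration and is not part of Host and Runeson's checklists: the shortened labels, the `open_items` function, and the sample answers are all my own assumptions.

```python
from typing import Dict, List

# Shortened labels (my paraphrase) for the twelve reviewer questions above.
CHECKLIST: List[str] = [
    "research questions, objects of study and context well defined",
    "case suitability motivated",
    "hypotheses and propositions clear and relevant",
    "data collection procedures sufficient",
    "sufficient raw data presented",
    "analysis procedures sufficient",
    "based on theory and linked to literature",
    "clear chain of evidence",
    "threats to validity addressed systematically",
    "different views taken on the case",
    "ethical issues addressed",
    "conclusions reported suitably for the audience",
]


def open_items(answers: Dict[int, bool]) -> List[str]:
    """Return the items not answered 'yes'.

    `answers` maps the 1-based question number to True/False.
    A question with no recorded answer is treated as still open.
    """
    return [
        f"{number}. {label}"
        for number, label in enumerate(CHECKLIST, start=1)
        if not answers.get(number, False)
    ]


# Hypothetical review: everything is fine except question 9 (answered "no")
# and question 11 (not yet answered).
answers = {number: True for number in range(1, 13)}
answers[9] = False
del answers[11]

for item in open_items(answers):
    print(item)
```

Treating an unanswered question as open is a deliberate choice here: it keeps a reviewer from passing a study simply by skipping an item.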

Robert Yin has a handbook for case studies, now in its 4th edition, that I highly recommend.
He identifies the components of a case study design as
1) the study’s question
2) the propositions (if any) [these are used in causal inference]
3) the unit(s) of analysis
4) the logic linking the data to the propositions [the analysis for causal inference]
5) the criteria for interpreting the findings
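
To make the five components concrete, here is a minimal sketch of a record that holds them and reports which ones are still blank. The class name, field names, and the example study are hypothetical and mine, not Yin's; the only thing taken from the list above is the five components and the fact that propositions are optional ("if any").

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class CaseStudyDesign:
    """Holds the five design components listed above (names are illustrative)."""

    question: str = ""
    propositions: List[str] = field(default_factory=list)  # optional: "if any"
    units_of_analysis: List[str] = field(default_factory=list)
    linking_logic: str = ""            # how the data will be tied to the propositions
    interpretation_criteria: str = ""  # how the findings will be judged

    def missing_components(self) -> List[str]:
        """Return the names of required components that are still empty."""
        optional = {"propositions"}
        return [
            f.name
            for f in fields(self)
            if f.name not in optional and not getattr(self, f.name)
        ]


# Hypothetical study, invented purely for illustration.
design = CaseStudyDesign(
    question="How do design reviews affect escaped defects on a small team?",
    propositions=[
        "If a team reviews designs, then fewer defects escape to test, "
        "because design errors are caught before coding."
    ],
    units_of_analysis=["individual developer"],
)

print(design.missing_components())
```

Filling in the record before collecting any data is the point: a blank `linking_logic` or `interpretation_criteria` means the study does not yet say how the evidence will be weighed.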

The question can be framed in terms of “who”, “what”, “where”, “how” and “why”.

The proposition helps to establish a theoretical framework that provides direction to the research. It can take the form “if… then… *because*”. This helps to direct the study towards precisely what should be measured. If the study has quasi-experimental components or statistical analysis, the proposition can lead to precise hypotheses.

The unit of analysis defines the “case”. For example, the unit might be the individuals embedded within one of two class formats or types of development.

Notice that the case study starts with theory. Some theory or model guides expectations on how the data should be related. The theory guides what data must be collected to test for consistency. The logic and criteria then define how we critique our proposition. Do the data fit the story? What are possible alternatives? What are other implications?

Absent statistical generalization, a theoretical generalization from the case can be made by examining each step of the chain for consistency. All parts of the narrative must tell a consistent story. The researcher needs to do the hard work of criticizing his or her own work and examining plausible alternative explanations. The result could reject a theory, or sharpen the questions to be asked in the next case.

These resources are a starting point for doing useful case studies in the field. They cannot replace, but can complement, quasi-experiments, experiments, and statistically derived results.

If you are like me, this is not fully satisfactory. Nonetheless, having some guidelines makes me feel a lot better about the work. I am not naive enough to believe the researcher is unbiased, but by following guidelines I gain some confidence that the work is honest.


Robert Yin, Case Study Research: Design and Methods, 4th edition, Sage, 2009

Martin Host and Per Runeson, Checklists for Software Engineering Case Study Research,


About Bill Nichols

PhD in Physics from Carnegie Mellon University. I'm a software team coach and instructor with the TSP Team at the Software Engineering Institute.