Quantitatively Evaluating Test-Driven Development by Applying Object-Oriented Quality Metrics to Open Source Projects (2009) by Rod Hilton, Department of Computer & Information Sciences, Regis University http://www.rodhilton.com/files/tdd_thesis.pdf
Conclusions: This study provided substantial evidence that Test-Driven Development is, indeed, an effective tool for improving the quality of source code. Test-Driven Development offered an overall improvement of 21% over code written using the test-last technique. Considering the sheer volume of the codebases being analyzed, it's reasonable to conclude that the substantial difference is noteworthy. Test-Driven Development was most effective at decreasing the complexity of the code, with an improvement of 31%. Complexity deals exclusively with the code inside of a module, ignoring any relationships with other aspects of a system. ... The next-best improvement was on cohesion metrics, an improvement of 21%. ... The smallest improvement Test-Driven Development offered was on coupling metrics, only 10%.
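The complexity metric referred to above is typically McCabe's cyclomatic complexity: one plus the number of independent decision points in a unit of code. As a rough illustration (not code from the thesis), the following sketch approximates it for a Python function by counting branching constructs in the AST; the set of node types counted here is a simplification of the full McCabe definition.

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe cyclomatic complexity: 1 + decision points."""
    decision_nodes = (ast.If, ast.For, ast.While, ast.IfExp,
                      ast.ExceptHandler, ast.And, ast.Or)
    tree = ast.parse(source)
    return 1 + sum(isinstance(n, decision_nodes) for n in ast.walk(tree))

code = """
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    return "positive"
"""
print(cyclomatic_complexity(code))  # 3: one base path plus two branches
```

Lower values indicate fewer paths through a module, which is what the 31% complexity improvement above is measuring; cohesion and coupling, by contrast, characterize relationships between modules.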
Realizing quality improvement through test driven development: results and experiences of four industrial teams (2008) by Nachiappan Nagappan, E. Michael Maximilien, Thirumalesh Bhat and Laurie Williams, published in Empirical Software Engineering, Volume 13, Number 3, June, 2008. Available for purchase from SpringerLink. Online at http://research.microsoft.com/en-us/groups/ese/nagappan_tdd.pdf
Abstract: Test-driven development (TDD) is a software development practice that has been used sporadically for decades. With this practice, a software engineer cycles minute-by-minute between writing failing unit tests and writing implementation code to pass those tests. Test-driven development has recently re-emerged as a critical enabling practice of agile software development methodologies. However, little empirical evidence supports or refutes the utility of this practice in an industrial context. Case studies were conducted with three development teams at Microsoft and one at IBM that have adopted TDD. The results of the case studies indicate that the pre-release defect density of the four products decreased between 40% and 90% relative to similar projects that did not use the TDD practice. Subjectively, the teams experienced a 15–35% increase in initial development time after adopting TDD.
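The minute-by-minute cycle described in the abstract can be sketched in miniature (an illustrative example, not taken from any of the studies): first a failing test formalizes the desired behavior, then just enough implementation is written to make it pass.

```python
import unittest

# Step 1 (red): write tests that formalize the behavior before any
# implementation exists, so they fail first.
class TestLeapYear(unittest.TestCase):
    def test_century_years_divisible_by_400_are_leap(self):
        self.assertTrue(is_leap_year(2000))

    def test_other_century_years_are_not_leap(self):
        self.assertFalse(is_leap_year(1900))

    def test_ordinary_years_divisible_by_4_are_leap(self):
        self.assertTrue(is_leap_year(2024))

# Step 2 (green): write just enough implementation to pass the tests,
# then refactor and repeat with the next small increment of behavior.
def is_leap_year(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

if __name__ == "__main__":
    unittest.main()
```

The small step size is the point: each iteration adds one narrowly scoped test and the minimal code to satisfy it, which is the practice whose defect-density effects the case studies measure.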
Measuring the Effect of TDD (2008) by Keith Braithwaite http://www.keithbraithwaite.demon.co.uk/professional/presentations/2008/qcon/MeasureForMeasure.pdf
These are the slides from a talk given at Agile 2007, investigating the relationship between Cyclomatic Complexity and Test Driven Development for a number of projects. See http://peripateticaxiom.blogspot.com/search/label/test-first%20complexity for some more information on this topic. I hope that Keith will write a paper summarizing his findings to date.
On the Sustained Use of a Test-Driven Development Practice at IBM (2007) Julio Cesar Sanchez (IBM), Laurie Williams (North Carolina State University), E. Michael Maximilien (IBM) http://www.agile2007.org/downloads/proceedings/006_On%20the%20Sustained%20Use_860.pdf
- A post hoc analysis of a large-scale Java development project at IBM spanning five years and ten releases, in which TDD was used throughout.
Evaluating the Efficacy of Test-Driven Development: Industrial Case Studies (2006) by Thirumalesh Bhat and Nachiappan Nagappan (Microsoft Center for Software Excellence & Microsoft Research) http://research.microsoft.com/en-us/projects/esm/fp17288-bhat.pdf
Abstract: This paper discusses software development using the Test Driven Development (TDD) methodology in two different environments (Windows and MSN divisions) at Microsoft. In both these case studies we measure the various context, product and outcome measures to compare and evaluate the efficacy of TDD. We observed a significant increase in quality of the code (greater than two times) for projects developed using TDD compared to similar projects developed in the same organization in a non-TDD fashion. The projects also took at least 15% extra upfront time for writing the tests. Additionally, the unit tests have served as auto documentation for the code when libraries/APIs had to be used as well as for code maintenance.
Comment: This paper presents a couple of closely measured examples in which development time was measured to be 15% to 35% longer than for comparable projects, which in turn had 2.6 to 4.2 times as many defects. The description of TDD in the paper, however, describes the developers writing a "small number of automated unit test cases" prior to implementing code. I suspect, therefore, that the developers were working in much larger steps than I would recommend for TDD. My hunch is that some of the overhead is due to taking these larger steps.
Effects of Test-Driven Development: An Evaluation of Empirical Studies (2006) Philip Ritzkopf (Embedded Software Group) http://www-i11.informatik.rwth-aachen.de/fileadmin/user_upload/Redakteure/Vorlesungen/05winter/MethEmpSWTechn/Ritzkopf__Philip.pdf
- Summarizes one case study (Maximilien and Williams 2003) and four experiments (Müller and Hagner 2002; George and Williams 2003; Geras, Smith, and Miller 2004; Erdogmus, Morisio, and Torchiano 2005)
Test driven development: empirical body of evidence (2006) by Maria Siniaalto http://www.agile-itea.org/public/deliverables/ITEA-AGILE-D2.7_v1.0.pdf
- Summarizes 13 studies. "Seven of the studies were conducted in the University environment: six with undergraduate subjects (Steinberg 2001; Edwards 2003; Kaufmann and Janzen 2003; Pancur, Ciglaric, et al. 2003; Abrahamsson, Hanhineva, et al. 2004; Erdogmus, Morisio, et al. 2005) and one with computer science graduate students (Müller and Hagner 2002). Of the remaining six studies, conducted with professional developers, three were arranged in real industrial settings (Maximilien and Williams 2003; Lui and Chan 2004; Damm, Lundberg, et al. 2005), whereas the last three (Langr 2001; George and Williams 2004; Geras, Smith, et al. 2004) were based on voluntary participation of professional developers."
Conclusion: "Based on the findings of the existing studies, it can be concluded that TDD seems to improve software quality, especially when employed in an industrial context. The findings were not so obvious in the semi-industrial or academic context, but none of those studies reported on decreased quality either. The productivity effects of TDD were not very obvious, and the results vary regardless of the context of the study. However, there were indications that TDD does not necessarily decrease the developer productivity or extend the project lead times: In some cases, significant productivity improvements were achieved with TDD, while only two out of thirteen studies reported on decreased productivity. However, in both of those studies the quality was improved."
On the Effectiveness of the Test-First Approach to Programming (2005) by Hakan Erdogmus, Maurizio Morisio, and Marco Torchiano (IEEE Transactions on Software Engineering, 31(1), January 2005.) http://iit-iti.nrc-cnrc.gc.ca/publications/nrc-47445_e.html
Abstract: Test-Driven Development (TDD) is based on formalizing a piece of functionality as a test, implementing the functionality such that the test passes, and iterating the process. This paper describes a controlled experiment for evaluating an important aspect of TDD: In TDD, programmers write functional tests before the corresponding implementation code. The experiment was conducted with undergraduate students. While the experiment group applied a test-first strategy, the control group applied a more conventional development technique, writing tests after the implementation. Both groups followed an incremental process, adding new features one at a time and regression testing them. We found that test-first students on average wrote more tests and, in turn, students who wrote more tests tended to be more productive. We also observed that the minimum quality increased linearly with the number of programmer tests, independent of the development strategy employed.
An Initial Investigation of Test-Driven Development in Industry (2003) Boby George, Laurie Williams (North Carolina State University) http://collaboration.csc.ncsu.edu/laurie/Papers/TDDpaperv8.pdf
- A set of structured experiments with 24 professional pair programmers. Found that the TDD developers produced higher-quality code, passing 18% more functional black-box test cases than the waterfall-like developers, though taking 16% more time for development. The programmers who followed a waterfall-like process often did not write the required automated test cases after completing their code.
Extracted from StudiesOfAgileEffectiveness