Software Quality Assurance and Organizational Processes

(Overview)

Project Memo UIUC-2003-09, 18 March 2003

Prof. Les Gasser
Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
gasser@uiuc.edu



Errors, like straws, upon the surface flow;
He who would search for pearls must dive below.


-- John Dryden [All for Love, Prologue.]

Description:

Mistakes, errors, and problems are a common element of working life, and a part of all settings at one time or another. In software production, the work of identifying and resolving errors, bugs, and mistakes plays a large part. Operating system idiosyncrasies or seemingly random "glitches" in program runs keep designers from refining programs that are "almost perfect". In the worst cases, buggy software can present major threats to security, economic health, and even lives. In the best case, it is annoying and time-consuming. People who work with software tools often need to rationalize difficulties to others, repeat work, and invent ways to circumvent problems they face as a result of computing errors. Developers themselves need to define and negotiate which issues are significant and which are insignificant, how to allocate limited resources, and how to please multiple clients in many overlapping balancing processes. These efforts sap time, energy, resources, and often even positive sentiment, both in provider-client relationships and internally, in software development teams. As complex software artifacts proliferate and become more central to, and even ubiquitous in, people's lives, the social cost of software errors and bugs may increase. Since errors and bugs reduce the effectiveness with which we can build and use software systems, we're interested in understanding how bugs come about, how they can best be managed, and how people who build and use advanced software systems can organize their work to prevent, overcome, deal with, and accommodate problems.

Most accounts of software problems focus on flaws in technical design and usability. Surely better design, prototyping, and needs analyses can help. But there's clearly much more to the issue. Specifically, the reliability of a software artifact is related to the structure of the technical and organizational processes that produce it, and to the technical and organizational infrastructures and constraints under which it is built and maintained. This research is probing several aspects of this mix. We're examining questions such as:

Role of Open Source Bug Repository Data

Our preliminary work on these issues has been done with small to moderate-sized datasets of qualitative data in the form of structured interviews with software developers and users. These have helped refine our idea of some of the critical problems, and have helped build some preliminary insights. But it's difficult to analyze issues such as the ones above without comprehensive, time- and project-specific data from large projects. With several hundred thousand problem reports from a variety of open-source projects, widely accessible open source bug repositories provide an extremely large and diverse dataset for analyzing issues like those above. We've reviewed a number of reported bugs via individual searches and downloads, from repositories including GNOME, Debian GNU/Linux, OpenOffice.org, and others. These repositories appear to have the following characteristics that make them ideally suited as datasets for investigating such issues: