Mining Eclipse Developer Contributions via Author-Topic Models

Erik Linstead, Paul Rigor, Sushil Bajracharya, Cristina Videira Lopes and Pierre Baldi

Citation

Erik Linstead, Paul Rigor, Sushil Bajracharya, Cristina Lopes, Pierre Baldi, "Mining Eclipse Developer Contributions via Author-Topic Models." Proceedings of MSR 2007: International Workshop on Mining Software Repositories, Minneapolis, MN, May 19-20, 2007. (Voted best paper, MSR "Scale" Challenge)

Abstract

We present the results of applying statistical author-topic models to a subset of the Eclipse 3.0 source code consisting of 2,119 source files and 700,000 lines of code from 59 developers. This technique provides an intuitive and automated framework with which to mine developer contributions and competencies from a given code base while simultaneously extracting software function in the form of topics. In addition to serving as a convenient summary for program function and developer activities, our study shows that topic models provide a meaningful, effective, and statistical basis for developer similarity analysis.
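As a rough illustration of how topic distributions can support developer similarity analysis, the sketch below compares per-developer topic distributions using the Jensen-Shannon divergence. The divergence measure, the developer names, and the three-topic distributions are all assumptions made for illustration; they are not the paper's data and not necessarily the measure used in the paper.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of topics")
    # Mixture distribution halfway between p and q.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def most_similar(name, dists):
    """Return the other developer whose topic distribution is closest to `name`."""
    others = (n for n in dists if n != name)
    return min(others, key=lambda n: js_divergence(dists[name], dists[n]))


# Hypothetical author-topic distributions over three topics (illustrative only).
developers = {
    "dev_a": [0.70, 0.20, 0.10],
    "dev_b": [0.65, 0.25, 0.10],
    "dev_c": [0.05, 0.15, 0.80],
}

if __name__ == "__main__":
    print(most_similar("dev_a", developers))
```

A lower divergence means two developers spread their contributions over topics in a similar way, so ranking developers by divergence gives a simple similarity ordering.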

Copyright (c) 2007 by IEEE. All rights reserved.
