Download page for Sourcerer Data Set: sourcerer-maven-aug12

"sourcerer-maven-aug12" is a tarball containing about 2,232 projects from the Maven Central Repository, stored in Sourcerer's repository format. Each project contains one or more versions. The contents of the tarball are available here.

This data was collected in August 2012 by running rsync against a mirror, at a time when Maven still allowed that. We are releasing this tarball so that the repository can serve as a reference collection for research purposes. Release date: 11-14-2013
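
Because the archive holds a few thousand projects, it can be useful to survey it before unpacking anything. The following is a minimal sketch, not part of the Sourcerer tooling: it uses Python's standard tarfile module to count the archive members under each top-level path component. The function name top_level_entries is hypothetical, and no particular internal layout of the tarball is assumed beyond "/"-separated member paths.

```python
import tarfile
from collections import Counter


def top_level_entries(archive_path):
    """Count archive members under each top-level path component.

    Illustrative helper (hypothetical name); it reads member headers only
    and never extracts files to disk.
    """
    counts = Counter()
    # "r:*" lets tarfile detect the compression (none, gzip, bz2, xz) itself.
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive:
            # Drop empty and "." components so "./a/b" and "a/b" agree.
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            if parts:
                counts[parts[0]] += 1
    return counts
```

Calling top_level_entries on the downloaded file (for example a local copy named sourcerer-maven-aug12.tar.gz, which is an assumed filename) returns a Counter that maps each top-level name to its number of members.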

"sourcerer-maven-aug12" is a part of the UCI Source Code Data Sets.


By downloading and using this Sourcerer repository, you agree to abide by the following terms of usage.

  1. The source code and bytecode contained in the tarball were collected from open source projects hosted at the Maven Central Repository. You should adhere to the respective licenses that come with the projects.
  2. You will use the file strictly for non-commercial and non-profit work (e.g., research or personal use). Any commercial use of this file is prohibited.

Citation Policy

This data set should be cited according to the general Citation Policy. Additionally, the following publications should be cited for this particular data set.

Publications relevant to this data set

  1. J. Ossher, H. Sajnani and C. Lopes. Astra: Bottom-up Construction of Structured Artifact Repositories. In Proceedings of the 19th Working Conference on Reverse Engineering (WCRE), pp. 41-50. Oct. 2012.
    author={Ossher, J. and Sajnani, H. and Lopes, C.}, 
    booktitle={Reverse Engineering (WCRE), 2012 19th Working Conference on}, 
    title={Astra: Bottom-up Construction of Structured Artifact Repositories}, 
  2. S. Bajracharya, J. Ossher and C. Lopes. Sourcerer: An infrastructure for large-scale collection and analysis of open-source code. Science of Computer Programming, Volume 79, 1 January 2014, Pages 241-259, ISSN 0167-6423.
    title = "Sourcerer: An infrastructure for large-scale collection and analysis of open-source code",
    journal = "Science of Computer Programming",
    volume = "79",
    pages = "241 - 259",
    year = "2014",
    issn = "0167-6423",
    doi = "",
    author = "Sushil Bajracharya and Joel Ossher and Cristina Lopes"

(c) the mondego group