Erik Linstead


Education

  • University of California, Irvine (2009)
    • Ph.D. - Information and Computer Science
    • Thesis - “Statistical Machine Learning for Internet-Scale Software Repositories”
  • Stanford University (2003)
    • M.Sc. - Computer Science
  • Chapman University (2001)
    • B.Sc. - Computer Science
    • B.Sc. - Computer Information Systems


Research
I belong to a diverse group working in bioinformatics, chemoinformatics, and software engineering.  The common theme is
machine learning and information retrieval.  Currently I am working on the Sourcerer project, developing methods
for mining program function and developer contributions from source code, as well as improving the way Internet-scale
code repositories are searched.  I am also working on ChemDB, applying text information retrieval to chemical
search.

 

Publications (* denotes equal contributors)

 

Journal:

E. Linstead*, S. Bajracharya*, T. Ngo*, P. Rigor, C. Lopes, P. Baldi.  Sourcerer: Mining and Searching
Internet-Scale Software Repositories.  Data Mining and Knowledge Discovery.  Volume 18, Number 2.  April 2009. (online)

 

J. Chen, E. Linstead, S. Swamidass, D. Wang, P. Baldi.  ChemDB Update: Full-text Search and Virtual
Chemical Space.  Bioinformatics.  Volume 23, Number 17.  September 2007.  (advance access)

 

Conference:

E. Linstead, P. Baldi.  Mining the Coherence of GNOME Bug Reports with Statistical Topic Models.
MSR 2009: Proceedings of the Sixth Working Conference on Mining Software Repositories.
Vancouver, BC.  May 2009.  (online)

 

J. Ossher, S. Bajracharya, E. Linstead, P. Baldi, C. Lopes.  SourcererDB: An Aggregated Repository
of Statically Analyzed and Cross-Linked Open Source Java Projects.  Proceedings of the Sixth Working
Conference on Mining Software Repositories.  Vancouver, BC.  May 2009.  (online)

 

E. Linstead, C. Lopes, P. Baldi.  An Application of Latent Dirichlet Allocation to Analyzing Software
Evolution.  Proceedings of ICMLA 2008:  International Conference on Machine Learning and
Applications.  San Diego, CA.  December 2008.  (online)

 

P. Baldi*, C. Lopes*, E. Linstead*, S. Bajracharya.  A Theory of Aspects as Latent Topics.
OOPSLA 2008.  Nashville, TN. October 2008. (online)

 

E. Linstead, P. Rigor, S. Bajracharya, C. Lopes, P. Baldi.  Mining Internet-Scale Software
Repositories.  Advances in Neural Information Processing Systems (NIPS*2007).
March 2008.  (online)

 

E. Linstead, P. Rigor, S. Bajracharya, C. Lopes, P. Baldi.  Mining Concepts from Code with
Probabilistic Topic Models.  Proceedings of ASE 2007: International Conference on Automated
Software Engineering. Atlanta, GA. November 2007.  (online)

 

Workshop:

E. Linstead, L. Hughes, C. Lopes, P. Baldi.  Exploring Java Software Vocabulary: A Search and
Mining Perspective.  Proceedings of SUITE 2009:  First International Workshop on Search-Driven
Development – Users, Infrastructure, Tools, and Evaluation.  Vancouver, BC.  May 2009. (online)

 

E. Linstead, P. Rigor, S. Bajracharya, C. Lopes, P. Baldi.  Mining Eclipse Developer Contributions via
Author-Topic Models.  Fourth International Workshop on Mining Software Repositories. Minneapolis,
MN. May 2007.  (Voted best paper, MSR “Scale” Challenge).  (online)

 

Poster:

L. Hughes, P. Baldi, E. Linstead.  The Evolution of Concerns, Scattering, and Tangling in Eclipse and
ArgoUML.  Third International Symposium on Empirical Software Engineering and Measurement.
Lake Buena Vista, FL.  October 2009.

 

E. Linstead, L. Hughes, C. Lopes, P. Baldi.  Capturing Java Naming Conventions with First-Order Markov Models.
ICPC 2009: Proceedings of the Seventeenth International Conference on Program Comprehension.
Vancouver, BC.  May 2009.  (online)


S. Bajracharya, T. Ngo, E. Linstead, Y. Dou, P. Rigor, P. Baldi, C. Lopes.  Sourcerer: A Search Engine for Open
Source Code Supporting Structure-Based Search.  OOPSLA ’06 Poster Session.  Portland, OR. October 2006.  (online)

 

Technical Report:

S. Bajracharya, T. Ngo, E. Linstead, P. Rigor, Y. Dou, P. Baldi, C. Lopes.  A Study of Ranking Schemes in
Internet-Scale Code Search.  UCI ISR Technical Report # UCI-ISR-07-8.  November 2007.  (online)

 

Recent Invited Talks:

Searching and Mining Internet-Scale Software Repositories.

            AI and Machine Learning Seminar.  Dept. of Computer Science.  UCI.  November 10, 2008.

            Google Tech Talk.  Irvine, CA.  May 9, 2008.

            Chapman University Computer Science Forum.  Orange, CA.  November 15, 2007.

 

About Me
I finished my Ph.D. in June 2009, but continue to work closely with the Baldi lab at UCI.

I’m also an adjunct assistant professor in the Department of Math and Computer Science at Chapman University, where I
teach courses, when called upon, in C/C++, Data Structures, AI, Computer Architecture, Graphics, Computer Ethics, Data Mining,
Introductory Programming, and Algorithm Analysis.  I also have my own group of students now, who are particularly excited about
publishing papers while still undergrads.

 

My wife, Jackie, is a high-school English teacher.  We’re currently having lots of fun learning how to fix up our new
house together, and more importantly, taking care of our new baby girl, Hannah.

 

Hobbies
My dad and I started restoring old Porsches when I was in high school.  I don't have time to keep up with it right now, but I still
own a 1969 911 S that we started working on during my senior year.  A few years ago I became the proud owner of a new
2003 Boxster S, which I sold in 2009 to buy a car suitable for the baby.  One of my loftier goals for the future is to own a Ferrari,
but for the time being I'm very, very content!

Links

My wife and mother-in-law have recently started a business, Forever Linens Chair Covers.  They specialize in chair covers, chair
treatments, and other linens for special events.  If you’re interested, you can learn about their various chair covers and linens.

My old page at Stanford

 

Last Updated: November 6th, 2009