NSF CAREER: Tools for Analyzing, Modeling, and Comparing Protein-Protein Interaction Networks

 

NSF project award number: 0644424

NSF directorate: CISE

NSF Division: IIS

 

PI: Natasa Przulj

Project start date: January 15, 2007

Expected project end date: December 31, 2011

 

The award abstract:

The project proposes improvements in tools for analyzing and modeling of Protein-Protein Interaction (PPI) networks.  A PPI network is a mathematical representation of the physical interactions amongst proteins in a cell: it is a graph in which nodes represent proteins and edges between the nodes represent possible physical interactions between the corresponding proteins.  Currently available PPI networks are partial and mostly resulting from various high throughput proteomics experiments that attempt to discover as many PPIs as possible in a number of organisms.  The resulting PPI data sets are large including tens of thousands of interactions amongst thousands of proteins even for a unicellular eukaryotic organism such as baker’s yeast.

 

The project introduces new, more sensitive measures of network local structure that are based on enumeration of “graphlets”, small induced subgraphs of large graphs, as well as on distinguishing during the enumeration between the nodes in a graphlet that cannot be mapped to one another by an automorphism (an isomorphism of a graph with itself).  In this way, the degree distribution is generalized into the spectrum of “graphlet degree distributions”.  These new measures produce graphlet-based network “feature vectors” that are further used to heuristically compare synthetic (model based) networks to experimentally determined PPI networks.  The project’s preliminary results indicate that geometric random graphs provide a superior fit to the currently available PPI networks than do other network models including scale-free networks.  The project proposes to build upon these preliminary results towards an efficient PPI network alignment heuristic that would incorporate biological information of the proteins in the networks being aligned, such as protein functional, structural, and amino-acid sequence similarity.  Furthermore, the project proposes to use geometric random graph models of PPI networks as the basis for algorithmic strategies that would optimize the cost of experimentally exploring the entire space of protein-protein interactions in the cell. The use of geometric random graph models and associated graphlet-based network feature vectors can be extended to other network modeling applications, such as epithelial planar cell polarization, chemical informatics, social network analyses, etc.

 

It is expected that the project will lead to better algorithms for various network comparison problems on PPI and other networks.  Exploitation of a network model may significantly reduce the cost of characterizing the interactomes of various organisms.

 

The PI is involved in training high school, undergraduate and graduate students, including two female students.  This work is being included in various UC Irvine courses.  The software will be provided as a free open-source toolkit for other researchers.  Additional information will be available at http://www.ics.uci.edu/~natasha/.

 

Trainees:

 

Oleksii Kuchaiev -- Ph.D. student

Vesna Memisevic -- Ph.D. student

Tijana Milenkovic -- Ph.D. student

Marija Rasajski -- post-doc

Raymond Yu -- undergraduate assistant

 

Former group members:

 

David Hubin -- undergraduate assistant.

Jason Lai -- undergraduate assistant.

Naveen Nathan – undergraduate assistant.

 

Publications:

 

6. Tijana Milenkovic, Ioannis Filippis, Michael Lappe, and Natasa Przulj, Optimized Null Model for Residue Interaction Graphs, Submitted.

 

5. Tijana Milenkovic and Natasa Przulj, Uncovering Biological Network Function via Graphlet Degree Signatures, Technical Report No. 08-01, Donald Bren School of Information and Computer Sciences, University of California, Irvine, USA. arXiv:0802.0556v1 [q-bio MN] 5 Feb 2008. Submitted.

 

4. Desmond J. Higham, Marija Rasajskim, and Natasa Przulj, Fitting a Geometric Graph to a Protein-Protein Interaction Network, Bioinformatics, doi: 10.1093/bioinformatics/btn079, 2008.

 

3. Tijana Milenkovic, Jason Lai, and Natasa Przulj, "GraphCrunch: A Tool for Large Network Analyses", BMC Bioinformatics, 9:70, January 30, 2008.  Highly accessed.

 

2. Natasa Przulj, Geometric Local Structure in Biological Networks, Information Theory Workshop, 2007. ITW '07. IEEE, (2007), p. 402., 10.1109/ITW.2007.4313108 .

 

1. Fereydoun Hormozdiari , Petra Berenbrink, Natasa Przulj, S. Cenk Sahinalp, Not all scale-free networks are born equal: The role of the seed graph in PPI network evolution, PLoS Computational Biology, vol. 3, (2007), p. e118., doi:10.1371/journal.pcbi.0030118.

 

Presentations:

 

Invited talks:

 
7. N. Przulj, Geometric local structure in Biological Networks, 2007 IEEE Information Theory Workshop (ITW 2007), Lake Tahoe, California, September 2-6, 2007. 
 
6. N. Przulj, Graphs, Proteins, and Simulations, Physics Seminar, Petnica Research Station, Valjevo, Serbia, August 11, 2007.
 
5. N. Przulj, Geometric local structure in Biological Networks, International Conference on SCIentific Computation And Differential Equations, SciCADE 2007, Saint-Malo, France, July 9-13, 2007.
 
4. N. Przulj, Geometric local structure in Biological Networks, 39th Symposium on the Interface: Computing Science and Statistic 
(Theme: Systems Biology), Philadelphia, Pennsylvania, May 23-26, 2007.
 
3. N. Przulj, Geometric local structure in Biological Networks, Department of Defense Biotechnology HPC Software Applications Institute, Fort Detrick, Frederick, MD, May, 23, 2007.
 
 

Reviewed contributed talks:

 
2. T. Milenkovic and N. Przulj, Uncovering Biological Network Function via Graphlet Degree Signatures, BioPathways '07 pre-conference of ISMB/ECCB'07, Vienna, Austria, July 19-20, 2007.
 
1. N. Przulj, Biological Network Comparison Using Graphlet Degree Distributions, ECCB 2006, acceptance rate 18%, Eilat, Israel, 
January 21-24, 2007.

 

Posters:

 
5. T. Milenkovic, J. Lai, and N. Przulj, GraphCrunch: A Tool for Large Network Analyses, a poster at the Eight International Conference on Systems Biology, ICSB'07, Long Beach, CA, October 1-6, 2007.
 
4. T. Milenkovic, I. Filippis, M. Lappe, and N. Przulj, Optimized Null Model for Residue Interaction Graphs, a poster at the Eight International Conference on Systems Biology, ICSB'07, Long Beach, CA, October 1-6, 2007.
 
3. H. Higham, M. Rasajski, and N. Przulj, Fitting a Geometric Model to Protein-Protein Interaction Networks, a poster at the Eight 
International Conference on Systems Biology, ICSB'07, Long Beach, CA, October 1-6, 2007.
 
2. T. Milenkovic, J. Lai, and N. Przulj, GraphCrunch: A Tool for Large Network Analyses, a poster at  the 15th Annual International 
Conference on Intelligent Systems for Molecular Biology (ISMB) & 6th European Conference on Computational Biology (ECCB),  
ISMB/ECCB'07, Vienna, Austria, July 21-25, 2007.
 
1. Y. Wang and N. Przulj, Biological implications of anti-motifs in transcriptional regulation networks, a poster at European Conference on Computational Biology (ECCB'06), Eilat, Israel, January 21-24, 2007.

 

 

The project was funded under the NSF Faculty Early Career Development (CAREER) Program.

 

Page last updated on March 15, 2008.