Virtual Time and Global State in Distributed Systems
- L. Lamport, "Time, Clocks and the Ordering of Events in a Distributed System", Communications of the ACM, 1978
- K. M. Chandy and L. Lamport, "Distributed Snapshots: Determining Global States of Distributed Systems", ACM Transactions on Computer Systems, 1985
- D. Jefferson, "Distributed Simulation and the Time Warp Operating System", ACM Symposium on Operating Systems Principles, 1987
- F. Mattern, "Virtual Time and Global States of Distributed Systems", Proc. Workshop on Parallel and Distributed Algorithms, 1989
- C. Fetzer and F. Cristian, "An optimal internal clock synchronization algorithm", COMPASS 1995
- F. Cristian and C. Fetzer, "Fault-tolerant external clock synchronization", ICDCS 1995
- A. Kshemkalyani, M. Raynal and M. Singhal, "An introduction to snapshot algorithms in distributed computing"
Distributed Operating Systems
Remote Procedure Calls and Distributed Shared Memory
- A. Birrell and B. Nelson, "Implementing remote procedure calls", ACM Transactions on Computer Systems, 1984
- P. G. Soares, "On remote procedure call", Proc. of the 1992 conference of the Centre for Advanced Studies on Collaborative Research, 1992
- A. L. Ananda, B. H. Tay and E. K. Koh, "A survey of asynchronous remote procedure calls", SIGOPS Operating Systems Review, 1992
- A lecture on RPC, "http://www.cs.cf.ac.uk/Dave/C/node33.html"
- G. Ricart and A. Agrawala, "An optimal algorithm for mutual exclusion in computer networks", Communications of the ACM, 1981
- L. Lamport, "The Mutual Exclusion Problem", Part I and Part II, Journal of the ACM, 1986
- L. Lamport, "A Fast Mutual Exclusion Algorithm", ACM Transactions on Computer Systems, 1987
- K. Raymond, "A Tree Based Algorithm for Distributed Mutual Exclusion", ACM Transactions on Computer Systems, 1989
- H. Garcia-Molina, "Elections in a Distributed Computing System"
Distributed File Systems
- A. K. Elmagarmid, "A survey of distributed deadlock detection algorithms", ACM SIGMOD, 1986
- M. Singhal, "Deadlock detection in distributed systems", IEEE Computer, 1989
- M. Satyanarayanan, "A Survey of Distributed File Systems", Annual Review of Computer Science, 1989
- B. Noble and M. Satyanarayanan, "An Empirical Study of a Highly Available File System", ACM Sigmetrics, 1994
- M. Spasojevic and M. Satyanarayanan, "An Empirical Study of a Wide-Area Distributed File System", ACM Transactions on Computer Systems, 1996
- J. Kubiatowicz, "OceanStore: An Architecture for Global-Scale Persistent Storage", ACM ASPLOS 2000
- S. Ghemawat, H. Gobioff and S.-T. Leung, "The Google File System", ACM SOSP, 2003
Processing and Load Balancing
- J. M. Smith, "A survey of process migration mechanisms", ACM SIGOPS Operating Systems Review, 1988
- M. H. Willebeek-LeMair, A. P. Reeves, "Strategies for Dynamic Load Balancing on Highly Parallel Computers", IEEE Transactions on Parallel and Distributed Systems, 1993
- N. Venkatasubramanian, S. Ramanathan, "Load Management in Distributed Video Servers", ICDCS 1997
- V. Cardellini, M. Colajanni, "Dynamic Load Balancing on Web-server Systems", IEEE Internet Computing, 1999
- T. Schnekenburger, "Load Balancing in CORBA: A Survey, Response to the Aggregated Computing RFI".
Distributed Operating Systems
- W. J. Bolosky, R. P. Draves, R. P. Fitzgerald, C. W. Fraser, M. B. Jones, T. B. Knoblock and R. Rashid, "Operating System Directions for the Next Millennium", Proc. of the 6th Workshop on Hot Topics in Operating Systems, 1997
- M. Rozier, V. Abrossimov, F. Armand et al., Overview of the Chorus Distributed Operating
- Andrew S. Tanenbaum, M. Frans Kaashoek, Robert van Renesse, Henri E. Bal, The Amoeba Distributed Operating System - A
- Distributed Computing Frameworks: DCE,
- Object-based Middleware: CORBA specification, www.omg.org
- Jini: "Architectural Overview", Sun Microsystems
- Java RMI: "Java RMI Tutorial"
- EJB: "Enterprise JavaBeans Technology", Sun Developer Network
- J2EE: "Overview", Sun Developer Network
- Service Oriented Architectures
- Web services: "Part of the lectures" by M. Fisher
- .NET: "The .NET Framework"
- SOAP: "Specification"
Messaging and Group Communication in Distributed Systems
- D. Dolev and D. Malkhi, "The Transis Approach to High Availability Cluster Communication". Other interesting reading: documentation and papers about Transis are also available at "http://www.cs.huji.ac.il/labs/transis/"
- Y. Amir et al., "Group Communication as an Infrastructure for Distributed System Management", Proc. of the 3rd Workshop on Services in Distributed and Networked Environments, 1996
- Y. Amir et al., "The Spread Wide Area Group Communication System"
- R. van Renesse, K. P. Birman, and S. Maffeis, "Horus: A Flexible Group Communication System", Communications of the ACM, 1996
- S. Banerjee, B. Bhattacharjee and C. Kommareddy, "Scalable Application Layer Multicast", ACM SIGCOMM 2002
- Y. Amir, C. Nita-Rotaru, J. Stanton, G. Tsudik, "Secure Spread: An Integrated Architecture for Secure Group Communication", IEEE Transactions on Dependable and Secure Computing, 2005
- M. Deshpande, B. Xing, I. Lazaridis, B. Hore, N. Venkatasubramanian and S. Mehrotra, "CREW: A Gossip-based Flash-Dissemination System", ICDCS 2006
- K. Kim, N. Venkatasubramanian and S. Mehrotra, "FaReCast: Fast, Reliable Application Layer Multicast for Flash Dissemination", ACM Middleware 2010
Fault Tolerance and Reliability
- M. J. Fischer, N. A. Lynch, and M. S. Paterson, "Impossibility of Distributed Consensus with One Faulty Process", Journal of the ACM, 1985
- D. Dolev, C. Dwork, L. Stockmeyer, "On the Minimal Synchronism Needed for Distributed Consensus", Journal of the ACM, 1987
- T. D. Chandra and S. Toueg, "Unreliable Failure Detectors for Reliable Distributed Systems", Journal of the ACM, 1996
- T. D. Chandra, V. Hadzilacos and S. Toueg, "The Weakest Failure Detector for Solving Consensus", Journal of the ACM, 1996
- M. K. Aguilera, W. Chen, and S. Toueg, "Heartbeat: A Timeout-Free Failure Detector for Quiescent Reliable Communication", Cornell, 1997
- H. S. Sandhu and S. Zhou, "Cluster-based file replication in large-scale distributed systems", ACM SIGMETRICS, 1992
- J. Gray, P. Helland, P. O'Neil and D. Shasha, "The dangers of replication and a solution", ACM SIGMOD, 1996
- A. P. Sistla and J. L. Welch, "Efficient distributed recovery using message logging", ACM SIGOPS, 1989
- I. Giurgiu, O. Riva, D. Juric, I. Krivulev, and G. Alonso, "Calling the Cloud: Enabling mobile phones as interfaces to cloud applications", ACM/IFIP/USENIX Middleware, 2009
- B. Chun, S. Ihm, P. Maniatis, M. Naik, A. Patti, "CloneCloud: Elastic Execution between Mobile Device and Cloud", to appear in Proceedings of the 6th European Conference on Computer Systems (EuroSys 2011), April 2011
- Continue updating...