CS237 Distributed Systems Middleware

Lecture Notes

  1. Lecture 1: Middleware and Distributed Systems Fundamentals
  2. Lecture 2: Virtual Time and Global States in Distributed Systems.
  3. Lecture 3: Distributed Operating Systems Concepts.
  4. Lecture 4: Distributed OS Case Studies (Amoeba).
  5. Lecture 5: Messaging Middlewares, Messaging Group, Distributed Pub/Sub
  6. Lecture 6: Fault-Tolerance, Middleware Frameworks: DCE
  7. Midterm: Midterm Review, Sample
  8. Lecture 7: Middleware Frameworks: CORBA 
  9. Lecture 8: Middleware Frameworks: Java-based Technologies, Jini, EJB
  10. Lecture 9: Middleware Frameworks: XML, Web Services, Service Oriented Architectures
  11. Lecture 10: Middleware for Cloud Computing 
  12. Middleware for QoS-Enabled Environments
  13. Middleware for Embedded Environments
  14. Middleware for Secure Environments
  15. Middleware for Mobile and Ubiquitous Environments

Course Reading Materials

How to read a paper:

 

Reference Books:

 

Middleware and Distributed Systems Fundamentals (NO REVIEW REQUIRED):

  1. "Middleware", David E. Bakken, Encyclopedia of Distributed Computing, Kluwer Academic Publishers.
  2. "Managing Complexity: Middleware Explained", Andrew T. Campbell, Geoff Coulson, and Michael E. Kounavis, IT Professional, Vol. 1, No. 5, September/October 1999.
  3. "Middleware: a model for distributed system services", Philip A. Bernstein, Commun. ACM 39, 2 (Feb. 1996), pages 86-98.

 

Virtual Time and Global State in Distributed Systems (review required, any TWO papers):

  1. Lamport, "Time, Clocks and the Ordering of Events in a Distributed System", Communications of the ACM, 1978
  2. M. Chandy and L. Lamport, "Distributed Snapshots: Determining Global States of Distributed Systems", ACM Transactions on Computer Systems, 1985
  3. Jefferson, "Distributed Simulation and the Time Warp Operating System", ACM Symposium on Operating Systems Principles, 1987.
  4. Mattern, "Virtual Time and Global States of Distributed Systems", Proc. Workshop on Parallel and Distributed Algorithms, 1989
  5. C. Fetzer and F. Cristian, "An optimal internal clock synchronization algorithm", COMPASS 1995
  6. F. Cristian and C. Fetzer, "Fault-tolerant external clock synchronization", ICDCS 1995
  7. Kshemkalyani, M. Raynal and M. Singhal, "An introduction to snapshot algorithms in distributed computing"
  8. C. J. Fidge, Timestamps in message-passing systems that preserve the partial ordering, Australian Computer Sci. Comm. 10 (1) (February 1988) 56-66.
  9. C. J. Fidge, Fundamentals of distributed system observation, IEEE Software 13 (6) (November 1996) 77-83.
  10. Raynal M. and Singhal M., Logical time: Capturing causality in distributed systems, Computer, vol. 29, pp. 49-56, 1996.
  11. C. J. Fidge, A limitation of vector timestamps for reconstructing distributed computations, Information Processing Letters, Elsevier Science, 1998, 87-91.
  12. Mukesh Singhal and Ajay Kshemkalyani, An efficient implementation of vector clocks, Elsevier Science Publishers.
  13. Facebook's Cassandra uses synchronized clocks for its 'Last Write Wins' policy for conflict resolution
  14. Spanner: Google’s Globally-Distributed Database estimates worst-case clock drift.
  15. LinkedIn's Project Voldemort uses vector clocks for versioning, conflict resolution, and repairing replicas.

 

Distributed Operating Systems (review required: any TWO papers):

Remote Procedure Calls and Distributed Shared Memory:

  1. Birrell, and B. Nelson, "Implementing remote procedure calls", ACM Transactions on Computer Systems, 1984
  2. G. Soares, "On remote procedure call", Proc. of the 1992 conference of the Centre for Advanced Studies on Collaborative research, 1992
  3. L. Ananda, B. H. Tay and E. K. Koh, "A survey of asynchronous remote procedure calls", SIGOPS Operating Systems Review, 1992
  4. A lecture on RPC, "http://www.cs.cf.ac.uk/Dave/C/node33.html"

Mutual Exclusion:

  1. Ricart and A. Agrawala, "An optimal algorithm for mutual exclusion in computer networks", Communications of the ACM, 1981
  2. Lamport, "Mutual Exclusion Problem": "part 1", "part 2", Journal of the ACM, 1986
  3. Lamport, "A Fast Mutual Exclusion Algorithm", ACM Transactions on Computer Systems, 1987
  4. Raymond, "A Tree Based Algorithm for Distributed Mutual Exclusion", ACM Transactions on Computer Systems, 1989

Leader Election:

  1. Garcia-Molina, "Elections in a Distributed Computing System"

Distributed Deadlocks:

  1. K. Elmagarmid, "A survey of distributed deadlock detection algorithms", ACM SIGMOD, 1986
  2. Singhal, "Deadlock detection in distributed systems", IEEE Computer, 1989

Distributed File Systems:

  1. Satyanarayanan, "A Survey of Distributed File Systems", Annual Review of Computer Science, 1989
  2. Noble and M. Satyanarayanan, "An Empirical Study of a Highly Available File System", ACM Sigmetrics, 1994
  3. Spasojevic and M. Satyanarayanan, "An Empirical Study of a Wide-Area Distributed File System", ACM Transactions on Computer Systems, 1996
  4. Kubiatowicz, "OceanStore : An Architecture for Global-Scale Persistent Storage", ACM ASPLOS 2000
  5. S. Ghemawat, H. Gobioff, and S.-T. Leung, "The Google File System", ACM SOSP, 2003.

Process Migration

  1. M. Smith, "A survey of process migration mechanisms", ACM SIGOPS Operating Systems, 1988.
  2. A. Barak, O. Laden, Y. Yarom, "The NOW MOSIX and its preemptive process migration scheme", 1995.

Processing and Load Balancing:

  1. H. Willebeek-LeMair, A. P. Reeves, "Strategies for Dynamic Load Balancing on Highly Parallel Computers", IEEE Transactions on Parallel and Distributed Systems, 1993
  2. Venkatasubramanian, S. Ramanathan, "Load Management in Distributed Video Servers", ICDCS 1997
  3. Cardellini, M. Colajanni, "Dynamic Load Balancing on Web-server Systems", Journal IEEE Internet Computing, 1999
  4. Schnekenburger, "Load Balancing in CORBA: A Survey, Response to the Aggregated Computing RFI".

Distributed Operating Systems:

  1. J. Bolosky, R. P. Draves, R. P. Fitzgerald, C. W. Fraser, M. B. Jones, T. B. Knoblock and R. Rashid, "Operating System Directions for the Next Millennium", Proc. of the 6th Workshop on Hot Topics in Operating Systems, 1997
  2. Rozier, V. Abrossimov, F. Armand et al., Overview of the Chorus Distributed Operating System
  3. Andrew S. Tanenbaum, M. Frans Kaashoek, Robert van Renesse, Henri E. Bal, The Amoeba Distributed Operating System - A Status Report

 

Messaging Technologies (review required: any TWO papers):

  1. A Case for Message Oriented Middleware, G. Banavar et al.
  2. Dolev and D. Malkhi, "The Transis Approach to High Availability Cluster Communication". Other interesting reading: documentation and papers about Transis are also available at "http://www.cs.huji.ac.il/labs/transis/"
  3. Amir, et al, "Group Communication as an Infrastructure for Distributed System Management", Proc. of the 3rd Workshop on Services in Distributed and Networked Environments, 1996
  4. Amir, et al, "The Spread Wide Area Group Communication System"
  5. V. Renesse, K. P. Birman, and S. Maffeis, "Horus: A Flexible Group Communication System", Communications of the ACM, 1996
  6. Banerjee, B. Bhattacharjee and C. Kommareddy, "Scalable Application Layer Multicast", ACM SIGCOMM 2002
  7. Amir, C. Nita-Rotaru, J. Stanton, G. Tsudik, "Secure Spread: An Integrated Architecture for Secure Group Communication", IEEE Transactions on Dependable and Secure Computing, 2005
  8. The Many Faces of Publish/Subscribe, Patrick Th. Eugster et al.

 

Fault Tolerance and Reliability (review required: any TWO papers):

 Consensus

  1. J. Fischer, N. A. Lynch, and M. S. Paterson, "Impossibility of Distributed Consensus with One Faulty Process", Journal of ACM, 1985
  2. Dolev, C. Dwork, L. Stockmeyer, "On the Minimal Synchronism Needed for Distributed Consensus", Journal of ACM, 1987.

 Failure Detectors

  1. D. Chandra and S. Toueg, "Unreliable Failure Detectors for Reliable Distributed Systems", Journal of ACM, 1996
  2. D. Chandra, V. Hadzilacos and S. Toueg, "The Weakest Failure Detector for Solving Consensus", Journal of ACM, 1996
  3. K. Aguilera, W. Chen, and S. Toueg, "Heartbeat: A Timeout-Free Failure Detector for Quiescent Reliable Communication", Cornell, 1997

 Replication

  1. S. Sandhu and S. Zhou, "Cluster-based file replication in large-scale distributed systems", ACM SIGMETRICS, 1992
  2. Gray, P. Helland, P. O'Neil and D. Shasha, "The dangers of replication and a solution", ACM SIGMOD, 1996

 Logging

  1. P. Sistla and J. L. Welch, "Efficient distributed recovery using message logging", ACM SIGOPS, 1989

 

Middleware Frameworks (NO REVIEW REQUIRED):

Distributed Computing Frameworks:

  1. DCE
  2. The DCE security service, Hewlett-Packard Journal, 1995.
  3. MapReduce: simplified data processing on large clusters
  4. Hadoop: The Hadoop Distributed File System: Architecture and Design
  5. Yahoo! Hadoop Tutorial

Object-based Middleware:

  1. CORBA specification, www.omg.org
  2. RT CORBA: Real-time CORBA
  3. Fault tolerance CORBA: A Fault Tolerance Framework for CORBA
  4. ZEN: Optimizing the ORB Core to Enhance Real-time CORBA Predictability and Performance
  5. Data Access and Integration: ODBC/JDBC
  6. Java Jini: "Architectural Overview", Sun Microsystems
  7. Java RMI: "Java RMI Tutorial"
  8. EJB: "Enterprise JavaBeans Technology", Sun Developer Network
  9. J2EE: "Overview", Sun Developer Network

Service Oriented Architectures and Web Services:

  1. Web services: "Part of the lectures" by M. Fisher
  2. .NET: "The .NET Framework"
  3. SOAP Web Service: (http://www.w3.org/TR/soap/)
  4. A comparison of SOAP and REST implementations of a service based interaction independence middleware framework
  5. SOAP-binQ: high-performance SOAP with continuous quality management
  6. SOA: Service-Oriented Computing: State of the Art and Research Challenges
  7. RESTful Web Services: the original work was done by Roy Fielding at UCI as his Ph.D. thesis: (http://roy.gbiv.com/vita.html)
  8. Principled design of the modern Web architecture
  9. Middleware queues for job submission, messaging, etc.

 

Cloud Computing, Mobile Cloud Computing Platforms (NO REVIEW REQUIRED):

  1. Giurgiu, O. Riva, D. Juric, I. Krivulev, and G. Alonso, "Calling the Cloud: Enabling mobile phones as interfaces to cloud applications", Middleware 2009.
  2. Chun, S. Ihm, P. Maniatis, M. Naik, A. Patti, "CloneCloud: Elastic Execution between Mobile Device and Cloud", To appear in Proceedings of the 6th European Conference on Computer Systems (EuroSys 2011), April 2011.
  3. Above the Clouds: A Berkeley View of Cloud Computing: Technical Report No. UCB/EECS-2009-28.
  4. Wen, W. Zhang, and H. Luo, "Energy Optimal Mobile Application Execution: Taming Resource-Poor Mobile Devices with Cloud Clones", In IEEE INFOCOM 2012.
  5. Michael P. Papazoglou, "Cloud Blueprints for Integrating and Managing Cloud Federations", In Springer Software Service and Application Engineering, 2012.
  6. Tobias Kurze, Markus Klems, David Bermbach, Alexander Lenk, Stefan Tai and Marcel Kunze, "Cloud Federation".
  7. "Towards Characterizing Cloud Backend Workloads: Insights from Google Compute Clusters".
  8. "CloudNaaS: A Cloud Networking Platform for Enterprise Applications".
  9. "Effects of virtualization and cloud computing on data center networks".
  10. "The Hadoop Distributed File System: Architecture and Design".
  11. "MapReduce: Simplified Data Processing on Large Clusters".
  12. "The Case for Enterprise-Ready Virtual Private Clouds".
  13. What is (isn't) Google App Engine?, https://developers.google.com/appengine/training/intro/whatisgae
  14. Introducing Azure, http://azure.microsoft.com/en-us/documentation/articles/fundamentals-introduction-to-azure/