CS237 Distributed Systems Middleware
Lecture Notes
- Lecture 1: Middleware and Distributed Systems Fundamentals
- Lecture 2: Virtual Time and Global States in Distributed Systems.
- Lecture 3: Distributed Operating Systems Concepts.
- Lecture 4: Distributed OS Case Studies (Amoeba).
- Lecture 5: Messaging Middlewares, Messaging Group, Distributed Pub/Sub
- Lecture 6: Fault-Tolerance, Middleware Frameworks: DCE
- Midterm: Midterm Review, Sample
- Lecture 7: Middleware Frameworks: CORBA
- Lecture 8: Middleware Frameworks: Java-based Technologies, Jini, EJB
- Lecture 9: Middleware Frameworks: XML, Web Services, Service Oriented Architectures
- Lecture 10: Middleware for Cloud Computing
- Middleware for QoS-Enabled Environments
- Middleware for Embedded Environments
- Middleware for Secure Environments
- Middleware for Mobile and Ubiquitous Environments
Course Reading Materials
How to read a paper:
- How to Read a Paper, S. Keshav, David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada
Reference Books:
- Coulouris et al., Distributed Systems: Concepts & Design, 4th ed. ISBN: 0-321-26354-5.
- Tanenbaum & van Steen, Distributed Systems: Principles and Paradigms, 2nd ed. ISBN: 0-132-39227-5.
- Ben-Ari, Principles of Concurrent and Distributed Programming, Prentice-Hall International Series in Computer Science, 1990.
- Sape Mullender, Distributed Systems, Second Edition, Addison-Wesley, 1998.
- Hagit Attiya and Jennifer Welch, Distributed Computing: Fundamentals, Simulations and Advanced Topics, McGraw Hill, 1998.
- Robert Orfali and Dan Harkey, Client/Server Programming with Java and CORBA, Second Edition, John Wiley and Sons Inc., 1998
Middleware and Distributed Systems Fundamentals (NO REVIEW REQUIRED):
- Middleware, David E. Bakken, Encyclopedia of Distributed Computing, Kluwer Academic Publishers.
- Managing Complexity: Middleware Explained, Andrew T. Campbell, Geoff Coulson, and Michael E. Kounavis, IT Professional, Vol. 1, No. 5, September/October 1999.
- Middleware: A Model for Distributed System Services, Philip A. Bernstein, Commun. ACM 39, 2 (Feb. 1996), Pages 86-98
Virtual Time and Global State in Distributed Systems (review required, any TWO papers):
- Lamport, "Time, Clocks and the Ordering of Events in a Distributed System", Communications of the ACM, 1978
- M. Chandy and L. Lamport, "Distributed Snapshots: Determining Global States of Distributed Systems", ACM Transactions on Computer Systems, 1985
- Jefferson, "Distributed Simulation and the Time Warp Operating System", ACM Symposium on Operating Systems Principles, 1987.
- Mattern, "Virtual Time and Global States of Distributed Systems", Proc. Workshop on Parallel and Distributed Algorithms, 1989
- Fetzer and F. Cristian, "An optimal internal clock synchronization algorithm", COMPASS 1995
- Cristian and C. Fetzer, "Fault-tolerant external clock synchronization", ICDCS 1995
- Kshemkalyani, M. Raynal and M. Singhal, "An introduction to snapshot algorithms in distributed computing"
- C. J. Fidge, Timestamps in message-passing systems that preserve the partial ordering, Australian Computer Sci. Comm. 10 (1) (February 1988) 56-66.
- C. J. Fidge, Fundamentals of distributed system observation, IEEE Software 13 (6) (November 1996) 77-83.
- Raynal M. and Singhal M., Logical time: Capturing causality in distributed systems, Computer, vol. 29, pp. 49-56, 1996.
- C. J. Fidge, A limitation of vector timestamps for reconstructing distributed computations, Information Processing Letters, Elsevier Science, 1998, 87-91.
- Mukesh Singhal and Ajay Kshemkalyani, An efficient implementation of vector clocks, Elsevier Science Publishers.
- Facebook's Cassandra uses synchronized clocks for its 'Last Write Wins' policy for conflict resolution
- Spanner: Google’s Globally-Distributed Database estimates worst-case clock drift.
- LinkedIn's Project Voldemort uses vector clocks for versioning, conflict resolution, and repairing replicas.
Distributed Operating Systems (review required: any TWO papers):
Remote Procedure Calls and Distributed Shared Memory:
- Birrell and B. Nelson, "Implementing remote procedure calls", ACM Transactions on Computer Systems, 1984
- G. Soares, "On remote procedure call", Proc. of the 1992 conference of the Centre for Advanced Studies on Collaborative research, 1992
- L. Ananda, B. H. Tay and E. K. Koh, "A survey of asynchronous remote procedure calls", SIGOPS Operating Systems Review, 1992
- A lecture on RPC, "http://www.cs.cf.ac.uk/Dave/C/node33.html"
Mutual Exclusion:
- Ricart and A. Agrawala, "An optimal algorithm for mutual exclusion in computer networks", Communications of the ACM, 1981
- Lamport, "The Mutual Exclusion Problem": "part 1", "part 2", Journal of the ACM, 1986
- Lamport, "A Fast Mutual Exclusion Algorithm", ACM Transactions on Computer Systems, 1987
- Raymond, "A Tree Based Algorithm for Distributed Mutual Exclusion", ACM Transactions on Computer Systems, 1989
Leader Election:
- Garcia-Molina, "Elections in a Distributed Computing System"
Distributed Deadlocks:
- K. Elmagarmid, "A survey of distributed deadlock detection algorithms", ACM SIGMOD, 1986
- Singhal, "Deadlock detection in distributed systems", IEEE Computer, 1989
Distributed File Systems:
- Satyanarayanan, "A Survey of Distributed File Systems", Annual Review of Computer Science, 1989
- Noble and M. Satyanarayanan, "An Empirical Study of a Highly Available File System", ACM Sigmetrics, 1994
- Spasojevic and M. Satyanarayanan, "An Empirical Study of a Wide-Area Distributed File System", ACM Transactions on Computer Systems, 1996
- Kubiatowicz, "OceanStore : An Architecture for Global-Scale Persistent Storage", ACM ASPLOS 2000
- S. Ghemawat, H. Gobioff and S.-T. Leung, "The Google File System", ACM SOSP, 2003.
Process Migration:
- M. Smith, "A survey of process migration mechanisms", ACM SIGOPS Operating Systems Review, 1988.
- A. Barak, O. Laden, Y. Yarom, "The NOW MOSIX and its preemptive process migration scheme", 1995.
Processing and Load Balancing:
- H. Willebeek-LeMair, A. P. Reeves, "Strategies for Dynamic Load Balancing on Highly Parallel Computers", IEEE Transactions on Parallel and Distributed Systems, 1993
- Venkatasubramanian, S. Ramanathan, "Load Management in Distributed Video Servers", ICDCS 1997
- Cardellini, M. Colajanni, "Dynamic Load Balancing on Web-server Systems", IEEE Internet Computing, 1999
- Schnekenburger, "Load Balancing in CORBA: A Survey, Response to the Aggregated Computing RFI".
Distributed Operating Systems:
- J. Bolosky, R. P. Draves, R. P. Fitzgerald, C. W. Fraser, M. B. Jones, T. B. Knoblock and R. Rashid, "Operating System Directions for the Next Millennium", Proc. of the 6th Workshop on Hot Topics in Operating Systems, 1997
- Rozier, V. Abrossimov, F. Armand et al., "Overview of the Chorus Distributed Operating System"
- Andrew S. Tanenbaum, M. Frans Kaashoek, Robert van Renesse, Henri E. Bal, The Amoeba Distributed Operating System - A Status Report
Messaging Technologies (review required: any TWO papers):
- A Case for Message Oriented Middleware, G. Banavar et al.
- Dolev and D. Malkhi, "The Transis Approach to High Availability Cluster Communication". Other interesting reading: documentation and papers about Transis are also available at http://www.cs.huji.ac.il/labs/transis/
- Amir, et al, "Group Communication as an Infrastructure for Distributed System Management", Proc. of the 3rd Workshop on Services in Distributed and Networked Environments, 1996
- Amir, et al, "The Spread Wide Area Group Communication System"
- V. Renesse, K. P. Birman, and S. Maffeis, "Horus: A Flexible Group Communication System", Communications of the ACM, 1996
- Banerjee, B. Bhattacharjee and C. Kommareddy, "Scalable Application Layer Multicast", ACM SIGCOMM 2002
- Amir, C. Nita-Rotaru, J. Stanton, G. Tsudik, "Secure Spread: An Integrated Architecture for Secure Group Communication", IEEE Transactions on Dependable and Secure Computing, 2005
- The Many Faces of Publish/Subscribe, Patrick Th. Eugster
Fault Tolerance and Reliability (review required: any TWO papers):
Consensus
- J. Fischer, N. A. Lynch, and M. S. Paterson, "Impossibility of Distributed Consensus with One Faulty Process", Journal of the ACM, 1985
- Dolev, C. Dwork, L. Stockmeyer, "On the Minimal Synchronism Needed for Distributed Consensus", Journal of the ACM, 1987.
Failure Detectors
- D. Chandra and S. Toueg, "Unreliable Failure Detectors for Reliable Distributed Systems", Journal of the ACM, 1996
- D. Chandra, V. Hadzilacos and S. Toueg, "The Weakest Failure Detector for Solving Consensus", Journal of the ACM, 1996
- K. Aguilera, W. Chen, and S. Toueg, "Heartbeat: A Timeout-Free Failure Detector for Quiescent Reliable Communication", Cornell, 1997
Replication
- S. Sandhu and S. Zhou, "Cluster-based file replication in large-scale distributed systems", ACM SIGMETRICS, 1992
- Gray, P. Helland, P. O'Neil and D. Shasha, "The dangers of replication and a solution", ACM SIGMOD, 1996
Logging
- P. Sistla and J. L. Welch, "Efficient distributed recovery using message logging", ACM SIGOPS, 1989
Middleware Frameworks (NO REVIEW REQUIRED):
Distributed Computing Frameworks:
- DCE
- The DCE security service, Hewlett-Packard Journal, 1995.
- MapReduce: simplified data processing on large clusters
- Hadoop: The Hadoop Distributed File System: Architecture and Design
- Yahoo! Hadoop Tutorial
Object-based Middleware:
- CORBA specification, www.omg.org
- RT CORBA: Real-time CORBA
- Fault-tolerant CORBA: A Fault Tolerance Framework for CORBA
- ZEN: Optimizing the ORB Core to Enhance Real-time CORBA Predictability and Performance
- Data Access and Integration: ODBC/JDBC
- Java Jini: "Architectural Overview", Sun Microsystems
- Java RMI: "Java RMI Tutorial"
- EJB: "Enterprise JavaBeans Technology", Sun Developer Network
- J2EE: "Overview", Sun Developer Network
Service Oriented Architectures and Web Services:
- Web services: "Part of the lectures" by M. Fisher
- .NET: "The .NET Framework"
- SOAP Web Service: (http://www.w3.org/TR/soap/)
- A comparison of SOAP and REST implementations of a service based interaction independence middleware framework
- SOAP-binQ: high-performance SOAP with continuous quality management
- SOA: Service-Oriented Computing: State of the Art and Research Challenges
- RESTful Web Services: the original work was done by Roy Fielding at UCI as his Ph.D. thesis: (http://roy.gbiv.com/vita.html)
- Principled design of the modern Web architecture
- Middleware queues for job submission, messaging, etc.
Cloud Computing, Mobile Cloud Computing Platforms (NO REVIEW REQUIRED):
- Giurgiu, O. Riva, D. Juric, I. Krivulev, and G. Alonso, Calling the Cloud: Enabling mobile phones as interfaces to cloud applications, Middleware 2009.
- Chun, S. Ihm, P. Maniatis, M. Naik, A. Patti, CloneCloud: Elastic Execution between Mobile Device and Cloud, Proceedings of the 6th European Conference on Computer Systems (EuroSys 2011), April 2011.
- Above the Clouds: A Berkeley View of Cloud Computing: Technical Report No. UCB/EECS-2009-28.
- Wen, W. Zhang, and H. Luo, "Energy Optimal Mobile Application Execution: Taming Resource-Poor Mobile Devices with Cloud Clones", IEEE INFOCOM 2012.
- Michael P. Papazoglou, "Cloud Blueprints for Integrating and Managing Cloud Federations", In Springer Software Service and Application Engineering, 2012.
- Tobias Kurze, Markus Klems, David Bermbach, Alexander Lenk, Stefan Tai and Marcel Kunze, "Cloud Federation".
- "Towards Characterizing Cloud Backend Workloads: Insights from Google Compute Clusters".
- "CloudNaaS: A Cloud Networking Platform for Enterprise Applications".
- "Effects of virtualization and cloud computing on data center networks".
- "The Hadoop Distributed File System: Architecture and Design".
- "MapReduce: Simplified Data Processing on Large Clusters".
- "The Case for Enterprise-Ready Virtual Private Clouds".
- What is (isn't) Google App Engine?, https://developers.google.com/appengine/training/intro/whatisgae
- Introducing Azure, http://azure.microsoft.com/en-us/documentation/articles/fundamentals-introduction-to-azure/