Welcome to the Schark Research Group

at the University of California, Irvine





Principal Investigator

Isaac D. Scherson

Research Interests

-Parallel Computing Architectures
-Operating Systems for Parallel Computers
-Parallel Algorithms
-Simulation Models
-Performance Evaluation

Research Associates

Graduated Students

PhD

Elisha Caspari
Sener Ilgan
David Kramer
Brian Alleyne
Peter Corbett
Raghu Subramanian
Chi-Kai Chien
Umesh Krishnaswamy
Veronica Reis
Luis Miguel Campos
Vara Ramakrishnan
Fabricio Silva
David Wangerin
Shean T. McMahon
Daniel Valencia

M.S.

Sandeep Sen
Yiming Ma
Li-Wei Gary Chen
John "LastMinute" Zapisek

Current Students

Enrique Cauich
Richert Wang
John Duselis
Martin Pieters


Research Review and Current Research

Our group's research interests fall in the general areas of parallel and distributed computer architectures, concurrent computation, and their applications. The group is currently working on seven main subjects: operating systems for parallel computers, interconnection networks, performance evaluation, parallel algorithms, simulation models, predictive parallelism, and resource management and discovery.

Service Address Routing (SAR)

The idea behind SAR is that nodes are not aware of each other, but only of the functionality available for them to use. Nodes do not explicitly send messages to each other; instead they request instances of a particular function (service). In this manner, addresses do not identify nodes directly, but rather the types of services nodes can provide. The intelligence for finding and selecting the node that will receive a service request (when multiple nodes can provide the service) resides in the network. Although the network is envisioned as a single switching entity, a hierarchical, multi-module implementation is proposed for scalability purposes, using the Least Common Ancestor Network (LCAN) as the interconnection design. The LCAN architecture inherently provides fast simultaneous service discovery and redundancy, which increases performance and fault tolerance within the system. The paradigm has been shown to be feasible using a combined hardware/firmware fabric that can efficiently perform the service-name search needed for discovery. The approach does not use the conventional DNS service; instead it performs the equivalent of an associative, location-independent search.
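As a concrete illustration, the following sketch (in Python, with invented names such as SarSwitch; this is not our implementation) captures the essence of the paradigm: the sender addresses a service type, and the switching fabric, not the sender, selects the provider node.

    import random

    class SarSwitch:
        # Toy SAR fabric: routes by service name, not by node address.
        def __init__(self):
            self.providers = {}                  # service name -> set of node ids

        def register(self, service, node):
            # A node advertises a service it can provide.
            self.providers.setdefault(service, set()).add(node)

        def route(self, service, payload):
            # The selection policy lives in the network, not in the sender.
            nodes = self.providers.get(service)
            if not nodes:
                raise LookupError("no provider for service " + service)
            return random.choice(sorted(nodes)), payload

    switch = SarSwitch()
    switch.register("matrix-multiply", "node-A")
    switch.register("matrix-multiply", "node-B")
    print(switch.route("matrix-multiply", b"request"))   # either node may be chosen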

The LCAN-embedded Hierarchical Service Directory is an embodiment of a SAR network. We have shown the feasibility of this idea and also that it can be efficiently mapped to the amorphous GRID with highly desirable properties such as scalability, fault tolerance, and fully distributed operation.
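A minimal sketch of the hierarchical directory, assuming a simple tree of switch modules (class and method names are illustrative): each switch records the services reachable in its subtree, so a lookup climbs only as far as the least common ancestor that covers a provider.

    class Switch:
        def __init__(self, parent=None):
            self.parent = parent
            self.directory = {}                  # service -> a provider in this subtree

        def advertise(self, service, node):
            # Propagate a service advertisement up toward the root.
            s = self
            while s is not None:
                s.directory.setdefault(service, node)
                s = s.parent

        def lookup(self, service):
            # Climb until some ancestor's subtree contains a provider.
            s = self
            while s is not None:
                if service in s.directory:
                    return s.directory[service]
                s = s.parent
            return None

    root = Switch()
    left, right = Switch(root), Switch(root)
    left.advertise("render", "node-7")
    print(right.lookup("render"))                # resolved at the common ancestor: node-7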

Other advantages, obtained in tightly coupled implementations where the switching network contains the resource-management functionality for the cluster, carry over to loosely coupled systems, whether realized in software, hardware, or both.

Transparent Remote Execution (TREx)


TREx is an environment designed to harvest idle CPU cycles in a network to provide cost-effective, high-performance distributed computing. Clerical workstations in an institutional network are typically underutilized, and processes from overloaded nodes can exploit this idle computing power. TREx is a mechanism that federates lightly used stations to perform useful computation. Furthermore, TREx provides execution speedup, increases productivity, and can create a low-cost, high-performance cluster.

TREx allows programs to execute locally or remotely, without user intervention, on any properly administered workstation. Process execution is transparent to the user: remote execution appears as if it were done locally. We refer to a set of participating computers as a Federated Cluster of Workstations. TREx consists of three components. The first is a daemon that subscribes its computer to the federated cluster when the machine is underutilized and unsubscribes it otherwise; the second is a "server program" that organizes the nodes into hierarchical structures, handling static service discovery and static load distribution inside the federated cluster; the third is a fully distributed resource manager that handles the number of federations defined in a single network, dynamic membership in the federations, and dynamic load distribution and migration within federations.
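The core loop of the subscription daemon might look like the sketch below, assuming a hypothetical cluster API with subscribe/unsubscribe calls and an arbitrary idleness threshold; the real daemon's policy and probes differ.

    import os, time

    IDLE_THRESHOLD = 0.25            # assumed threshold, not TREx's actual policy

    def cpu_load():
        # Placeholder probe: 1-minute load average (POSIX only).
        return os.getloadavg()[0]

    def daemon_loop(cluster, node_id, poll_seconds=5):
        subscribed = False
        while True:
            idle = cpu_load() < IDLE_THRESHOLD
            if idle and not subscribed:
                cluster.subscribe(node_id)      # offer idle cycles to the federation
                subscribed = True
            elif not idle and subscribed:
                cluster.unsubscribe(node_id)    # give the workstation back to its owner
                subscribed = False
            time.sleep(poll_seconds)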

This project involves two major challenges. The first is how to dynamically organize the nodes and structure the network to provide efficient discovery and execution. The second is maintaining secure group membership in the federated cluster, to prevent malicious behavior such as denial-of-service attacks or execution of malicious code.

Low-level Resource Management

In high-speed networks, a remote user can employ an individual resource, such as solid-state memory, without having to utilize the whole remote system. The implementation is done at the driver level.
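A rough illustration of block-granularity access to a single exported resource; the wire format, opcodes, and block size below are assumptions for the sketch, not the actual driver protocol.

    import struct

    BLOCK = 4096
    memory = bytearray(BLOCK * 1024)             # the exported memory region

    def handle(request):
        # Request header: 1-byte opcode (0 = READ, 1 = WRITE) + 4-byte block number.
        op, block = struct.unpack_from("!BI", request)
        offset = block * BLOCK
        if op == 0:
            return bytes(memory[offset:offset + BLOCK])
        memory[offset:offset + BLOCK] = request[5:5 + BLOCK]
        return b"OK"

    handle(struct.pack("!BI", 1, 3) + b"x" * BLOCK)      # write block 3
    print(handle(struct.pack("!BI", 0, 3))[:4])          # read it back -> b'xxxx'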

Advanced Quality of Service (AQoS)

Quality of Service (QoS) is an important feature for real-time applications such as VoIP or streaming media. However, due to the internet's heterogeneous architecture, controlling QoS is challenging when a receiver is located outside a private network and traffic must traverse other service providers' domains. Overlay networks are one solution for improving QoS when sending information outside a controlled network. The Advanced Quality of Service (AQoS) project extends TREx's resource manager by monitoring the network as a resource, using local information to find better-quality routes than the underlying internet backbone provides. This is a novel, purely distributed, adaptive, and scalable solution for improving routes, regardless of a node's physical location on the internet, using an overlay network.
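The route-selection idea can be sketched as follows, assuming each node has latency measurements (stubbed here as constants) to its overlay neighbors and from those neighbors to the destination.

    def best_route(direct_ms, via_neighbor_ms):
        # Pick the direct internet path or a one-relay overlay path, whichever is faster.
        # via_neighbor_ms maps neighbor -> (latency to neighbor, neighbor to destination).
        best, best_ms = None, direct_ms          # None means "use the direct route"
        for neighbor, (hop1, hop2) in via_neighbor_ms.items():
            if hop1 + hop2 < best_ms:
                best, best_ms = neighbor, hop1 + hop2
        return best, best_ms

    print(best_route(120.0, {"relay-A": (30.0, 50.0), "relay-B": (70.0, 80.0)}))
    # -> ('relay-A', 80.0): the overlay path beats the 120 ms direct route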


Research Interests

Operating Systems

The design goal of operating systems for parallel computers is to provide a level of support to the programmer similar to that provided by current uniprocessor operating systems. The programmer programs a virtual machine with as many virtual processors as necessary to exploit the inherent parallelism of the application. The operating system emulates this virtual machine, making parallel programs portable. In this context, various problems are being addressed: spatial and temporal scheduling of virtual processors, efficient synchronization techniques, virtual memory management, and I/O.
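A toy sketch of the virtual-machine view, with simple round-robin time-slicing standing in for the real spatial and temporal scheduling policies:

    def schedule(num_virtual, num_physical, num_slots):
        # Per time slot, map each physical processor to the virtual processor it emulates.
        slots = []
        for t in range(num_slots):
            base = (t * num_physical) % num_virtual
            slots.append({p: (base + p) % num_virtual for p in range(num_physical)})
        return slots

    # Eight virtual processors emulated on two physical ones: four slots cover them all.
    for t, mapping in enumerate(schedule(8, 2, 4)):
        print("slot", t, mapping)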

Interconnection Networks

Our work on interconnection networks for massively parallel systems involves the development of cost-effective, high-performance networks capable of supporting thousands or millions of processing elements. This study includes the performance analysis of Expanded Delta Networks (EDNs) and Least Common Ancestor Networks (LCANs) under commonly occurring sets of processor-to-processor communication patterns. As a result of this effort, efficient off-line routing algorithms for EDNs were developed and applied to commercially available massively parallel computers.
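For intuition about LCA routing: in a complete binary tree whose leaves are addressed in binary, the least common ancestor of two leaves sits at the level of their highest differing address bit, so the climb distance can be computed directly (the addressing scheme here is an assumption for illustration).

    def lca_level(src, dst):
        # Levels a packet must climb before turning back down toward dst.
        return (src ^ dst).bit_length()

    # Leaves 5 (101) and 7 (111) first differ at bit 2, so they meet two levels up;
    # leaves 0 (000) and 7 (111) differ at bit 3, so the route passes through the root.
    print(lca_level(5, 7), lca_level(0, 7))      # -> 2 3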

Performance Evaluation

Current research in performance evaluation deals with the development of models and methodologies for a general supercomputer performance-evaluation theory. Such methods are being developed bottom-up, building on known computational models and benchmarks.
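As a small worked example of the metrics such models build on, speedup and efficiency follow directly from measured benchmark times; the timings below are invented.

    def speedup_efficiency(t_serial, t_parallel, num_procs):
        s = t_serial / t_parallel
        return s, s / num_procs

    s, e = speedup_efficiency(240.0, 20.0, 16)   # hypothetical runtimes in seconds
    print("speedup %.1fx, efficiency %.0f%%" % (s, e * 100))   # 12.0x, 75%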

Parallel Algorithms

Algorithmic topics include parallel models of computation and parallel algorithms. Additionally, building on the development of Shearsort, our research group is seeking a proper taxonomy of parallel sorting. Many similarities can be found among sorting techniques, and a unified framework is needed to enable further advances in this important area of study.
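For reference, a compact sequential model of Shearsort itself on an n x n mesh: alternately sort rows in snake order and columns top-down for about log2(n) + 1 rounds, after which the mesh is sorted in snake order.

    import math

    def shearsort(grid):
        n = len(grid)
        for _ in range(int(math.ceil(math.log2(n))) + 1):
            for i, row in enumerate(grid):       # row phase: alternate directions
                row.sort(reverse=(i % 2 == 1))
            for j in range(n):                   # column phase: top-down
                col = sorted(grid[i][j] for i in range(n))
                for i in range(n):
                    grid[i][j] = col[i]
        for i, row in enumerate(grid):           # final row phase
            row.sort(reverse=(i % 2 == 1))
        return grid

    print(shearsort([[9, 4, 14, 7], [2, 11, 0, 5], [13, 1, 8, 15], [6, 12, 3, 10]]))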

Simulation Models

Current research focuses on the development of advanced simulation models and techniques. Topics of interest include general-purpose discrete-event simulation, synthetic workload generation, performance evaluation, and the analysis of scheduling and load-balancing algorithms through simulation.
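A minimal discrete-event kernel of the kind such simulators rest on: a priority queue of timestamped events driven in time order. The two-server queueing workload here is invented for illustration.

    import heapq

    def simulate(arrivals, service_time, servers=2):
        # Jobs arrive at the given times; the earliest-free server takes each one.
        events = [(t, "arrival") for t in arrivals]
        heapq.heapify(events)
        free_at = [0.0] * servers                # when each server next becomes idle
        finished = []
        while events:
            now, kind = heapq.heappop(events)
            if kind == "arrival":
                s = min(range(servers), key=lambda i: free_at[i])
                start = max(now, free_at[s])
                free_at[s] = start + service_time
                heapq.heappush(events, (free_at[s], "departure"))
            else:
                finished.append(now)
        return finished

    print(simulate([0.0, 0.5, 1.0, 1.5], service_time=2.0))   # -> [2.0, 2.5, 4.0, 4.5]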

Synthetic Workload Generation

The NASA Remote Exploration and Experimentation (REE) project is interested in supporting academic research in areas needed to facilitate the development of fault-tolerant, COTS-based parallel processors for use in space. The Schark research group is investigating the development of synthetic workload models for use in characterizing REE system performance over a wide range of application types and fault conditions.
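A hedged sketch of what such a generator can look like; the interarrival, service-demand, and fault parameters below are illustrative assumptions, not REE characteristics.

    import random

    def synthetic_workload(num_tasks, mean_interarrival=1.0, fault_rate=0.05, seed=42):
        rng = random.Random(seed)                # fixed seed for reproducible workloads
        t, tasks = 0.0, []
        for i in range(num_tasks):
            t += rng.expovariate(1.0 / mean_interarrival)    # Poisson-like arrivals
            tasks.append({
                "id": i,
                "arrival": round(t, 3),
                "work": rng.lognormvariate(0.0, 1.0),        # heavy-tailed demand
                "faulty": rng.random() < fault_rate,         # injected fault marker
            })
        return tasks

    for task in synthetic_workload(3):
        print(task)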


Software

Load Balancing/Scheduling Simulator

Selected Publications

School of Information and Computer Science,
University of California, Irvine CA 92697-3425