Project 2 for CS222 (Principles of Database Management)

Professor Chen Li
Fall 2007, UC Irvine

We have made some changes to the source code of the file system. Please download the new version from here.

Index

  1. Introduction
  2. IX Interface
  3. Implementation Details

1. Introduction

In this project you will implement an Indexing (IX) component. The IX component provides classes and methods for managing persistent indexes over unordered data records stored in files. Each data file may have any number of (single-attribute) indexes associated with it. The indexes ultimately will be used to speed up processing of relational selections, joins, and condition-based update and delete operations. Like the data records themselves, the indexes are also stored in files. Hence, in implementing the IX component, you will use the file system provided in project 0, similarly to the way you used it for project 1. In this database system architecture, you can think of the IX component and the record manager as sitting side by side above the file system.

The indexing technique you will implement in the IX component is B+ trees. Because a "perfect" implementation of B+ trees turns out to be quite complex, we are allowing some simplifications as discussed in the Implementation Details section below.

All class names, return codes, constants, etc. in this component should begin with the prefix IX. Each B+ tree index can be stored in one file in the extent-based file system (called "File Manager"). Some specific implementation suggestions are given later in this document.

You can download the source code.

Note: You can certainly find pseudo code and perhaps even software packages for B+ trees available publicly. Here is an ACM SIGMOD Record paper specifying B+ tree deletion algorithms (pdf). You are welcome to use anything you find, as long as you provide proper acknowledgment in your readme file when you turn in this part of the project. However, we do warn against simply copying available code and then trying to modify it to fit the specification of the project. That approach is very likely to be more difficult than using available code or pseudo code for algorithmic ideas and reference.

2. IX Interface

The IX interface you will implement consists of three classes: the IX_Manager class, the IX_IndexHandle class, and the IX_IndexScan class. In addition, there is an IX_PrintError routine for printing messages associated with nonzero IX return codes. As usual, all IX component public methods (except constructors and destructors) should return 0 if they complete normally and a nonzero return code otherwise.

2.1 IX_Manager Class

The IX_Manager class handles the creation, deletion, opening, and closing of indexes. Your program should create exactly one instance of this class. All necessary initialization of the IX component should take place within the constructor for the IX_Manager class. This constructor takes a disk name for the file system. Any necessary clean-up in the IX component should take place within the destructor for the IX_Manager class.
class IX_Manager {
 public:
  IX_Manager   (const char *diskName);         // Constructor
  ~IX_Manager  ();                             // Destructor

  RC CreateIndex(const char *tableName,      // create new index
		 const char *attributeName);

  RC DestroyIndex(const char *tableName,      // destroy an index
		  const char *attributeName);

  RC OpenIndex(const char *tableName,      // open an index
	       const char *attributeName,
	       IX_IndexHandle &indexHandle);

  RC CloseIndex(IX_IndexHandle &indexHandle); // close index
};

RC CreateIndex(const char *tableName, const char *attributeName);

This method creates an index on a given attribute of a given table. The index should be stored in a file with a unique name for this attribute.

RC DestroyIndex(const char *tableName, const char *attributeName);

This method should destroy the index on the attribute of the table (if any).

RC OpenIndex(const char *tableName, const char *attributeName, IX_IndexHandle &indexHandle);

This method should open the index on the specified attribute. If the method is successful, the indexHandle object should become a handle for the open index. The index handle is used to insert into and delete entries from the index (see the IX_IndexHandle methods below), and it can be passed into an IX_IndexScan constructor (see below) for performing a scan using the index. As with RM component files, clients should be able to open an index more than once for reading using a different indexHandle object each time. However, you may make the assumption (without checking it) that if a client is modifying an index, then no other clients are using an indexHandle to read or modify that index.

2.2 IX_IndexHandle Class

The IX_IndexHandle class is used to insert and delete index entries. To perform these operations, a client first creates an instance of this class and passes it to the IX_Manager::OpenIndex method described above. In the following, "RID" is a type defined for record ids.
class IX_IndexHandle {
  public:
       IX_IndexHandle  ();                             // Constructor
       ~IX_IndexHandle ();                             // Destructor

       RC InsertEntry(void *key, const RID &rid);  // Insert new index entry
       RC DeleteEntry(void *key, const RID &rid);  // Delete index entry
 };

RC InsertEntry(void *key, const RID &rid); // Insert new index entry

For this and the following two methods, it is incorrect if the IX_IndexHandle object for which the method is called does not refer to an open index. This method should insert a new entry into the index associated with IX_IndexHandle. Parameter key points to the attribute value to be inserted into the index, and parameter rid identifies the record with that value to be added to the index. Hence, this method effectively inserts an entry for the pair (*key,rid) into the index. (The index should contain only the record's RID, not the record itself.) The format for the passed key value is the following: (1) For INT and REAL: use 4 bytes; (2) For CHAR and VARCHAR: use 4 bytes for the length followed by the characters. This method should return a nonzero code if there is already an entry for (*key,rid) in the index.

Notice this function and the following one only deal with the index, and they do NOT modify the record in the record manager. They should not change the record id.

RC DeleteEntry (void *key, const RID &rid)

This method should delete the entry for the (*key,rid) pair from the index associated with IX_IndexHandle. Although clients of the IX Component typically will ensure that DeleteEntry is not called for entries that are not in the index, for debugging purposes you may want to return a (positive) error code if such a call is made.

2.3 IX_IndexScan Class

The IX_IndexScan class is used to perform condition-based scans over the entries of an index.
class IX_IndexScan {
 public:
  IX_IndexScan();  // Constructor
  ~IX_IndexScan(); // Destructor

  RC OpenScan(const IX_IndexHandle &indexHandle, // Initialize index scan
	      CompOp      compOp,
	      void        *value);           

  RC GetNextEntry(RID &rid);  // Get next matching entry
  RC CloseScan();             // Terminate index scan
};

RC OpenScan (const IX_IndexHandle &indexHandle, CompOp compOp, void *value)

This method should initialize a condition-based scan over the entries in the open index referred to by parameter indexHandle. Once underway, the scan should produce the RIDs of all records whose indexed attribute value compares in the specified way with the specified value. The different values for compOp are defined in rm/rm.h.

The only exception is that for B+ tree scans, you may choose to disallow comparison operator NE_OP (not-equal). The format of the parameter value is the same as that in IX_IndexHandle::InsertEntry().

RC GetNextEntry (RID &rid)

This method should set output parameter rid to be the RID of the next record in the index scan. This method should return IX_EOF (a positive return code that you define) if there are no index entries left satisfying the scan condition. You may assume that IX component clients will not close the corresponding open index while a scan is underway.

RC CloseScan ()

This method should terminate the index scan.

2.4 IX_PrintError

void IX_PrintError (RC rc);

This routine should write a message associated with the nonzero IX return code rc onto the stderr output stream. This routine has no return value.

3. Implementation Details


Acknowledgements: Part of this project was based on course materials of Prof. Jennifer Widom at Stanford University.

For any problems, questions or suggestions about this page, please contact chenli + AT + ics.uci.edu. rev. Thursday, November 15, 2007 - 09:53:40