Homework 4: Bigger files for xv6

In this assignment you'll increase the maximum size of an xv6 file. Currently xv6 files are limited to 140 sectors, or 71,680 bytes. This limit comes from the fact that an xv6 inode contains 12 "direct" block numbers and one "singly-indirect" block number, which refers to a block that holds up to 128 more block numbers, for a total of 12+128=140. You'll change the xv6 file system code to support a "triply-indirect" block in each inode, containing 128 addresses of doubly-indirect blocks, each of which can contain up to 16384 addresses of data blocks.

Question 1
Using the triple indirect system what is the maximum number of blocks one inode can address?
This corresponds to how many bytes on disk?

Question 2

If we were to implement a doubly-indirect system instead of a triple-indirect system what would be the maximum number of blocks one inode could address?
This corresponds to how many bytes on disk?


Modify your Makefile's CPUS definition so that it reads:

CPUS := 1

And add -snapshot to the definition of QEMUOPTS:

QEMUOPTS = -drive file=fs.img,index=1,media=disk,format=raw -drive file=xv6.img,index=0,media=disk,format=raw -smp $(CPUS) -m 512 $(QEMUEXTRA) -snapshot

The above two steps speed up qemu tremendously when xv6 creates large files.

mkfs initializes the file system to have fewer than 1000 free data blocks, too few to show off the changes you'll make. Modify the code at the start of param.h to read (thats 128*128*128 blocks or 1GB)

#define FSSIZE       2097152

You don't really need 1GB since we only be writing 160MB of data, so you can play with this number if generating a file system take too long. Note:
During make mkfs.c has to build a 1GB File system. This takes some time, sometimes up to a minute depening on the machine, be patient.

Download big.c into your xv6 directory, add them to the UPROGS list (same as you did for ps), start up xv6, and run big. It writes 4 files: 16MB, 32MB, 48MB, and 64MB, and then reads them --- this will help you to test your solution. Big writes 1mb at a time, which is significantly faster (and yet it takes quite a bit of time).

What to Look At

The format of an on-disk inode is defined by struct dinode in fs.h. You're particularly interested in NDIRECT, NINDIRECT, MAXFILE, and the addrs[] element of struct dinode. Look here for a diagram of the standard xv6 inode.

Since you're using a triple indirect systemyou will have to add more defines to fs.h.

The code that finds a file's data on disk is in bmap() in fs.c. Have a look at it and make sure you understand what it's doing. bmap() is called both when reading and writing a file. When writing, bmap() allocates new blocks as needed to hold file content, as well as allocating an indirect block if needed to hold block addresses.

bmap() deals with two kinds of block numbers. The bn argument is a "logical block" -- a block number relative to the start of the file. The block numbers in ip->addrs[], and the argument to bread(), are disk block numbers. You can view bmap() as mapping a file's logical block numbers into disk block numbers.

Your Job

Modify bmap() so that it implements a triple-indirect system. The first 12 elements of ip->addrs[] should be direct blocks; the 13th should be a triple-indirect block system. A ascii examble below:
            struct inode
         |               |
         |     type      |
         |               |
	 |     major     |
         |     ....      |
         |               |
         |     size      |
         |    addrs[0]   | -> Physical
         |    addrs[1]   | -> Physical
         |    addrs[2]   | -> Physical
         |    addrs[3]   | -> Physical
         |      ....     |
         |    adrs[12]   | -> 	_________________
         -----------------	|     tbl1[0]   | -> _________________
                                -----------------    |     tbl2[0]   | -> _______________
                                |     .....   	|    -----------------    |   tbl3[0]   | -> Physical
         			-----------------    |     .....     |    ---------------
                                |     .....     |    -----------------    |   .......   |
				-----------------    |    ........   |    ---------------
                                |     .....   	|    -----------------    |   tbl3[127] | -> Physical
         			-----------------    |     .....     |    ---------------
                                |     .....     |    -----------------
				-----------------    |    tbl2[127]  |
                                |     .....   	|    -----------------
                                |     .....   	|    
                                |     .....     |    
                                |     .....   	|   
                                |     .....     |    
                                |     .....   	|  
                                |    tbl1[127]  | ->_________________
				-----------------   |     tbl2[0]   | -> _______________
         					    -----------------    |   tbl3[0]   | -> Physical
         					    |     .....     |    ---------------
						    -----------------    |   .......   |
						    |    ........   |    ---------------
         					    -----------------    |   tbl3[127] | -> Physical
         					    |     .....     |    ---------------
						    |    tbl2[127]  |

One thing to remember is each entry with a ... points to a table as well, I just couldn't make the graph in ascii like that.
Another way to think about it is a 3D array table[128][128][128].

After you have modified bmap() you need to go into mkfs.c and modify iappend(). This c file gets run during the make process and builds the file system image which gets run with QEMU. The code and logic will be extremely similar to the code you had written in bmap().

You will have to implement these changes, these changes allow for you to write large files into the xv6 filesystem.

You don't have to modify xv6 to handle deletion of files with triple-indirect blocks.


Make sure you understand bmap(). Write out a diagram of the relationships between ip->addrs[], the direct blocks, the triple-indirect blocks. Make sure you understand why adding a triple-indirect block increases the maximum file size by such a large factor.

Think about how you'll index the triple-indirect block, and the double blocks and single blocks it points to, with the logical block number.

If your file system gets into a bad state, perhaps by crashing, you may need to delete fs.img (do this from Unix, not xv6).

Don't forget to brelse() each block that you bread().

You should allocate indirect blocks and doubly-indirect blocks only as needed, like the original bmap().

The TA's solution was done in around 70 lines of code

Extra Credit 1 (10%)
Writing a 1gb file to disk in xv6 is incredibly slow. It gets slower the fuller the disk gets(hint). There are two ways to speed up writing a file to disk. In extra_credit.txt explain why when the disk gets fuller writing gets slower-- Explain how you can mitigate this problem. Second, explain the other way to speed up writes and why that works (hint log.c).

Extra Credit 2 (20%)
Implement these changes, in xv6, resolving all panics and failed assertions that arise. Do a crappy timing experiment using a stop watch. Explain how much speed up occured.


Submit your answers to the Class Dropbox (143A HW 4: Big files) as a compressed tar file of your xv6 source tree, and as a separate PDF file with answers for questions and challenges. You can use the following command to create a compressed tar file

vagrant@odin$ cd /vagrant/ics143a
vagrant@odin$ tar -czvf hw4.tgz xv6-public
Before making the tar file do
make clean
then remove any large test files that you have created.
Updated: March, 2017