143A Principles of Operating Systems

In this assignment you'll increase the maximum size of an xv6 file. Currently xv6 files are limited to 140 sectors, or 71,680 bytes. This limit comes from the fact that an xv6 inode contains 12 "direct" block numbers and one "singly-indirect" block number, which refers to a block that holds up to 128 more block numbers, for a total of 12+128=140. You'll change the xv6 file system code to support a "triply-indirect" block in each inode, containing 128 addresses of doubly-indirect blocks, each of which can contain up to 16384 addresses of data blocks.

Question 1
Using the triple indirect system what is the maximum number of blocks one inode can address?
This corresponds to how many bytes on disk?

Question 2

If we were to implement a doubly-indirect system instead of a triple-indirect system what would be the maximum number of blocks one inode could address?
This corresponds to how many bytes on disk?

Preliminaries

mkfs initializes the file system to have fewer than 1000 free data blocks, too few to show off the changes you'll make. Modify the code at the start of param.h to read (thats 128*128*128 blocks or 1GB)

You don't really need 1GB since we only be writing 160MB of data, so you can play with this number if generating a file system take too long. Note:
During make mkfs.c has to build a 1GB File system. This takes some time, sometimes up to a minute depening on the machine, be patient.

Download big.c into your xv6 directory, add them to the UPROGS list (same as you did for ps), start up xv6, and run big. It writes 4 files: 16MB, 32MB, 48MB, and 64MB, and then reads them --- this will help you to test your solution. Big writes 1mb at a time, which is significantly faster (and yet it takes quite a bit of time).

What to Look At

Note
Since you're using a triple indirect systemyou will have to add more defines to fs.h.

The code that finds a file's data on disk is in bmap() in fs.c. Have a look at it and make sure you understand what it's doing. bmap() is called both when reading and writing a file. When writing, bmap() allocates new blocks as needed to hold file content, as well as allocating an indirect block if needed to hold block addresses.

bmap() deals with two kinds of block numbers. The bn argument is a "logical block" -- a block number relative to the start of the file. The block numbers in ip->addrs[], and the argument to bread(), are disk block numbers. You can view bmap() as mapping a file's logical block numbers into disk block numbers.

Your Job

One thing to remember is each entry with a ... points to a table as well, I just couldn't make the graph in ascii like that.
Another way to think about it is a 3D array table[128][128][128].

After you have modified bmap() you need to go into mkfs.c and modify iappend(). This c file gets run during the make process and builds the file system image which gets run with QEMU. The code and logic will be extremely similar to the code you had written in bmap().

You will have to implement these changes, these changes allow for you to write large files into the xv6 filesystem.

You don't have to modify xv6 to handle deletion of files with triple-indirect blocks.

Hints

Make sure you understand bmap(). Write out a diagram of the relationships between ip->addrs[], the direct blocks, the triple-indirect blocks. Make sure you understand why adding a triple-indirect block increases the maximum file size by such a large factor.

Think about how you'll index the triple-indirect block, and the double blocks and single blocks it points to, with the logical block number.

If your file system gets into a bad state, perhaps by crashing, you may need to delete fs.img (do this from Unix, not xv6).

You should allocate indirect blocks and doubly-indirect blocks only as needed, like the original bmap().

Extra Credit 1 (10%)
Writing a 1gb file to disk in xv6 is incredibly slow. It gets slower the fuller the disk gets(hint). There are two ways to speed up writing a file to disk. In extra_credit.txt explain why when the disk gets fuller writing gets slower-- Explain how you can mitigate this problem. Second, explain the other way to speed up writes and why that works (hint log.c).

Extra Credit 2 (20%)
Implement these changes, in xv6, resolving all panics and failed assertions that arise. Do a crappy timing experiment using a stop watch. Explain how much speed up occured.

Submit

Submit your answers to the Class Dropbox (143A HW 4: Big files) as a compressed tar file of your xv6 source tree, and as a separate PDF file with answers for questions and challenges. You can use the following command to create a compressed tar file

vagrant@odin$ cd /vagrant/ics143a
vagrant@odin$ tar -czvf hw4.tgz xv6-public

Before making the tar file do

make clean

then remove any large test files that you have created.


Home
Homework 4: Bigger files for xv6 In this assignment you'll increase the maximum size of an xv6 file. Currently xv6 files are limited to 140 sectors, or 71,680 bytes. This limit comes from the fact that an xv6 inode contains 12 "direct" block numbers and one "singly-indirect" block number, which refers to a block that holds up to 128 more block numbers, for a total of 12+128=140. You'll change the xv6 file system code to support a "triply-indirect" block in each inode, containing 128 addresses of doubly-indirect blocks, each of which can contain up to 16384 addresses of data blocks. Question 1 Using the triple indirect system what is the maximum number of blocks one inode can address? This corresponds to how many bytes on disk? Question 2 If we were to implement a doubly-indirect system instead of a triple-indirect system what would be the maximum number of blocks one inode could address? This corresponds to how many bytes on disk? Preliminaries Modify your Makefile's `CPUS` definition so that it reads: CPUS := 1 And add `-snapshot` to the definition of `QEMUOPTS`: QEMUOPTS = -drive file=fs.img,index=1,media=disk,format=raw -drive file=xv6.img,index=0,media=disk,format=raw -smp $(CPUS) -m 512 $(QEMUEXTRA) -snapshot The above two steps speed up qemu tremendously when xv6 creates large files. `mkfs` initializes the file system to have fewer than 1000 free data blocks, too few to show off the changes you'll make. Modify the code at the start of `param.h` to read (thats 128128128 blocks or 1GB) #define FSSIZE 2097152 You don't really need 1GB since we only be writing 160MB of data, so you can play with this number if generating a file system take too long. Note: During make mkfs.c has to build a 1GB File system. This takes some time, sometimes up to a minute depening on the machine, be patient. Download big.c into your xv6 directory, add them to the UPROGS list (same as you did for ps), start up xv6, and run `big`. It writes 4 files: 16MB, 32MB, 48MB, and 64MB, and then reads them --- this will help you to test your solution. Big writes 1mb at a time, which is significantly faster (and yet it takes quite a bit of time). What to Look At The format of an on-disk inode is defined by `struct dinode` in `fs.h`. You're particularly interested in `NDIRECT`, `NINDIRECT`, `MAXFILE`, and the `addrs[]` element of `struct dinode`. Look here for a diagram of the standard xv6 inode. Note Since you're using a triple indirect systemyou will have to add more defines to fs.h. The code that finds a file's data on disk is in `bmap()` in `fs.c`. Have a look at it and make sure you understand what it's doing. `bmap()` is called both when reading and writing a file. When writing, `bmap()` allocates new blocks as needed to hold file content, as well as allocating an indirect block if needed to hold block addresses. `bmap()` deals with two kinds of block numbers. The `bn` argument is a "logical block" -- a block number relative to the start of the file. The block numbers in `ip->addrs[]`, and the argument to `bread()`, are disk block numbers. You can view `bmap()` as mapping a file's logical block numbers into disk block numbers. Your Job Modify bmap() so that it implements a triple-indirect system. The first 12 elements of ip->addrs[] should be direct blocks; the 13th should be a triple-indirect block system. A ascii examble below: struct inode _________________ \| \| \| type \| ----------------- \| \| \| major \| ----------------- \| .... \| ----------------- \| \| \| size \| ----------------- \| addrs[0] \| -> Physical ----------------- \| addrs[1] \| -> Physical ----------------- \| addrs[2] \| -> Physical ----------------- \| addrs[3] \| -> Physical ----------------- \| .... \| ----------------- \| adrs[12] \| -> _________________ ----------------- \| tbl1[0] \| -> _________________ ----------------- \| tbl2[0] \| -> _______________ \| ..... \| ----------------- \| tbl3[0] \| -> Physical ----------------- \| ..... \| --------------- \| ..... \| ----------------- \| ....... \| ----------------- \| ........ \| --------------- \| ..... \| ----------------- \| tbl3[127] \| -> Physical ----------------- \| ..... \| --------------- \| ..... \| ----------------- ----------------- \| tbl2[127] \| \| ..... \| ----------------- ----------------- \| ..... \| ----------------- \| ..... \| ----------------- \| ..... \| ----------------- \| ..... \| ----------------- \| ..... \| ----------------- \| tbl1[127] \| ->_________________ ----------------- \| tbl2[0] \| -> _______________ ----------------- \| tbl3[0] \| -> Physical \| ..... \| --------------- ----------------- \| ....... \| \| ........ \| --------------- ----------------- \| tbl3[127] \| -> Physical \| ..... \| --------------- ----------------- \| tbl2[127] \| ----------------- One thing to remember is each entry with a ... points to a table as well, I just couldn't make the graph in ascii like that. Another way to think about it is a 3D array `table[128][128][128]`. After you have modified `bmap()` you need to go into `mkfs.c` and modify `iappend()`. This c file gets run during the make process and builds the file system image which gets run with QEMU. The code and logic will be extremely similar to the code you had written in bmap(). You will have to implement these changes, these changes allow for you to write large files into the xv6 filesystem. You don't have to modify xv6 to handle deletion of files with triple-indirect blocks. Hints Make sure you understand `bmap()`. Write out a diagram of the relationships between `ip->addrs[]`, the direct blocks, the triple-indirect blocks. Make sure you understand why adding a triple-indirect block increases the maximum file size by such a large factor. Think about how you'll index the triple-indirect block, and the double blocks and single blocks it points to, with the logical block number. If your file system gets into a bad state, perhaps by crashing, you may need to delete `fs.img` (do this from Unix, not xv6). Don't forget to `brelse()` each block that you `bread()`. You should allocate indirect blocks and doubly-indirect blocks only as needed, like the original `bmap()`. The TA's solution was done in around 70 lines of code Extra Credit 1 (10%) Writing a 1gb file to disk in xv6 is incredibly slow. It gets slower the fuller the disk gets(hint). There are two ways to speed up writing a file to disk. In extra_credit.txt explain why when the disk gets fuller writing gets slower-- Explain how you can mitigate this problem. Second, explain the other way to speed up writes and why that works (hint log.c). Extra Credit 2 (20%) Implement these changes, in xv6, resolving all panics and failed assertions that arise. Do a crappy timing experiment using a stop watch. Explain how much speed up occured. Submit Submit your answers to the Class Dropbox (143A HW 4: Big files) as a compressed tar file of your xv6 source tree, and as a separate PDF file with answers for questions and challenges. You can use the following command to create a compressed tar file vagrant@odin$ cd /vagrant/ics143a vagrant@odin$ tar -czvf hw4.tgz xv6-public Before making the tar file do make clean then remove any large test files that you have created.

Updated: March, 2017

Homework 4: Bigger files for xv6

Preliminaries

What to Look At

Your Job

Hints