Indexed and Extensible Files

The basic file system allocates files as a single extent, making it vulnerable to external fragmentation, that is, it is possible that an n-block file cannot be allocated even though n blocks are free (i.e. external fragmentation). Eliminate this problem by modifying the on-disk inode structure.

기본 파일 시스템은 파일들을 단일 면적에 할당하기 때문에 외부 단편화에 취약합니다. 즉, n개 블록이 비어 있는데도 불구하고 (외부 단편화가 되어 서로 떨어져 있다면) n개 길이의 블록이 할당될 수 없다는 말입니다.

</aside>

In practice, this probably means using an index structure with direct, indirect, and doubly indirect blocks. In previous semesters, most students adopted something like what you have learned as Berkeley UNIX FFS with multi-level indexing. However, to make your life easier, we make you implement it in an easier way: FAT. You must implement FAT with given skeleton code. Your code MUST NOT contain any code for multi-level indexing (FFS in the lecture). You will receive 0pts for file growth parts.

</aside>

NOTE: You can assume that the file system partition will not be larger than 8 MB. You must support files as large as the partition (minus metadata). Each inode is stored in one disk sector, limiting the number of block pointers that it can contain.

알아두세요: 당신은 파일 시스템 파티션이 8MB보다 크지 않을 것이라고 가정해도 좋습니다. 당신의 구현은 메타데이터를 제외한 파티션 크기만큼의 큰 파일들을 지원해야 합니다. (FILESIZE) ≤ (PARTITION SIZE) - (METADATA) 각 inode는 하나의 디스크 섹터에 저장됩니다. 그 섹터가 담을 수 있는 만큼으로 블록 포인터의 수가 제한됩니다.

</aside>

Indexing large files with FAT (File Allocation Table)

Warning: this document assumes you have already understood the basic principle of general filesystem and FAT from the lecture. If not, please go back to the lecture note and understand what is filesystem and FAT.

<aside> 👉🏻

🚨 경고 : 이 문서는 여러분이 강의를 통해 일반적인 파일 시스템과 FAT의 기본 이론을 이해했다는 가정하에 작성되었습니다. 해당 내용에 대한 이해가 없다면, 강의 노트로 돌아가서 파일 시스템과 FAT가 무엇인지 이해하십시오.

~~하지만 우린 강의 노트가 없는걸요?!~~

</aside>

In the basic filesystem that you have used for the previous projects, a file was stored as a contiguous single chunk over multiple disk sectors. Let's call contiguous chunk as cluster, since a cluster (chunk) can contain one or more contiguous disk sectors. In this point of view, the size of a cluster in the basic file system was equal to the size of a file that stored in the cluster.

**자꾸 엿보지 마세요. 불🔥쾌❄️해☀️요. ←(근데 이거는 준호가 썼어요) 왜 쓸데없이 고퀄이죠? 노션의 동기화 능력은 정말 데다내!!!

이 문장들을 신경쓰지 마세요. 단지 장난을 좋아하는 동기들의 짓궂은 코멘트일 뿐이니까요.**

</aside>

To mitigate external fragmentation, we can shrink the size of cluster (recall the page size in virtual memory). For simplicity, in our skeleton code, we fixed number of sectors in a cluster as 1. When we use smaller clusters like it, a cluster might not enough to store the entire file. In this case, we need multiple clusters for a file, so we need a data structure to index the clusters for a file in the inode. One of the easiest way is to use linked-list (a.k.a chain). An inode can contain the sector number of the first block of the file, and the first block may contain the sector number of the second block. This naïve approach, however, is too slow because we have to read every block of the file, even though what we really need was only the last block. To overcome this, FAT (File Allocation Table) puts the connectivity of blocks in a fixed-size File Allocation Table rather than the blocks themselves. Since FAT only contains the connectivity value rather than the actual data, its size is small enough to be cached in DRAM. As a result, we can just read corresponding entries in the table.

<aside> <img src="https://s3-us-west-2.amazonaws.com/secure.notion-static.com/8c5d0660-8016-49f8-947b-05d691fd4c39/moon_3.png" alt="https://s3-us-west-2.amazonaws.com/secure.notion-static.com/8c5d0660-8016-49f8-947b-05d691fd4c39/moon_3.png" width="40px" /> 외부 단편화를 경감(👮‍♀️←이 경감 아님)시키기 위해, 우리는 클러스터의 사이즈를 줄일 수 있습니다 (가상 메모리에서의 페이지 사이즈를 떠올려보세요). 단순화를 위해, 우리의 스켈레톤 코드에 있는 클러스터 내의 섹터 개수를 1로 수정했습니다. 이런 작은 클러스터를 쓰면, 전체 파일을 담기에는 클러스터의 크기가 충분치 않을 수 있습니다. 이 경우, 우리는 한 파일을 위한 여러 개의 클러스터를 필요로 하며, 그렇기 때문에 inode에 있는 파일을 위한 클러스터를 인덱싱하기 위한 자료구조가 필요합니다. 가장 쉬운 방법은 링크드 리스트 (체인이라고도 불리는)를 사용하는 것입니다. inode는 파일의 첫번째 블록의 섹터 넘버를 포함할 수 있고, 첫 번째 블록은 두 번째 블록의 섹터 넘버를 포함할 수 있습니다. 이런 순진한 접근 따위로는, 너무 느려 (おそい… << ~~퍽퍼퍽~~) .** 왜냐하면 만약 우리에게 필요한 블록이 마지막 블록 뿐일지라도, 파일의 모든 블록을 읽어야 하기 때문입니다. 이것을 극복하기 위해, FAT (File Allocation Table, 파일 할당 테이블)을 사용하는데, 이렇게 하면 각 블록들이 자신의 구조 안에 연결정보를 담는 대신에 고정 크기의 FAT에 블록들의 연결정보를 저장하게 됩니다. FAT이 실제 데이터가 아닌 연결정보 값만 담고 있기 때문에, DRAM에 캐시될 수 있을 만큼 충분히 작은 크기를 가지게 됩니다. 이렇게 해서, 우리는 테이블에서 상응하는 엔트리만 읽으면 되게 되었습니다.

</aside>

You will implement inode indexing with the skeleton code provided in filesys/fat.c. This section briefly describes what already-implemented functions in fat.c do and what you should implement.

</aside>

First of all, 6 functions in fat.c (i.e. fat_init(), fat_open(), fat_close(), fat_create(), and fat_boot_create()) initialize and format disks at the boot time, so you don't have to modify them. However, you'll need to write fat_fs_init(), and briefly understanding what they do could be helpful.

</aside>

cluster_t fat_fs_init (void);