Sunday, December 21, 2003
Filesystem Brain Damage for FoxFS
Copy-on-Write File Size Reduction
So, I was thinking about filesystems, and compression, and fork() implementations, and I suddenly came up with the idea of Copy-on-Write File Size Reduction (CWrFSzR, "CWeer Seizure"). The basic concept is that of copy-on-write: two independent pieces of data just happen to be identical, and thus can occupy the same space. When one changes independently of the other, the other must remain unchanged. The solution is to copy the data before changing it, producing two separate copies.
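The copy-on-write idea above can be sketched in a few lines (a minimal illustration of the semantics, not FoxFS's actual on-disk format; the `Block`/`share`/`write` names are mine):

```python
class Block:
    """A data block shared by reference-counted links."""
    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.refs = 1

def share(block: Block) -> Block:
    """'Copy' a file by pointing a second link at the same block."""
    block.refs += 1
    return block

def write(block: Block, offset: int, data: bytes) -> Block:
    """Copy-on-write: if the block is shared, duplicate it before modifying."""
    if block.refs > 1:
        block.refs -= 1                    # detach from the shared copy
        block = Block(bytes(block.data))   # private duplicate, refs == 1
    block.data[offset:offset + len(data)] = data
    return block

# Two identical "files" occupy one block until one of them is modified.
a = Block(b"hello world")
b = share(a)                 # like cp: no new data written yet
b = write(b, 0, b"HELLO")    # first write triggers the copy
assert bytes(a.data) == b"hello world" and a.refs == 1
assert bytes(b.data) == b"HELLO world" and b is not a
```

The point of the example is the order of operations: the copy happens lazily, only at the first diverging write, so identical data costs its size once.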
The concept could only be applied efficiently to a filesystem with an active background scanner: one that runs a heuristic to identify identical segments of files, fragments the files around those segments, then combines the matching fragments into one and frees the other. CWr fragments would be marked for Copy-on-Write, so encountering a Copy-on-Write fragment during a write to a file (delete, overwrite, truncate) would be doable without any driver enhancements (other than those that allow the driver to recognize and act on the marking). Exact duplicates of files (those made with cp) would effectively (though not literally) be hard links (a hard link is a directory entry that points at an inode) to the original data (not the original inode) until one of the copies was altered.
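The scanner's duplicate-finding pass could work by hashing fixed-size segments and grouping matches (a hedged sketch of one such heuristic; the 4 KiB segment size, SHA-256, and the function name are my assumptions, not anything FoxFS specifies):

```python
import hashlib
from collections import defaultdict

SEGMENT = 4096  # assumed fragment granularity

def find_duplicate_segments(files: dict) -> list:
    """Return (digest, [(name, offset), ...]) groups of identical segments."""
    seen = defaultdict(list)
    for name, data in files.items():
        for off in range(0, len(data), SEGMENT):
            seg = data[off:off + SEGMENT]
            seen[hashlib.sha256(seg).hexdigest()].append((name, off))
    # Only groups with more than one member are dedup candidates; a real
    # scanner would byte-compare members to rule out hash collisions before
    # merging the fragments and freeing the spares.
    return [(h, locs) for h, locs in seen.items() if len(locs) > 1]

files = {"a": b"x" * 4096 + b"tail-a", "b": b"x" * 4096 + b"tail-b"}
dups = find_duplicate_segments(files)
assert len(dups) == 1                      # the shared 4 KiB prefix
assert sorted(n for n, _ in dups[0][1]) == ["a", "b"]
```

Running in the background keeps the expensive hashing off the write path; the driver itself only ever sees fragments that are already marked CWr.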
I'll mull over how to do this safely later. It'll sit beneath compression, though, and act on the original data. I think I'll inherit the compression rules by compressing a shared segment if it would normally be compressed by any of the inodes it belongs to. The count of how many CWr links point at a segment will need to be part of the file data itself if CWr acts directly on the data (the fast way, but with an expensive worst-case scenario). If CWr instead points at another inode, the best/worst-case scenarios get pretty bad (it'd be like seeking to multiple places in many files, rather than in one), so I think I'll try to find an efficient way to keep the CWr bookkeeping inline with the actual file data, and then write it into the specs.
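The "fast way" of keeping the link count inline with the file data could look roughly like this (the header layout here, a magic tag plus a 16-bit count, is purely my invention for illustration; FoxFS would define its own):

```python
import struct

HEADER = struct.Struct("<4sH")   # 4-byte magic + 16-bit CWr link count
MAGIC = b"CWRF"

def pack_fragment(payload: bytes, links: int = 1) -> bytes:
    """A fragment carries its own link count, so the driver reads it for free."""
    return HEADER.pack(MAGIC, links) + payload

def add_link(frag: bytes) -> bytes:
    """Another inode now shares this fragment: bump the inline count."""
    magic, links = HEADER.unpack_from(frag)
    assert magic == MAGIC
    return HEADER.pack(MAGIC, links + 1) + frag[HEADER.size:]

def break_link(frag: bytes) -> tuple:
    """On write: if shared, return (decremented original, private copy);
    if this is the sole owner, no copy is needed (the expensive case is
    the shared one, since it doubles the data)."""
    magic, links = HEADER.unpack_from(frag)
    assert magic == MAGIC
    payload = frag[HEADER.size:]
    if links > 1:
        return HEADER.pack(MAGIC, links - 1) + payload, pack_fragment(payload)
    return None, frag
```

The upside of the inline count is that checking it costs nothing extra during a write the driver is already doing; the downside, as noted above, is that breaking a heavily shared fragment means copying the whole payload at write time.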
What I may do is turn the filesystem into an object-oriented filesystem, which will give me so much more flexibility at the cost of a little overhead. Of course, at scales of 1T+, we're talking gigabytes of potential overhead (4 billion files at 1 byte of overhead each is 4 gigabytes of space lost; and I'm going for >32-bit inode numbers natively), but I hope I can cancel it out.