4 Replies. Latest reply: Oct 16, 2012 3:19 PM by Avi Miller-Oracle

    Trying to understand how BTRFS works

    Catch~22
      Hello,

      I have read most, if not all, of the articles, presentations and podcasts about Btrfs at https://btrfs.wiki.kernel.org/index.php/Main_Page and other sites. It is all very nice, but I had to read some of it a couple of times to more or less understand it, and I'm not sure I really do.

      I'm trying to explain Btrfs in my own words, but I wonder whether my explanation is correct. What about the following:

      The storage space of a Btrfs file system consists of file data blocks and metadata blocks. The Btrfs metadata is organized as a B-tree and describes the version and location of data on disk. Btrfs uses a copy-on-write (CoW) storage strategy, so an existing file is not overwritten in place. Instead, when a file is modified, data blocks are copied, modified and written elsewhere on disk, as directed by the metadata, so that data required by snapshots is not overwritten. When a file under Btrfs is modified, data is not necessarily overwritten. Modified data blocks can be written anywhere on disk, as determined by the metadata and the need to preserve existing snapshots. A snapshot initially consumes no additional storage space; it starts to use space only as data blocks are modified.


      Is this correct? And if not, where am I wrong please?

      Thanks!

      Edited by: Dude on Oct 16, 2012 10:38 AM
        • 1. Re: Trying to understand how BTRFS works
          bobthesungeek76036
          I don't know anything about Btrfs, but I believe it has its roots in ZFS. Your summary sounds very good. The only thing I would change is the description of how copy-on-write works. My understanding is that data blocks are not really "copied". Rather, when modified data blocks are written, they are written to a new location and the metadata is updated at the time of the write. The old data is never "copied" to the new location first, which is what your description sounds like to me.
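
          To make the write-to-a-new-location idea concrete, here is a minimal toy sketch in Python. It is not how Btrfs is actually implemented: the class `ToyCowStore`, its dictionaries and its method names are all made up for illustration. It only shows that a modified block lands at a fresh location, that the metadata is repointed at write time, and that a snapshot (modelled here as a plain copy of the metadata) keeps seeing the old block without using extra data blocks up front.

```python
class ToyCowStore:
    """Toy model of copy-on-write block storage (not the real Btrfs on-disk format)."""

    def __init__(self):
        self.disk = {}        # physical location -> block contents
        self.next_free = 0    # next unused physical location
        self.file_map = {}    # metadata: logical block number -> physical location

    def write_block(self, logical, data):
        # Write the new contents to a fresh location; never overwrite in place.
        location = self.next_free
        self.next_free += 1
        self.disk[location] = data
        # Repoint the metadata at the new location as part of the write.
        self.file_map[logical] = location
        return location

    def snapshot(self):
        # In this toy model a snapshot is only a copy of the metadata;
        # no data blocks are duplicated.
        return dict(self.file_map)

    def read_block(self, logical, file_map=None):
        # Read through the live metadata, or through a snapshot's metadata.
        mapping = self.file_map if file_map is None else file_map
        return self.disk[mapping[logical]]


store = ToyCowStore()
store.write_block(0, b"teh quick brown fox")
snap = store.snapshot()
blocks_before = len(store.disk)          # data blocks in use right after the snapshot
store.write_block(0, b"the quick brown fox")
blocks_after = len(store.disk)           # one more block, used by the modified data
```

          Reading block 0 through `snap` still returns the old contents, while the live file sees the new ones; the extra space is only consumed once the block is modified.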
          • 2. Re: Trying to understand how BTRFS works
            Catch~22
            Thanks for the feedback. Actually, that's one of the parts I'm not sure I understand correctly. It is my understanding that Btrfs works at the data block level, which I think is the smallest amount of data that can be allocated, similar to Oracle Database. So in order to modify data, the whole block needs to be written. For example, let's say I change the word "teh" to "the" in a file. Doesn't it have to read the whole block, modify the content, and check the metadata before writing the complete block?

            But reading it again, I think you are right. It can be misunderstood. How about the revised wording (I have edited my previous post)?
            • 3. Re: Trying to understand how BTRFS works
              Avi Miller-Oracle
              bobthesungeek76036 wrote:
              I don't know anything about Btrfs, but I believe it has its roots in ZFS.
              No, it doesn't. They have similarities, but there is no shared history between Btrfs and ZFS. The filesystems achieve similar goals, though on different platforms.
              My understanding is that data blocks are not really "copied". Rather, when modified data blocks are written, they are written to a new location and the metadata is updated at time of write.
              Correct.
              • 4. Re: Trying to understand how BTRFS works
                Avi Miller-Oracle
                Dude wrote:
                Thanks for the feedback. Actually, that's one of the parts I'm not sure I understand correctly. It is my understanding that Btrfs works at the data block level, which I think is the smallest amount of data that can be allocated, similar to Oracle Database. So in order to modify data, the whole block needs to be written. For example, let's say I change the word "teh" to "the" in a file. Doesn't it have to read the whole block, modify the content, and check the metadata before writing the complete block?
                Yes. But it does tight packing, so it probably wouldn't write an entire block out for "the" -- it would just squash it into one of the leaves of the b-trees.
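
                As a rough illustration of the two points above, the following Python sketch shows a block-level read-modify-write for the "teh" to "the" example, plus a toy placement rule for very small files. Everything here is an assumption made for the example: `BLOCK_SIZE`, `INLINE_LIMIT`, `patch_block` and `placement` are hypothetical names and values, not Btrfs constants or APIs, and the real packing logic is more involved.

```python
BLOCK_SIZE = 4096     # assumed block size for this sketch
INLINE_LIMIT = 2048   # hypothetical cut-off, not a real Btrfs constant


def patch_block(block: bytes, offset: int, new_bytes: bytes) -> bytes:
    """Read-modify-write: return a new block with new_bytes placed at offset."""
    if offset < 0 or offset + len(new_bytes) > len(block):
        raise ValueError("patch does not fit inside the block")
    buffer = bytearray(block)                         # read the whole block
    buffer[offset:offset + len(new_bytes)] = new_bytes  # modify it in memory
    return bytes(buffer)                              # complete block, ready to write


def placement(file_size: int) -> str:
    """Toy rule: tiny files are packed into a metadata leaf, larger ones get data blocks."""
    if file_size <= INLINE_LIMIT:
        return "inline in metadata leaf"
    return "separate data block"


# A block holding a short text, padded with zero bytes to the block size.
old_block = b"fix teh typo".ljust(BLOCK_SIZE, b"\0")
new_block = patch_block(old_block, old_block.index(b"teh"), b"the")
```

                The old block is left untouched and a complete new block is produced, which a copy-on-write filesystem would then write to a new location. For a file as small as the twelve bytes in this example, `placement` picks the metadata leaf, which is the kind of tight packing described above.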