Separating the operating system from data on a drive is important for failure isolation.

We will begin by motivating the need for OS and data separation. Consider the following situation: Bob has a single disk in his laptop with a single partition containing a Windows installation. One day Bob is no longer able to boot into this partition; let's say that he's stuck on the loading screen. Now, Bob has the option of reformatting the partition or restoring a backup. Both options may cause data loss. Bob could have easily prevented the data loss if he had simply kept his operating system and data on separate partitions! Of course, I have been Bob a few times, so I decided to solve the problem once and for all.

In this post, we will cover my grid method for designing partition schemes for dual-booted Linux and Windows. This partition scheme is not for everyone; in particular, this is intended for programmers on laptops that are limited to a single drive that must be easily managed and vertically scaled.

Partition Scheme Objectives

Here are the criteria that we would like to optimize with our dual boot partition scheme:

Transparency
While operating in an OS, the OS is able to "read" as many partitions as possible. We will see soon that a partition can be "hidden" from an OS depending on its filesystem.
Modularity
The partitioning scheme should have fixed-size blocks aligned to byte boundaries so that it is intuitive and easy to dynamically resize.
Scalability
The partitioning scheme should scale with the drive size, e.g. 128 GiB, 256 GiB, and 512 GiB disks should all start from the same initial scheme.

A partitioning scheme with these properties makes particular decisions easy: how much to resize partitions by and when to purchase a larger disk. We begin the design of this partition scheme by first making note of the partitions we want and then diving into the details such as the size of each partition.

Naive Dual Boot Partition Schemes

A system that dual-boots Linux and Windows needs at least two partitions: one for each operating system. We can visualize this as follows:

There are two problems associated with this naive partition scheme. We will evolve the scheme to address each of them.

Separate Data Repository Partition

Linux and Windows use different default filesystems. This leads to the problem that Windows cannot natively mount the Linux partition!

Both Windows and Linux can read NTFS-formatted partitions, but only Linux can natively read Ext4-formatted partitions.

In order to resolve this problem, we separate the data from each operating system and store it in its own partition. For now, we call the data partition Repository.

Note that for the problem to be solved, the Repository partition must use a filesystem that both Windows and Linux can read, which is NTFS.

Separate Linux and Windows Data Repository Partitions

What happens when we try to store binaries such as .exe files in our Repository partition? Only Windows can execute .exe files, so what is the point of having them visible to Linux? Likewise, should Linux dotfiles be visible to Windows? Only Linux uses dotfiles. We handle these problems by simply separating the data repositories between Linux and Windows.

This seems like a good partition scheme so far, but if we simply went forward with it, we might naively assign an arbitrary size to each partition. Instead, we will use the grid method, which protects us from inaccurate expectations about how our data will be distributed.

Partition Scheme Grid Method

To simplify the size of each partition, we use a grid of fixed-size blocks relative to the size of the disk, $N$. Since the number of partitions is usually small, we somewhat arbitrarily let the grid have 8 cells. Furthermore, since disk sizes are approximately powers of two, division by eight yields nice integer values. For example, a 512 GiB disk can be easily partitioned into blocks of size 64 GiB.

We can thus visualize our disk as a set of eight blocks of size $\frac{N}{8}$:

Certainly, the block size can be configured to give greater granularity. However, I recommend keeping the number of blocks a power of two so that integer block sizes can be used.
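As a minimal sketch of the block arithmetic, the following Python snippet computes the block size for a few disk sizes. The function name `block_size` and its `cells` parameter are illustrative choices of mine, and the disk sizes are idealized powers of two in GiB.

```python
def block_size(disk_size: int, cells: int = 8) -> int:
    """Return the size of one grid block, in the same unit as disk_size."""
    # A power-of-two cell count keeps block sizes integral for
    # power-of-two disk sizes.
    if cells <= 0 or cells & (cells - 1):
        raise ValueError("cells should be a positive power of two")
    if disk_size % cells:
        raise ValueError("disk size is not evenly divisible by the cell count")
    return disk_size // cells


if __name__ == "__main__":
    for disk_gib in (128, 256, 512):
        print(f"{disk_gib} GiB disk: 8 blocks of {block_size(disk_gib)} GiB")
```

Passing `cells=16` instead would halve the block size while still giving integer values for these disk sizes.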

Given blocks of size $\frac{N}{8}$, our abstract partition scheme is now modular. We can arbitrarily allocate a set of blocks to each partition as necessary. Note, however, that we reserve a partition for swap space and for absorbing the "missing" drive space, which is explained by the difference between the binary and decimal definitions of a gigabyte.
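To make the "missing" drive space concrete, here is a small sketch of the unit conversion. It assumes the drive is advertised in decimal gigabytes (1 GB = $10^9$ bytes) while the OS reports binary gibibytes (1 GiB = $2^{30}$ bytes); the helper names are hypothetical.

```python
GB = 10**9    # decimal gigabyte, as commonly used on drive labels
GIB = 2**30   # binary gibibyte, as commonly reported by the OS


def advertised_gb_to_gib(size_gb: float) -> float:
    """Convert an advertised decimal capacity to GiB."""
    return size_gb * GB / GIB


def shortfall_gib(size_gb: float) -> float:
    """GiB that appear to be 'missing' if the label is read as GiB."""
    return size_gb - advertised_gb_to_gib(size_gb)


if __name__ == "__main__":
    for label in (128, 256, 512):
        print(
            f"{label} GB label: {advertised_gb_to_gib(label):.2f} GiB usable, "
            f"{shortfall_gib(label):.2f} GiB short of {label} GiB"
        )
```

Under these assumptions a drive labeled 512 GB holds roughly 476.84 GiB, so about 35 GiB of an idealized 512 GiB grid is not actually there. Letting the reserved partition absorb that difference keeps the remaining blocks at their nominal size.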

After some configuration, we decide that the following partition scheme is appropriate given how we intend to distribute our data:

Given this configuration, the size distribution is as follows:

Thus we have constructed our final partition scheme. When a partition requires more space, resize accordingly. When the disk generally requires more space, purchase a larger disk, recreate the partition scheme using the grid method, and transfer the data.
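The mapping from block counts to partition sizes can be sketched as follows. The block counts in `EXAMPLE_ALLOCATION` are placeholders I made up for illustration, not the allocation chosen in this post, and `partition_sizes` is a hypothetical helper.

```python
# Hypothetical block counts per partition; they must add up to the cell count.
EXAMPLE_ALLOCATION = {
    "Linux OS": 1,
    "Linux Repository": 2,
    "Windows OS": 2,
    "Windows Repository": 2,
    "Swap and reserve": 1,
}


def partition_sizes(disk_gib: int, allocation: dict, cells: int = 8) -> dict:
    """Scale a block allocation to concrete partition sizes in GiB."""
    if sum(allocation.values()) != cells:
        raise ValueError("allocation must use every cell exactly once")
    if disk_gib % cells:
        raise ValueError("disk size is not evenly divisible by the cell count")
    block = disk_gib // cells
    return {name: count * block for name, count in allocation.items()}


if __name__ == "__main__":
    # The same allocation is reused across disk sizes; only the block size changes.
    for disk_gib in (256, 512):
        print(f"{disk_gib} GiB disk:")
        for name, size in partition_sizes(disk_gib, EXAMPLE_ALLOCATION).items():
            print(f"  {name}: {size} GiB")
```

Because the allocation is expressed in blocks rather than gigabytes, moving to a larger disk only changes the block size, which is the scalability property we asked for earlier.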

Conclusion

The purpose of the grid method is to simplify all notions of prior distributions of data across partitions while using nice, byte-aligned boundaries. When someone has a new drive, there is usually some prior notion of how they intend to distribute data across its partitions; however, the precise boundaries are unknown. Using byte-aligned boundaries keeps the partition sizes nice and tidy while providing enough flexibility to accommodate varying notions of prior distributions of data.

In terms of internal fragmentation, the effectiveness of the partition scheme depends on how accurately we predicted the distribution of data across partitions. That is, if we find ourselves needing to resize partitions rather than purchase a new disk, then the problem is either an inaccurate notion of the prior distribution or the block size granularity. Less granularity protects us from inaccurate prior distributions, which reinforces my argument for having block sizes of $\frac{N}{8}$.

Although this partition scheme satisfies my needs, each person seems to have their own general approach. Please share your own interesting partition schemes.