The Dangers of Cloning

Some time back I added an SSD (Solid-State Drive, a storage drive with no moving parts) to my main computer. SSDs have major advantages and only minor disadvantages. In doing so, I chose to move my root filesystem to it, and rather than do a fresh install, I simply copied the data from my old root.

This created an interesting situation. For the purpose of the discussion to follow I’ll refer to the old root as Foo and the new root as Bar. Foo and Bar had the exact same UUIDs (Universally Unique IDentifiers). That was my mistake, and I should’ve known better than to think running with duplicated FS UUIDs for months wouldn’t ever cause problems.

I did recognize it as a minor problem, but mostly I figured if the system treated them consistently, I’d know if Foo was being used when it shouldn’t be.  That was a bad assumption on my part.  One could figure out which one was root, but only by close inspection.
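For what that close inspection might look like: a minimal Python sketch that reads /proc/mounts-style text and reports which device is mounted at "/". The function name root_device and the device names in the usage note are my own illustrations, not anything from the original setup, and on some systems the kernel reports a generic name such as /dev/root here, so treat the result as a hint rather than proof.

```python
def root_device(mounts_text):
    """Return the device mounted at "/" in /proc/mounts-style text, or None.

    Each line looks like: <device> <mount point> <fs type> <options> <dump> <pass>
    """
    device = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "/":
            # Keep the last match: a later entry for "/" (the real root)
            # overrides an earlier placeholder such as "rootfs".
            device = fields[0]
    return device
```

On a running Linux system, something like root_device(open("/proc/mounts").read()) would return a name along the lines of /dev/sdb1 (hypothetical), which names the actual partition rather than the shared UUID.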

Edit: Okay, PEBCAK in this case, no weird behavior. Apparently I had installed the different fstab on Foo, which means I had it backwards: when the boot didn’t have the mounts, it was actually on the correct partition. Needless to say, the overall lesson remains unchanged: change the clone’s UUID as soon as it’s cloned.

I’m still not sure what actually occurred. My best guess is some kind of quantum madness, where different parts of the environment operated on each of the two partitions in different orders.

The main thing was that Foo had a different fstab, so it seemed reasonable that if the proper mounts occurred upon boot, then Bar must have been mounted as root. This was seemingly confirmed by a few instances where I booted to find the Bar mounts not present. It seems that wasn’t exactly the case.

Today I booted, only to find that gdm3 didn’t execute. I logged into the virtual terminal only to see that /home hadn’t mounted either. Checked the fstab, and found it to be Foo’s. Curious. Finally, checked dmesg, and found that there had been a crash loading a kernel module, but it hadn’t taken the kernel or much else with it.

So I rebooted, watched the grub2 menu closely and saw that it included two entries for the same kernel, one from whatever the local kernel might’ve been and another from Foo’s partition. Dawns on me that this means both have been updating, so more boots than I thought were using Foo instead of Bar.

Decide it’s time to delete Foo entirely. Reboot.

grub2 goes to rescue mode. Ouch. For those unfamiliar, rescue mode is a minimal environment. It always reminds me of the State of Nature or maybe the part in the creation myth in Genesis before the gods created very much. Thankfully the bootloader is called grub and not grue, or I might have been eaten in that dark, dank rescue mode.

Nudge grub2 along by manually telling it to use Bar’s grub2 config. It works, phew.

grub2’s menu shows I’m using an old kernel, circa when I installed the SSD.

Finish booting, aptitude shows 700-odd packages to update.

So I’d been booting with Foo as root the whole time. But it was using Bar’s fstab. Quantum weirdness.

The lesson is: if you clone, go ahead and delete or remove the original from the system. While you’re at it, generate and set a new UUID for the clone.
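To check whether a system is in this state, here is a minimal Python sketch that scans blkid-style output for filesystem UUIDs shared by more than one device. It assumes the usual one-line-per-device format (`/dev/sda1: UUID="..." TYPE="ext4" ...`); the function name duplicate_uuids is my own. It only reports duplicates and changes nothing. For ext2/3/4 filesystems, assigning a fresh UUID is normally done with `tune2fs -U random` on the unmounted clone, after which fstab and the grub2 config need to point at the new value.

```python
import re
from collections import defaultdict

# Match the filesystem UUID field only, not PARTUUID="..." or UUID_SUB="...".
UUID_RE = re.compile(r'(?<![A-Z_])UUID="([^"]+)"')


def duplicate_uuids(blkid_text):
    """Map each filesystem UUID seen on more than one device to those devices."""
    by_uuid = defaultdict(list)
    for line in blkid_text.splitlines():
        # Lines look like: /dev/sda1: UUID="..." TYPE="ext4" PARTUUID="..."
        device, sep, rest = line.partition(":")
        match = UUID_RE.search(rest)
        if sep and match:
            by_uuid[match.group(1)].append(device.strip())
    # Keep only the UUIDs that appear on two or more devices.
    return {uuid: devs for uuid, devs in by_uuid.items() if len(devs) > 1}
```

Feeding it the text printed by blkid (run as root so every partition is visible) should give an empty dict on a healthy system. Any entry in the result means two filesystems are claiming the same identity, which is exactly the Foo and Bar situation.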

I’m amazed that I didn’t really break anything. I’m not the least surprised that six years into using Linux there are things I have yet to learn. That’s half the fun.