Warning: Respect the power of the root shell; How a one-character command-line mistake annihilated my Linux backup server in 10.6 seconds flat
So here’s the backstory. Back in April 2012, my laptop, a 13″ mid-2010 MacBook Pro, suffered an electrical problem at around 2:00 AM while I was doing homework and listening to music. At first I thought my computer had just crashed, as the screen had frozen completely, but when I tried to reboot I was greeted by the stylized Mac version of an error that no computer-loving geek ever wants to see: “Operating System Not Found.” Oh, the Horror!
After first conducting a few very scientific tests (giving the computer a good, loving smack against the table, trying to reboot a few times, and placing my ear to the side of my laptop to listen for clicking and grinding noises), it seemed obvious that the hard drive had given up the ghost.
As I saw it, I had two options. I could either send my laptop back to the Apple Store and wait a few days for it to be repaired, or I could replace the hard drive here and now and have a working computer in a matter of hours. Given that I had a used but still working spare hard drive on hand, I opted for the latter. Little did I know, this choice would prove to be a huge mistake that left me without not one, but two usable computers for the rest of the week.
After opening up my laptop and swapping the dead hard drive for my used spare, I tried to reinstall Mac OS X from the Time Machine backup I had on my Linux-based Netatalk Time Machine server. Theoretically, this should have been an easy and straightforward process. Oddly enough, though, the reinstallation failed right from the get-go with a “disk cannot be written to” error.
“Huh,” I thought, “I wonder why that is…. Perhaps the disk is corrupted? Perhaps I should try wiping the drive clean before reinstalling OSX. Maybe that will fix the error!”
The problem was that, at that point, I only had one working computer: my Linux-based Time Machine backup server. It only had a command-line interface, but it should have been more than capable of erasing my replacement hard drive sector by sector using the command-line utility dd. So I hooked up the hard drive to my server, hopped into the command line, and typed in the following command, copied and pasted from a Linux forum post:
$ sudo dd if=/dev/zero of=/dev/sda bs=1M
[Can you spot the error?]
Now technically, this command would have done what I wanted if I had remembered that the hard drive I wanted to erase was not “/dev/sda” but “/dev/sdc” (the third hard drive connected to the computer). As it was, it took me a full 10.6 seconds after entering the command to realize my mistake and kill dd. By then, though, dd had already zeroed out the first 942 MB of my server’s main hard drive, including its Master Boot Record, GRUB bootloader, and part of my encrypted home folder.
winter@Ubuntu-ServerBox:~$ sudo dd if=/dev/zero of=/dev/sda bs=1M
942+1 records in
0+0 records out
942340000 bytes (942.34 MB) copied, 10.6 s, 88.9 MB/s
But hold on a second! What’s this? The server hasn’t crashed yet? And it’s actually still able to serve up files? WHAA?!? What electronic zombified devilry is this?
Even though the hard drive’s contents had been corrupted beyond repair, most of the files needed to keep the server running were already loaded into RAM and were left untouched by dd. So as long as the server didn’t need to read any erased portions of the disk and I didn’t restart it, the server would probably be able to limp along, at least until the moment when I got some time to erase and reinstall everything on it. But back to the issue at hand!
So, with all of my computers crippled at the moment, I decided to try to get my laptop working again using a backup hard drive plugged in through a USB-to-SATA adapter. It wasn’t pretty, but it got my computer working again so I could finish my English paper and reserve an appointment with a Mac Genius at the 5th Avenue Apple Store the next day. There, the man took a quick look at my computer using Apple’s proprietary diagnostic software and traced its issues to a faulty SATA ribbon cable: one of the only things I couldn’t swap out and test, because it was not a standard SATA cable but one with a proprietary Apple-designed motherboard connector.
So please, although it’s fine to fix things yourself, do be careful around the Unix command line and double-check all destructive commands! Seriously, folks! If you’ve gained nothing else by reading this post, just remember this simple rule of thumb:
Double Check All Unix Commands Starting With “sudo”!
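For the curious, here is a minimal sketch of one way to build that double check into the shell itself. It is not what I actually ran back then, just an illustration: the helper name confirm_device is made up for this post, and /dev/sdc stands in for whatever device you really mean to erase. The idea is to list the attached disks first, then be forced to retype the target path before dd gets anywhere near it.

```shell
#!/bin/sh
# Hypothetical helper (made up for this post): refuse to continue
# unless the user retypes the exact device path they are about to erase.
confirm_device() {
    target="$1"
    # Prompt goes to stderr so it never mixes with piped output.
    printf 'About to ERASE %s. Retype the device path to confirm: ' "$target" >&2
    # Fail if nothing can be read (for example, stdin is closed).
    read -r answer || return 1
    # Succeed only on an exact match.
    [ "$answer" = "$target" ]
}

# Usage sketch (left commented out so nothing destructive runs by accident):
#   lsblk -o NAME,SIZE,MODEL,MOUNTPOINT   # see which disk is which first
#   confirm_device /dev/sdc && sudo dd if=/dev/zero of=/dev/sdc bs=1M
```

Having to type the path a second time, right after looking at the lsblk listing, is a small speed bump, but it is exactly the kind that might have caught my sda/sdc mix-up.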
Happy (Safe) Hacking!