How to recover partitions and data using Linux - Tutorial

Updated: April 26, 2012

Losing important personal data is one of the biggest pains a user can experience in the digital world. No matter how hard we try, accidents and hardware failures can happen and inevitably will. It's a matter of statistics, and just like in casinos, the house always wins. In this case, data loss is the house.

To prevent tragedies, you must, I repeat, you must have multiple data backups, proven, consistent, restorable, verified now and then. But what if you don't? Most computer users never bother with backups, and they only realize the error of their ways when it's too late. All right. So if you're down to tears and despair, there are a few things you can try before declaring Chapter 13 over your personal and emotional assets. This tutorial will try to teach you how to recover partitions and data stored on hard disks, as well as how to fix damaged pictures. Follow me.

Let's get a few things straight first

Before we proceed, there are several things you must understand. First, there's no guarantee. Data recovery is guesswork, no matter how accurate and scientific. While my examples will make it look as if it's geek magic working its miracles, it's luck more than anything else. Even if you stick to the most methodical approaches and use the best recovery tools, there's always a chance you will fail. You must be ready for that.

Data recovery is a fairly complex procedure. You should never attempt to do this if you're not extremely comfortable with the command line, partition notations, hard disk geometry, compilations, and even looking at files using a hex editor. In fact, if an inexperienced person tries to use the recovery tools, they might cause even more damage or even ruin their perfectly healthy rest of the system. Do not attempt anything until you're absolutely certain you know what you're doing.

Even if you manage to recover lost disk partitions, the data may be lost forever. Furthermore, working with damaged or failing hard disks is a risky gamble. Things may not turn out fine, despite your best efforts. It is important that you understand this.

We are ready to move on.

A few important tips

Before we go any further, here are a couple of tips you might want to consider. One, you should stop using the suspected damaged device the moment you learn about the potential data loss. If you have just discovered that your important data is missing, the fact it does not show in system tools like the file manager does not mean the data is actually gone; it may just not be logically visible. This means the system may think it has free space available where your precious files are located. And it may want to write there.

Therefore, any further attempts to use a device with a suspected corruption or loss will only complicate matters further. You should stop working immediately, lest you mangle your data sectors and forever kill any chance of recovery of whatever might be there.

Now, you should not panic. But if you are aware of your situation, then you should carefully proceed. This means unmounting the partitions with suspected data loss, if they are still visible in some way. You should also turn off swap. By all means, you should stop any maintenance operations, like disk defragmentation, cleanups, copying, moving, or anything that might be using your data devices either as a source or target for any disk-related activities. But there's a catch.

Doing any of the above could actually make things worse. Or not. Unfortunately, I do not have a golden recipe for how to do this properly, as each case is different. In fact, in some situations, all of the things that could work magically for one scenario might render exactly the opposite results in another.

To sum it up, do NOT CHANGE things. If there's a problem, make sure you do not exacerbate it. If you do not understand what might have happened, call in an expert who could help you. One thing is certain; under no circumstances should you close your eyes and reset the machine. Reboots can solve software logic problems in memory, but they rarely solve hardware-related issues. On the contrary; on reboot, the operating system might blithely assume that all is well with your stuff and try to access the damaged disk or partition. This normally ends in extra tears and a significantly reduced chance of data recovery.
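Before you unmount anything or turn off swap, it helps to know what the system is actually doing with the suspect disk. Here is a minimal, read-only sketch in Python that lists the partitions of a device that are currently mounted or used as swap. It assumes a Linux system that exposes /proc/mounts and /proc/swaps, and that partitions appear there under plain names like /dev/sdb1. The function names and the /dev/sdb default are my own choices for illustration, not part of any recovery tool.

```python
from pathlib import Path


def users_of_device(device, mounts_text, swaps_text):
    """Return (mounted, swapping) for partitions whose name starts with `device`."""
    mounted = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: source, mount point, filesystem type, options, ...
        if len(fields) >= 2 and fields[0].startswith(device):
            mounted.append((fields[0], fields[1]))

    swapping = []
    for line in swaps_text.splitlines()[1:]:  # the first line is a column header
        fields = line.split()
        if fields and fields[0].startswith(device):
            swapping.append(fields[0])
    return mounted, swapping


def read_text_or_empty(path):
    """Read a text file, or return an empty string if it cannot be read."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    # Only reads kernel status files; nothing is written to any disk.
    mounts = read_text_or_empty("/proc/mounts")
    swaps = read_text_or_empty("/proc/swaps")
    mounted, swapping = users_of_device("/dev/sdb", mounts, swaps)
    for source, target in mounted:
        print(f"{source} is mounted on {target}")
    for source in swapping:
        print(f"{source} is in use as swap")
```

If the script prints nothing for your device, the system is probably not using it at the moment. Note that the prefix match is crude; volumes mounted through a device mapper or by label may not show up under the /dev/sdX name.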

The second piece of advice is a bit crazy, so feel free to disregard it. If you happen to have a hard disk that has stopped working or is no longer recognized by the operating system, placing it in a freezer for about 15-20 minutes might help revive it just long enough to quickly salvage data. In many cases, hard disks fail due to mechanical errors; the freezer-induced difference in temperature can sometimes cause minute changes in the physical alignment of parts, which could free stuck or dodgy bits, making your disk work again, at least for a short while. This is pure voodoo, so you might as well forget this. But if you're desperate, all means justify the end.

Recover lost partition

There could be many reasons why your partitions no longer show up. For example, you may accidentally create a new partition table on the wrong hard disk. This might happen when setting up a multi-boot setup.

To be able to show you a real-life would-be disaster scenario, I will simulate the loss of a partition. Our test box will be Fedora 16 Verne, with KDE. We will play our little disaster game on a secondary disk, /dev/sdb, which is used for data, although this kind of problem could also happen on system partitions. In that case, you will have to use a live CD to try to recover your box.

We will destroy the partition table for /dev/sdb, by creating a new one in GParted. We will ignore the fact there already is one on the disk. This way, we will make the partitions vanish, so they become invisible to the system. To a casual user, it will look as if the data is forever lost. All right, here's how it's supposed to be, on a healthy system:

Partition state

So we destroy the partition table. Now, we will use TestDisk, an awesome forensics tool developed by cgsecurity.org. I have listed this program many times before in a variety of articles, but we've never really used it in anger. Today, we will explore its capabilities, as well as learn how to use it.

TestDisk runs as a text wizard inside the shell. It's an interactive tool that will ask you a few questions to try to salvage your data. The first question is to decide whether you wish to keep the log for future examination. If you're doing forensics, then you probably want to do this.

Testdisk, begin

We will begin with the analysis. We do not know what the situation is, or how bad it is. Since you probably do not know the exact disk geometry by heart, using the Analyse option (notice the proper English spelling) is your best bet in figuring out the existence and location of previous partitions.

Analyze

The next step is to choose which device we want to work with. In our case, /dev/sdb.

Choose device

Now, choose the partition table type. For most people, it will be Intel/PC.

Partition table type

And we begin the analysis:

Searching for partitions

We are lucky. TestDisk was able to find the partition. Even though the old partition table was destroyed, it was just a pointer to the start and end addresses of the actual data, so to speak. The disk surface was not harmed, and therefore, our data should be there.
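To make the "pointer" idea a bit more concrete, here is a minimal sketch in Python that decodes the four primary entries of a classic Intel/PC (MBR) partition table from the first 512-byte sector of a disk. The layout used (table at offset 446, 16 bytes per entry, little-endian start sector and sector count) is the standard MBR format; the function name and dictionary keys are my own. This is not how TestDisk works internally, just an illustration of how little information the table holds.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # the partition table follows 446 bytes of boot code
ENTRY_SIZE = 16      # four entries of 16 bytes each


def parse_mbr(sector):
    """Decode the non-empty primary partition entries of an MBR sector."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least one full 512-byte sector")
    entries = []
    for slot in range(4):
        start = TABLE_OFFSET + slot * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # status, (CHS start skipped), type, (CHS end skipped), first LBA, sector count
        status, ptype, first_lba, count = struct.unpack("<B3xB3xII", raw)
        if ptype == 0:
            continue  # type 0 marks an unused slot
        entries.append({
            "slot": slot + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "first_lba": first_lba,
            "sectors": count,
        })
    return entries


if __name__ == "__main__":
    # Synthetic example: one Linux (0x83) partition starting at sector 2048.
    sector = bytearray(SECTOR_SIZE)
    sector[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = struct.pack(
        "<B3xB3xII", 0x00, 0x83, 2048, 204800)
    sector[510:512] = b"\x55\xaa"
    for entry in parse_mbr(bytes(sector)):
        print(entry)
```

To look at a real disk, you would read the first 512 bytes of the device as root, opened read-only, and pass them to parse_mbr. Wiping or recreating the table changes only these few bytes, which is why the data behind them can survive.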

Found partition

Found partition, bottom message

Now, we need to write the partition information to disk. It is also possible to change the partition characteristics, like type and flags.

Write to disk

And it's worked! Boom, we're back in business. Now, this means we have a sane partition table and our partitions can be used, but this does not mean some of the data has not been permanently overwritten or destroyed forever. We will further explore this second part of the partition and data recovery below. For now, things are looking good.

Successful recovery

Now, a sad example

All right, we had a successful recovery, now let's see something else. I will show an example where I was not able to save my data. Perhaps a forensics expert will be able to figure it out, but I failed after 17 minutes of work.

Like before, we will do something destructive, only a bit more. I am going to use dd to write an unspecified number of zero bytes to the raw device /dev/sdb, then supposedly realize my mistake and break the operation. But it will be too late.

Now that the partition is ruined, let's pretend nothing happened and try to mount it. Of course, this step will fail. Now, the error message is not unique to this kind of problem, but it does indicate something is wrong. After additional exploration, we come to the conclusion that our disk is messed up and needs recovery.

Mount manually

Indeed, checking with fdisk, we see:

No valid partition table

So we try TestDisk once more. Here, we learn that the partition does not have the endmark. This is not a good thing, but perhaps we can recover, after all. You will notice that in the bottom left corner, TestDisk offers to search for partitions.
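As far as I know, the endmark is the two-byte signature 55 AA stored at the very end of the first sector, at offsets 510 and 511 (read as a little-endian number, 0xAA55). After dd has zeroed the start of the disk, those bytes are gone. Here is a minimal sketch of the check in Python; the function name is my own, and this only illustrates the idea, not TestDisk's actual code.

```python
def has_endmark(sector):
    """Return True when a 512-byte sector ends with the 55 AA signature."""
    return len(sector) >= 512 and sector[510:512] == b"\x55\xaa"


if __name__ == "__main__":
    healthy = bytearray(512)
    healthy[510:512] = b"\x55\xaa"
    zeroed = bytes(512)  # what the sector looks like after being overwritten with zeros
    print("healthy sector:", has_endmark(bytes(healthy)))
    print("zeroed sector: ", has_endmark(zeroed))
```

A missing signature only tells you the first sector was damaged. It says nothing about how far into the disk the damage goes, which is what really decides whether recovery is possible.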

No endmark

Search

We will do that - and fail. So we can try a deeper search.

Search failed

Deeper search

As the last desperate option, you may choose to manually add a partition, by specifying the start and end cylinders, heads and sectors. Finally, you will have to choose the partition type. For instance, Linux is 83, including all Ext, Reiserfs, BTRFS, and others.
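If you know where a partition started as a sector number but the tool asks for cylinders, heads and sectors (or the other way around), the conversion is simple arithmetic. Here is a small sketch in Python. It assumes the common translated geometry of 255 heads and 63 sectors per track; your disk may report different values, so use whatever geometry your tool shows. The function names are my own.

```python
def chs_to_lba(cylinder, head, sector, heads=255, sectors_per_track=63):
    """Convert a cylinder/head/sector address to a linear sector number.

    Cylinders and heads count from 0, sectors count from 1.
    """
    if cylinder < 0 or not 0 <= head < heads or not 1 <= sector <= sectors_per_track:
        raise ValueError("CHS address out of range for this geometry")
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads=255, sectors_per_track=63):
    """Convert a linear sector number back to (cylinder, head, sector)."""
    if lba < 0:
        raise ValueError("sector number cannot be negative")
    cylinder, rest = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(rest, sectors_per_track)
    return cylinder, head, sector_index + 1


if __name__ == "__main__":
    # A partition starting at sector 2048, a common alignment on newer systems.
    print(lba_to_chs(2048))
    print(chs_to_lba(0, 32, 33))
```

With this geometry, one cylinder holds 255 x 63 = 16065 sectors, so a partition that begins on a cylinder boundary starts at a multiple of that number.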

Add partition type by hand

In our case, this did not work, unfortunately. No luck there.

Recover files (and images)

In case you cannot recover partitions, you might still want to try to salvage data. In other words, try to copy information from the disk, without really relying on the partition table structure before disaster.

PhotoRec

We will do this using another tool created by cgsecurity.org, called PhotoRec. While the tool was conceived to handle photo images on digital camera memory cards, it now works with many file types. Effectively, it will recover pretty much anything from your disks, provided the data has some integrity.
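The general idea behind this kind of recovery is often called file carving: you ignore the filesystem and scan the raw bytes for known file signatures. Here is a heavily simplified sketch in Python that pulls JPEG candidates out of a block of raw data by looking for the start marker FF D8 FF and the end marker FF D9. This is my own illustration of the technique, not PhotoRec's actual algorithm, and it ignores fragmentation and images with embedded thumbnails, which would be cut short.

```python
SOI = b"\xff\xd8\xff"  # JPEG start of image, followed by the first marker byte
EOI = b"\xff\xd9"      # JPEG end of image


def carve_jpegs(data):
    """Return byte strings that look like complete JPEG files inside raw data."""
    found = []
    pos = 0
    while True:
        start = data.find(SOI, pos)
        if start == -1:
            break
        end = data.find(EOI, start + len(SOI))
        if end == -1:
            break  # a start without an end: truncated or fragmented
        found.append(data[start:end + len(EOI)])
        pos = end + len(EOI)
    return found


if __name__ == "__main__":
    # Synthetic "disk": zeros, a tiny fake JPEG, more zeros, another fake JPEG.
    raw = (bytes(100) + b"\xff\xd8\xff\xe0" + b"first" + b"\xff\xd9"
           + bytes(50) + b"\xff\xd8\xff\xe1" + b"second" + b"\xff\xd9")
    for number, blob in enumerate(carve_jpegs(raw), start=1):
        print(f"candidate {number}: {len(blob)} bytes")
```

On a real disk you would run something like this against a copy or image of the device, never against the original, and you would read it in chunks rather than loading everything into memory.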

PhotoRec can also be used on healthy partitions, as it will dig deep under the layers of zeros and ones and look for deleted files. Indeed, apart from the purely disaster recovery mode, I also tested it against my system partitions.

This tool is quite similar to TestDisk. You begin by choosing the partition type you want to work with, as well as the filesystem type that you believe was used on the deleted and/or destroyed partitions. And we're underway.

Recovering

In the case of my system, it uncovered all kinds of files, including text, images, archives, database files, and others. Moreover, PhotoRec unearthed image files that were probably set up during the installation. I have no idea what these files are, but they probably belong to the Fedora installer in some way.

Overall, PhotoRec seems to be quite effective. Just to be sure, I low-level formatted about 80% of my second hard disk, created a new partition table, a new partition and formatted it with Ext4 filesystem, and then attempted another recovery. True, the program could not find most of what was annihilated under all those zeroes, but it still pulled an impressive chunk of files from the unscrubbed bits of the disk. So if you thought simple formats are good enough to hide data, well not quite.

Recoverjpeg

Another useful tool for the recovery of pictures, JPEG to be specific, is recoverjpeg. It is designed to salvage files from SD cards, camera memory chips and the like. Like PhotoRec, I used it against healthy partitions to see whether I could unearth any weird, unknown files, and indeed I did. Again, must be Fedora installer leftovers.

This little tool is not available in the YUM repositories, but you can download the sources and compile them. The procedure is not that difficult. You will need gcc-c++ package for that.

recoverjpeg working

Successful recovery

One of recovered images

Manual recovery of images

GIMP error

OK, so we can clearly see that our image is corrupted. GIMP cannot open it. But we get our clue right there. It's supposed to be an image file, but it is not. It begins with FF 00, which is not the first two-byte sequence in the structure of a JPG/JPEG file.

The next step is to examine the file using a hex editor. Yup. We are going down to byte level and we will try to figure out if there's something wrong with the image structure. This is super-geeky, but we have no other option. In KDE, there's Okteta, although you can use anything, including vi.

Okteta shows errors

The header is messed up, although in theory, it can't really be called a header, but it serves the same purpose. Consulting the literature, we learn that the first two bytes of any JPG/JPEG image are the SOI (Start Of Image) marker, and they are always FF D8 in hexadecimal notation.
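If you do not have a hex editor at hand, the same check can be done with a few lines of Python. This is a minimal sketch; the function names and the sample file name are my own, hypothetical choices.

```python
JPEG_SOI = b"\xff\xd8"  # Start Of Image marker


def has_soi(data):
    """Return True when the data begins with the JPEG SOI marker."""
    return data[:2] == JPEG_SOI


def hex_preview(data, count=16):
    """Return the first `count` bytes as space-separated hex pairs."""
    return " ".join(f"{byte:02X}" for byte in data[:count])


def inspect_file(path):
    """Print the first bytes of a file and whether it starts like a JPEG."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    print(hex_preview(head))
    print("starts with FF D8:", has_soi(head))


if __name__ == "__main__":
    broken = b"\xff\x00\xff\xe0\x00\x10JFIF"
    print(hex_preview(broken))
    print("starts with FF D8:", has_soi(broken))
```

For a file on disk, call inspect_file with its path, for example inspect_file("image00001.jpg"), where the file name is just a placeholder.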

hex fixed

Zoomed

All right, so we will now change the second byte from 00 to D8, so that the file begins with FF D8, as a JPG image should. Save the change and test whether the image opens. And it does.
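The same one-byte fix can be scripted, which is handy if you have many files damaged in the same way. Here is a minimal sketch in Python that writes a repaired copy and leaves the original untouched. It assumes the only damage is the second byte of the file; the function names and file names are hypothetical.

```python
def repair_soi(data):
    """Return a copy of the data with the second byte set to D8.

    Refuses to touch data whose first byte is not FF, since that would
    point to a different kind of damage than the one described here.
    """
    if len(data) < 2 or data[0] != 0xFF:
        raise ValueError("file does not start with FF, not attempting a repair")
    return b"\xff\xd8" + data[2:]


def repair_file(source, target):
    """Read `source`, fix its first two bytes, and write the result to `target`."""
    with open(source, "rb") as handle:
        data = handle.read()
    with open(target, "wb") as handle:
        handle.write(repair_soi(data))


if __name__ == "__main__":
    broken = b"\xff\x00\xff\xe0\x00\x10JFIF"
    fixed = repair_soi(broken)
    print(" ".join(f"{byte:02X}" for byte in fixed[:4]))
```

For real files, something like repair_file("broken.jpg", "fixed.jpg") would do, with both names being placeholders. Always work on a copy, so a wrong guess costs you nothing.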

GIMP works now

In this case, the recovery was easy. Now, it is possible that the image structure may be intact, but some of the data might be lost, so you might end with files that partially contain garbage. For images, these may turn out to be white or black pixels or entire sections filled with random noise.

Good advice

Backups are a must. Backups are unto computers what oxygen is unto your brain. If you have backups, then you have little to worry about. A good example is the death of my second hard disk in the old and now replaced computer approx. a year back. During a routine mail session, the disk went kaput. All I had to do was power down the box, remove the failed device, place a brand new hard disk into the slot, power on, and then copy all of the data from the daily backup archive. End of story.

If you have backups, data recovery becomes an uninteresting corner case that you should probably never have to deal with. In fact, statistically, data recovery is the bad way of doing things. While it MAY work, backups DO work.

You may want to consider both system imaging and data backups.

More reading

A few more links you might want to read:

TestDisk step-by-step instructions

PhotoRec step-by-step instructions

Ubuntu data recovery

ddrescue forensics wiki

Outlook Inbox Repair Tool

Windows recovery tools

Conclusion

I must admit this tutorial is probably not the easiest one to chew. Some of the stuff written here is sci-fi technobabble. Most computer users will not feel comfortable using any of the methods listed. Good recovery begins with good skills, and even then, it is not guaranteed. My personal experience dates back to 2009, a single case where I tried to recover files from a disk well clobbered with a botched reinstallation over existing and valuable data. It wasn't pretty. Backups are preferable.

Still, if you feel like impressing girls or charging tons of money from plebes for fixing image headers, then you have learned a useful lesson today. You are now familiar with a variety of command line tools, you can use TestDisk, PhotoRec, recoverjpeg, and maybe some other tools mentioned in the links above. Indeed, I have not elaborated on several other tools, so we might yet have a sequel. The reason I cut it short is the fact this article is already fairly complex. And let's not forget the magic of the hex editor.

Well, if the three Rambo tools listed above do not help you, then you probably won't have too much success with the rest either. Besides, using too many tools at the same time can cause confusion and possibly add to the damage. So TestDisk and PhotoRec are your first and foremost choice. And they probably are the best tools around. Recoverjpeg is another goodie, so you might want that one, too. Once you get comfortable using them and gain some positive experience, you could take a look at several other useful programs, including those specifically designed to work with Windows-based filesystems and partitions.

Well, that would be all. Forget recovery. It's so 70s. Go for backups!

Cheers.