HOW TO CLONE A DISK WITH DDRESCUE - GNU DDRESCUE ALSO KNOWN AS GDDRESCUE - THE BETTER DDRESCUE TOOL
###################################################################################################

** First off, I am not responsible for any data loss caused by any of this. If you're reading this, these are strictly notes for myself that I like to write down so I can organize my thoughts.

BEFORE YOU BEGIN
################
* Read this whole guide!
* Get new drives of the same size as the original (cloning to a bigger drive is possible but not recommended, as it is not a true clone and might not behave identically afterwards)
* Note the serial number of the source and destination drive. Because I have been asked before: the source is the drive you're cloning FROM and the destination is the drive you're cloning TO. So in a perfect world, before the clone the source drive has all of your meaningful data (and maybe some disk errors that the clone procedure will attempt to work around) and the destination drive has unimportant stuff on it, or is empty. After the clone the source drive remains untouched (it has only been read from, nothing has been written to it) and the destination drive is hopefully an exact copy of the source (just hopefully without the errors)
* Personally I consider a drive bad - and this is based on my research and pure opinion - once it has 50 reallocated sectors and 1 ATA error. Of course exceptions exist: for example GREEN DRIVES tend to log a lot of ATA errors and still function properly, because their GREEN features (power saving, aka randomly turning off) like to cause them.
* I would only clone to a drive that has 0 ATA errors and 0 reallocated sectors before the clone. If it gets errors after the clone, it might be time to consider another drive and clone again.
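
The "0 ATA errors and 0 reallocated sectors" rule for a destination drive can be checked mechanically. The sketch below is only an illustration: dest_is_clean is a made-up name, and it assumes the usual "smartctl -a" layout in which the raw reallocated count is the last column of the Reallocated_Sector_Ct line and logged errors appear on an "ATA Error Count: N" line. A drive that does not report Reallocated_Sector_Ct at all is treated as not clean, to stay on the safe side.

```shell
#!/bin/sh
# Sketch only: dest_is_clean is a hypothetical helper name.
# Reads "smartctl -a" output on stdin and succeeds only when it shows
# 0 reallocated sectors and no non-zero "ATA Error Count" line.
dest_is_clean() {
    awk '
        /Reallocated_Sector_Ct/ { realloc = $NF + 0; found = 1 }   # raw value is the last column
        /^ATA Error Count:/     { ata = $4 + 0 }                   # line is absent when nothing is logged
        END { exit !(found && realloc == 0 && ata == 0) }
    '
}

# Typical use as root (the device name is just an example):
#   smartctl -a /dev/sdd | dest_is_clean && echo "sdd looks clean"
```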

NOTES
######
* IDE drives have the notation hd# and SATA drives have the notation sd# (USB drives also show up as sd#), where # is a letter that uniquely identifies the drive for the current session. If your drives show up as hd#, change all of my commands from /dev/sd<whatever> to /dev/hd<whatever>
* Note we are using the command ddrescue here (GNU ddrescue, not to be confused with dd_rescue), which is the newer and better clone command
* I noticed that USB drive enclosures work best with VirtualBox, as directly attached SATA disks are awkward to get into a VirtualBox VM
* My old guide on cloning, which uses the older dd_rescue command (not as good, but the guide has extra notes): https://sites.google.com/a/kossboss.com/main/linux---disk-cloning-guide
* It is very important to be 100% certain which file is the right drive. (Yes, in Linux everything is a file, so a full drive is just a file like /dev/sda, unlike Windows where a disk is a magical entity that looks like C: or DRIVE_LETTER:.) That's why I have you run the dmesg, cat, fdisk and smartctl commands: to verify the drives by drive letter and also by serial number.

THE STEPS
#########

Pick the VirtualBox or boot CD method; you will know which to pick after reading the guide. I recommend VirtualBox because it does not require rebooting the cloning computer. However, VirtualBox does require USB drive enclosures (unless you figure out how to make VirtualBox see your whole drive as a device, in which case be my guest; just be warned that if you lose all of your data it isn't my fault).

VIRTUAL BOX
===========

1. Download and install VirtualBox (made by Oracle) with default options
2. Download Knoppix that matches your computer's architecture (the computer where you are going to do the clone procedures)
3. Run VirtualBox
4. Start up a VM with the Knoppix image

* With VirtualBox it's best to use USB drive enclosures. As stated above, good luck trying to get VirtualBox to recognize your directly connected SATA or IDE drives.

BOOT CD STYLE
=============

1. Download Knoppix that matches your cloning computer's architecture (the computer where you are going to do the clone procedures)
2. Burn the ISO to a CD
3. Turn off the computer that will be used for the cloning and put in that CD
4. Boot into the Knoppix CD (i

* With a linux computer

IN KNOPPIX
##########

1. Open up a terminal so that you can start typing commands

2. Here we go:

ATTACH DRIVES AND FIND OUT INFO
===============================

It goes like this: we attach the source drive and run commands to get information, then attach the destination drive and run the same commands to get the updated information. That helps us tell the first drive (source) from the second drive (destination). (The order you plug the drives in doesn't matter in the real world; it just makes this guide flow better.) Finally we run one big command to check every connected drive's serial number and error counts. We will use those error counts later to see if they grew after the clone; hopefully they didn't, especially on the destination drive.

(Step 1)
See current drives:
# cat /proc/partitions

(Step 2)
Attach source drive:
See the current drives, how the newly added drive was appended to the system messages (dmesg), and that the drive now shows up in /proc/partitions:
# dmesg | tail
# cat /proc/partitions
# fdisk -l

(Step 3)
Attach destination drive

(Step 4)
See current drives again:
# dmesg | tail
# cat /proc/partitions
# fdisk -l
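
If comparing the /proc/partitions output by eye is error prone, the before and after listings can be compared by a small script. This is just a sketch: new_devices and the file names are made-up, and it assumes the standard /proc/partitions layout where the fourth column is the device name.

```shell
#!/bin/sh
# Sketch only: new_devices is a hypothetical helper name.
# Prints device names that appear in the "after" snapshot of
# /proc/partitions but not in the "before" snapshot.
new_devices() {
    before=$1
    after=$2
    # Column 4 is the device name; the header line has "name" there
    # and the blank line under the header has nothing.
    awk 'FILENAME == ARGV[1] { if ($4 != "" && $4 != "name") seen[$4] = 1; next }
         $4 != "" && $4 != "name" && !($4 in seen) { print $4 }' "$before" "$after"
}

# Typical use (file names are arbitrary):
#   cat /proc/partitions > before.txt
#   ...attach the drive and wait a few seconds...
#   cat /proc/partitions > after.txt
#   new_devices before.txt after.txt
```

Whatever it prints (for example sdd and sdd1) belongs to the drive you just plugged in. Still confirm it by serial number with smartctl.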

(Step 5)
Verify drives one by one like this by serial number:
Example to check sda:
# smartctl -a /dev/sda

(Step 6)
Check every drive and make note of the serial number and the errors on the drives. Hopefully your new drives, the destination drives, have 0 ATA errors and 0 reallocated sectors (if not, I would consider getting new drives to clone to, although exceptions do exist):
# for d in /dev/sd[a-z]; do echo "===drive $d==="; smartctl -a "$d" | grep -Ei "reallocated_sector|ata error|serial|model|user capacity"; done
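
Since these numbers get compared again after the clone, it can help to save them to files instead of copying them down by hand. A minimal sketch, assuming smartctl is installed and you are root; smart_summary, save_smart_baseline and the output directory are made-up names:

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.
# Keep only the identifying and error-related lines of "smartctl -a" output.
smart_summary() {
    grep -Ei 'reallocated_sector|ata error|serial|model|user capacity'
}

# Save one summary file per existing sd device into the directory given as $1.
save_smart_baseline() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    for dev in /dev/sd[a-z]; do
        [ -b "$dev" ] || continue    # skip names that are not block devices
        smartctl -a "$dev" | smart_summary > "$outdir/$(basename "$dev").txt"
    done
}

# Typical use as root, before starting the clone:
#   save_smart_baseline ~/smart-before
```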

(Step 7)
Once you know which disk is the source and which is the destination, the clone can begin. Make sure not to include the partition number. For example, here is a disk: sda, or hdc. Here is a partition: sda1, or hdc1. We don't want the partition form.

Again: be certain which is the source drive and which is the destination drive. Make a note of it, like "sdc is the source drive and sdd is the destination drive"
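
The "disk, not partition" rule is easy to check in a script before anything gets written. This is a rough sketch with made-up function names (is_whole_disk, check_pair); it only understands single-letter names like /dev/sdc or /dev/hda and will reject anything else, including perfectly valid names such as /dev/sdaa.

```shell
#!/bin/sh
# Sketch only: is_whole_disk and check_pair are hypothetical helper names.
# Succeeds only for whole-disk names such as /dev/sdc or /dev/hda.
# Names with a trailing partition number (/dev/sdc1) are rejected.
is_whole_disk() {
    case $1 in
        /dev/[sh]d[a-z]) return 0 ;;
        *)               return 1 ;;
    esac
}

# Refuse a source/destination pair that is unsafe to hand to ddrescue.
check_pair() {
    src=$1
    dst=$2
    is_whole_disk "$src" || { echo "source $src is not a whole disk" >&2; return 1; }
    is_whole_disk "$dst" || { echo "destination $dst is not a whole disk" >&2; return 1; }
    [ "$src" != "$dst" ] || { echo "source and destination are the same" >&2; return 1; }
    return 0
}

# Typical use:
#   check_pair /dev/sdc /dev/sdd && echo "names look sane"
```

This only checks the shape of the names. It cannot tell you which drive holds your data, so the serial number check is still on you.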

MOVING TO GOOD FOLDER
=====================

(Step 8)
For this example let's pretend sdc is the source and sdd is the destination.

# cd ~
# pwd

Record what folder you are in; the log file will be created here

THE CLONE
==========

(Step 9)
Here come the cloning commands: first "ddrescue -n <source> <destination> <logfile>", and then the same command again without "-n" but with "-r1"
# ddrescue -v -n /dev/sdc /dev/sdd ddrlog.txt
# ddrescue -v -r1 /dev/sdc /dev/sdd ddrlog.txt

If the system crashes, rerun the command it crashed on and it will pick up where it left off, because the logfile keeps track of the program's progress.
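
The two passes can be wrapped in one function so the source, destination and logfile are typed only once. This is a sketch, not a tested tool: clone_disk and the RUN variable are made-up names, and by default it only prints the commands so you can read them before anything is written. The ddrescue options are the same ones used in Step 9.

```shell
#!/bin/sh
# Sketch only: clone_disk and RUN are hypothetical names.
# Prints (or, with RUN=1, executes) the two ddrescue passes from Step 9.
clone_disk() {
    src=$1
    dst=$2
    log=${3:-ddrlog.txt}
    for opts in "-v -n" "-v -r1"; do
        if [ "${RUN:-0}" = 1 ]; then
            # $opts is left unquoted on purpose so it splits into two flags
            ddrescue $opts "$src" "$dst" "$log" || return 1
        else
            echo "ddrescue $opts $src $dst $log"
        fi
    done
}

# Dry run (prints the two commands):
#   clone_disk /dev/sdc /dev/sdd
# Real run as root, once you are certain about the device names:
#   RUN=1 clone_disk /dev/sdc /dev/sdd
```

Because both passes share the same logfile, rerunning the function after a crash resumes instead of starting over.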

(Optional Step 10)
OPTIONALLY: If you encounter a crash, you can clone from the back of the drive, working through all of the troubled areas from the end of the disk
# ddrescue -v -R -n /dev/sdc /dev/sdd ddrlog.txt
# ddrescue -v -R -r1 /dev/sdc /dev/sdd ddrlog.txt
ddrescue will not recover the same sector twice (if it has already been recovered and logged) because of the way it keeps the log file. So it is not a waste of time to simply repeat the same forward commands if you experience crashes; however, it doesn't hurt to run a reverse command either.

Curious to know what all of these command line switches mean? -v is verbose, so you see more output. -R starts from the back of the disk. -r1 retries the bad areas once. "-n" makes ddrescue copy the good areas first and skip the slow work on the errored areas. That's why we first cover the whole disk with "-n": it clones all the good areas and logs all of the areas it skipped (which will be the errored areas). Then we run again without "-n" and with "-r1", which works through the skipped areas recorded in the logfile and retries the bad sectors once.
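
To see how much was actually recovered after each pass, the logfile can be summarized. The sketch below assumes the logfile layout as I understand it from the ddrescue manual: comment lines start with #, the first data line holds the current position and status, and every following line is "position size status" in hex, where + means finished, - means bad sector, and the other characters (?, *, /) mean not done yet. summarize_log is a made-up name.

```shell
#!/bin/sh
# Sketch only: summarize_log is a hypothetical helper name.
# Adds up the block sizes in a ddrescue logfile by status and prints
# the totals in bytes.
summarize_log() {
    rescued=0
    bad=0
    pending=0
    seen_status=0
    while read -r pos size status rest; do
        case $pos in
            ''|'#'*) continue ;;       # blank line or comment
        esac
        if [ "$seen_status" -eq 0 ]; then
            seen_status=1              # first data line is the current position/status line
            continue
        fi
        case $status in
            +) rescued=$((rescued + $size)) ;;    # sizes are hex like 0x00100000
            -) bad=$((bad + $size)) ;;
            *) pending=$((pending + $size)) ;;    # non-tried, non-trimmed, non-split
        esac
    done < "$1"
    echo "rescued=$rescued bad=$bad pending=$pending"
}

# Typical use:
#   summarize_log ddrlog.txt
```

After the "-n" pass you would expect some pending bytes; after the "-r1" pass whatever is left should show up as bad.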

THE END
=======

(Step 11)
When it's done, make note of the errors. See whether the source drive's error numbers grew (if they did, that just proves the clone needed to happen: the drive was going bad, and errors grow even faster on bad drives). Also check the destination drive (its error numbers shouldn't have grown):
# for d in /dev/sd[a-z]; do echo "===drive $d==="; smartctl -a "$d" | grep -Ei "reallocated_sector|ata error|serial|model|user capacity"; done
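
If you saved the "smartctl -a" output of a drive to a file before the clone and again after it, the reallocated sector counts can be compared by a script rather than by eye. A sketch under that assumption; realloc_count, realloc_grew and the file names are made-up, and it relies on the raw value being the last column of the Reallocated_Sector_Ct line.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.
# Print the raw Reallocated_Sector_Ct value (last column) from a file
# holding saved "smartctl -a" output.
realloc_count() {
    awk '/Reallocated_Sector_Ct/ { print $NF; exit }' "$1"
}

# Compare two saved outputs of the same drive, taken before and after the clone.
realloc_grew() {
    before=$(realloc_count "$1")
    after=$(realloc_count "$2")
    if [ -z "$before" ] || [ -z "$after" ]; then
        echo "unknown (attribute not found)"
    elif [ "$after" -gt "$before" ]; then
        echo "grew from $before to $after"
    else
        echo "did not grow ($before -> $after)"
    fi
}

# Typical use:
#   smartctl -a /dev/sdc > sdc-before.txt     (before the clone)
#   smartctl -a /dev/sdc > sdc-after.txt      (after the clone)
#   realloc_grew sdc-before.txt sdc-after.txt
```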

(Step 12)
When it's done, just make sure the process isn't running anymore; you shouldn't see ddrescue in any of the outputs.
# ps aux
# ps aux | grep "ddrescue" | grep -v "grep"
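
The same check can be done on exact process names, which avoids the "grep matches itself" problem. This is a small sketch: has_process is a made-up name, and it expects one command name per line on stdin, which is what "ps -eo comm=" prints.

```shell
#!/bin/sh
# Sketch only: has_process is a hypothetical helper name.
# Succeeds if any line on stdin is exactly the given command name.
has_process() {
    grep -qxF -- "$1"
}

# Typical use:
#   if ps -eo comm= | has_process ddrescue; then
#       echo "ddrescue is still running"
#   else
#       echo "ddrescue is done"
#   fi
```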

(Step 13)
Then just shut down Knoppix; this ensures all of the connections to the disks are closed. Honestly, just unplugging the drive after the clone is done and "ps aux" shows ddrescue is no longer running is okay.

To shutdown gracefully:
# shutdown -h now


DDRESCUE COMMAND USAGE AND CLONE ALGORITHM
##########################################

Excerpts from: http://www.gnu.org/software/ddrescue/manual/ddrescue_manual.html

THE USAGE
=========
`-v' `--verbose' Verbose mode. Further -v's (up to 4) increase the verbosity level. 

`-r n' `--retries=n' Exit after given number of retry passes. Defaults to 0. -1 means infinity. Every bad sector is tried only one time per pass. To retry bad sectors detected on a previous run, you must specify a non-zero number of retries. 

`-R' `--reverse' Reverse direction of copying, retrying, and the sequential part of splitting, running them backwards from the end of the input file. 

There are a lot more options; we are just using those.

THE ALGORITHM
=============
The algorithm of ddrescue is as follows (the user may interrupt the process at any point, but be aware that a bad drive can block ddrescue for a long time until the kernel gives up):

1) Optionally read a logfile describing the status of a multi-part or previously interrupted rescue. If no logfile is specified or is empty or does not exist, mark all the rescue domain as non-tried.

2) (First phase; Copying) Read the non-tried parts of the input file, marking the failed blocks as non-trimmed and skipping beyond them, until all the rescue domain is tried. Only non-tried areas are read in large blocks. Trimming, splitting and retrying are done sector by sector. Each sector is tried at most two times; the first in this step as part of a large block read, the second in one of the steps below as a single sector read.

3) (Second phase; Trimming) Read forwards one sector at a time from the leading edge of the largest non-trimmed block, until a bad sector is found. Then read backwards one sector at a time from the trailing edge of the same block, until a bad sector is found. For each non-trimmed block, mark the bad sectors found as bad-sector and mark the rest of that block as non-split. Repeat until there are no more non-trimmed blocks.

4) (Third phase; Splitting) Read forwards one sector at a time from the center of the largest non-split block, until a bad sector is found. Then read backwards one sector at a time from the center of the same block, until a bad sector is found. If the logfile is larger than `--logfile-size', read the smallest non-split blocks until the number of entries in the logfile drops below `--logfile-size'. Repeat until all remaining non-split blocks have less than 5 sectors. Then read the remaining non-split blocks sequentially.

5) (Fourth phase; Retrying) Optionally try to read again the bad sectors until the specified number of retries is reached.

6) Optionally write a logfile for later use.

EXTRA DOCUMENTATION
###################
http://www.kossboss.com/linux---dd_rescue-vs-ddrescue
http://www.gnu.org/software/ddrescue/manual/ddrescue_manual.html

