LINUX & CYGWIN - RSYNC with Linux Scripts Scheduled Backups in Windows

-Read this through a couple of times first, and it will make more sense-

STEP 1) First download and install CYGWIN. It's basically a LINUX environment for WINDOWS, and it gives you the Linux SHELL.

Note1: This program does take forever to install.

Note2: Go to http://cygwin.com/install.html and download Cygwin. It's the setup.exe that you need.
Make sure that when you install it, you select rsync to be installed with it.

Quick note on how CYGWIN works:

When you're all CYGWINed in:
You will see a Linux-like directory system. Everything starts at /
However, / is actually at C:\cygwin\ or wherever you installed Cygwin.
You will also see a /cygdrive/ folder, which is kind of like "My Computer". In there you will see "c", "d", etc., one entry for each drive letter you have. I personally see "c", "d" and "g": c for my C: drive, and so on.
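
To make the drive-letter mapping concrete, here is a minimal sketch of the translation. The function name win_to_cygdrive is made up for this example and is not part of Cygwin. On a real Cygwin install the bundled cygpath tool (cygpath -u 'C:\some\path') does the same job and handles more cases.

```shell
#!/bin/bash
# Hypothetical helper (not part of Cygwin): turn a Windows path into /cygdrive notation.
win_to_cygdrive() {
    local win_path="$1"
    local drive="${win_path%%:*}"   # text before the first colon, e.g. "C"
    local rest="${win_path#*:}"     # text after the first colon, e.g. "\Users\koss Stuff"
    drive="$(printf '%s' "$drive" | tr 'A-Z' 'a-z')"   # /cygdrive uses lowercase drive letters
    rest="${rest//\\//}"            # swap every backslash for a forward slash
    printf '/cygdrive/%s%s\n' "$drive" "$rest"
}

win_to_cygdrive 'C:\Users\koss Stuff\Documents'
win_to_cygdrive 'G:\Backup'
```

The two calls print /cygdrive/c/Users/koss Stuff/Documents and /cygdrive/g/Backup, which is the notation the scripts later in this guide use.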

ADVISABLE STEP: This doesn't affect our RSYNC stuff, but it's a good next step. Cygwin doesn't really install itself the way other programs do; it just copies itself from the internet in a weird way. So you need to change the Environment Variable to include Cygwin yourself. Windows has an Environment Variable called path, and it's where Windows looks when you type a command. If I type a program name such as "notepad" or "ipconfig", Windows looks for that program in this predefined set of folders, and if it's not in any of them the command fails. It will not look through the whole system.

a) Right click on Computer->Properties->Advanced System Settings->Advanced Tab->Environment Variables button

b) Scroll through the System Variables until you see path. Select it and hit Edit

c) You will change the value by appending (adding) something to the end of it:

Mine looks like this after the change. The part I added is the last entry, ;C:\cygwin\bin (the bold part):
 %SystemRoot%\system32;%SystemRoot%;%SystemRoot%\System32\Wbem;%SYSTEMROOT%\System32\WindowsPowerShell\v1.0\;C:\Program Files\ATI Technologies\ATI.ACE\Core-Static;C:\Program Files\Microsoft\Web Platform Installer\;C:\Program Files\Microsoft ASP.NET\ASP.NET Web Pages\v1.0\;C:\Program Files\Windows Kits\8.0\Windows Performance Toolkit\;C:\Program Files\Microsoft SQL Server\110\Tools\Binn\;C:\cygwin\bin
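
The lookup that the path variable drives can be sketched in a few lines of bash. This is only an illustration of the idea: find_in_path is a made-up helper, and it splits on ":" because that is the separator bash uses, while the Windows variable above uses ";". For real use, "command -v rsync" in the Cygwin shell or "where rsync" in cmd answers the same question.

```shell
#!/bin/bash
# Hypothetical helper: walk a PATH-style list and report where a program lives.
find_in_path() {
    local name="$1" dir
    local -a dirs
    IFS=':' read -r -a dirs <<< "$PATH"   # split PATH on its separator
    for dir in "${dirs[@]}"; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"    # first match wins, like the real lookup
            return 0
        fi
    done
    echo "$name: not found in any PATH folder" >&2
    return 1
}

find_in_path sh
```

Only the listed folders are checked, which is why a program sitting in C:\cygwin\bin is invisible to cmd until that folder is appended to the variable.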

STEP 2) DECIDE ON THE SOURCE and Make the destination directories.

In my example I'm backing up 2 things, so these will be my sources:
My Documents folder: C:\Users\koss Stuff\Documents
My Evernote Database: C:\Users\koss Stuff\Evernote

Note: in CYGWIN, when a path has a space in it you have to put a backslash in front of the space, like so "\ " but without the quotes. The \ is an escape character meaning "ignore the special meaning of the next character". In Linux a space means "next parameter or argument", and we are trying to say "hey man, this is still part of the directory name". The interesting part is that if you put the whole path in quotes, you don't have to worry about the backslash business. That's why I use quotes in my scripts below. So here are the two ways to get there in CYGWIN. Notice how I have to use the /cygdrive/driveletter/path notation:

cd /cygdrive/c/Users/koss\ Stuff/Documents

or

cd "/cygdrive/c/Users/koss Stuff/Documents"
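
If you want to convince yourself that the two forms are equivalent, here is a small self-contained sketch. It builds a throwaway folder with a space in its name (the folder names are just borrowed from my example) and reaches it both ways.

```shell
#!/bin/bash
# Throwaway folder tree so the demo does not touch real data.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/koss Stuff/Documents"

cd "$demo_root" || exit 1
cd koss\ Stuff/Documents && way1="$PWD"      # backslash escapes the space

cd "$demo_root" || exit 1
cd "koss Stuff/Documents" && way2="$PWD"     # quotes protect the whole path

[ "$way1" = "$way2" ] && echo "both forms land in: $way2"

# Without either, the shell splits the path into two arguments:
set -- koss Stuff/Documents
echo "unquoted form is seen as $# arguments"
```

The last line reports 2 arguments, which is exactly the problem the backslash or the quotes prevent.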

On the destination side of things, on my BIG EXTERNAL DRIVE I make a folder called
G:\Backup\
and in it I make the folders Documents and EvernoteDB, so that overall I get this structure:
G:\Backup\Documents\
G:\Backup\EvernoteDB\

The C: Documents folder will get backed up into Documents on G:, and similarly the C: Evernote folder will get dumped into the EvernoteDB folder on G:.
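
The destination folders can also be made from the Cygwin shell. Below is a minimal sketch: make_backup_dirs and BACKUP_ROOT are names invented for this example, and the demo falls back to a temporary folder so it is safe to run anywhere. On my setup the root would be /cygdrive/g/Backup, and the folder names should match D1 and D2 in backup.sh.

```shell
#!/bin/bash
# Hypothetical helper: create the backup root and one subfolder per backup job.
make_backup_dirs() {
    local root="$1"; shift
    local name
    for name in "$@"; do
        mkdir -p "$root/$name" || return 1   # -p also creates the root and is harmless on reruns
    done
}

# BACKUP_ROOT is an assumption for this sketch; set it to /cygdrive/g/Backup for the real thing.
BACKUP_ROOT="${BACKUP_ROOT:-$(mktemp -d)}"
make_backup_dirs "$BACKUP_ROOT" Documents EvernoteDB
ls "$BACKUP_ROOT"
```

Because of mkdir -p, running it a second time changes nothing and does not fail.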

STEP 3) Next we make the scripts. I did it in three scripts: two Linux (Cygwin) scripts and one Windows bat script. It can be done in fewer, but I'm a .NET class guy, so the more files the happier I is :-)

a) The first script is backup.sh, which houses the RSYNC jobs. The RSYNC jobs make log files that get appended to automatically.

b) The second script is start.sh, which starts backup.sh in the background using the nohup command and an & at the end of the statement. In detail: the & runs the job in the background, and nohup keeps it running even if you close the Cygwin terminal. I point the output of the nohupped backup.sh at a file, so that there is no garbage on the screen. If you do want to view the log, it's as simple as running "tail -f LOG_FILE", where LOG_FILE is replaced by the log file name.

c) The third script is start-from-windows.bat, a windows/dos/cmd/batch script which starts start.sh by pointing at bash.exe and then telling bash which script to launch, which is start.sh. (I know bash is at C:\cygwin\bin\bash.exe since I ran whereis bash in Cygwin and it pointed at /bin/bash.) Even though it points at start.sh, which runs backup.sh under nohup, don't dare close the cmd window that comes up; it will just cancel your RSYNC. If you did do that by accident, no biggy, just restart start-from-windows.bat. That's the beauty of rsync: it picks up where it left off and doesn't mess anything up, since files that were already copied get skipped on the next run.
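
The nohup and & combination that start.sh relies on can be tried out in isolation. This is a minimal sketch with a one-second dummy job standing in for backup.sh; the temporary folder and file names are made up for the demo.

```shell
#!/bin/bash
# Stand-in for backup.sh: any slow command will do for the demo.
workdir="$(mktemp -d)"
log="$workdir/job.log"
err="$workdir/job.err"

# nohup ignores the hangup signal; & returns the prompt right away.
nohup sh -c 'echo "job started"; sleep 1; echo "job finished"' > "$log" 2> "$err" &
job_pid=$!
echo "background job running as PID $job_pid, log at $log"

# A real run would now use: tail -f "$log"
wait "$job_pid"     # only here so the demo can print the finished log
cat "$log"
```

Normal output lands in one file and errors in another, which is the same split start.sh makes between its full log and its error log.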

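Use nano or Notepad++ to write the scripts in the destination root folder. If you use Notepad++, save the .sh files with Unix (LF) line endings, or bash will choke on the Windows carriage returns. The scripts are below, so modify them to your liking.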

So the final destination structure looks like this:
G:\Backup\backup.sh
G:\Backup\start.sh
G:\Backup\start-from-windows.bat
G:\Backup\---logs will go here--- (four logs: 2 from backup.sh and 2 from start.sh)
The log files get auto-made during the runs, so you don't actually have to "premake" them. I'm just pointing them out:
G:\Backup\DocsBackup.log -- this gets auto-appended to by RSYNC
G:\Backup\EverNote.log -- this gets auto-appended to by RSYNC
G:\Backup\Full-Log-Both-Jobs-CURRENT-DATE-AND-TIME.Log
G:\Backup\Full-Error-Log-CURRENT-DATE-AND-TIME.log
Also we have the two empty destination folders:
G:\Backup\EvernoteDB\
G:\Backup\Documents\

And this is how they look in CYGWIN Linux notation:
/cygdrive/g/Backup/backup.sh
/cygdrive/g/Backup/start.sh
/cygdrive/g/Backup/start-from-windows.bat
/cygdrive/g/Backup/EvernoteDB/
/cygdrive/g/Backup/Documents/
And the log files look like this:
/cygdrive/g/Backup/DocsBackup.log
/cygdrive/g/Backup/EverNote.log
/cygdrive/g/Backup/Full-Log-Both-Jobs-$DATE.Log
/cygdrive/g/Backup/Full-Error-Log-$DATE.log


HERE ARE THE SCRIPTS:
The scripts are color coded, from the important stuff, to the optional stuff (which can be modified to fit your setup), to the unimportant stuff, so that they are easier to understand.

koss@god8 /cygdrive/g/Backup

$ nano backup.sh

#!/bin/bash

RSYNC=/bin/rsync

S1="/cygdrive/c/Users/koss Stuff/Documents/"
S2="/cygdrive/c/Users/koss Stuff/Evernote/"

### Note: you don't have to include these
### HASH mark comments, they don't get read
### by the program either way.
### NOTE: IF S1 and S2 DIDN'T HAVE / at the end,
### rsync would have grabbed the Documents and Evernote folders themselves.
### Then we would have had redundant
### child folders like /Documents/Documents/ and /EvernoteDB/Evernote/
### over in the Backup destination folder.
### Note: if I had S1 and S2 end in /* it
### would fail. So you must end them on /

D1="/cygdrive/g/Backup/Documents/"
D2="/cygdrive/g/Backup/EvernoteDB/"

LOG1="/cygdrive/g/Backup/DocsBackup.log"
LOG2="/cygdrive/g/Backup/EverNote.log"


echo "#######################################################"
echo
echo Starting the FIRST RSYNC backup job of Documents
echo
echo "#######################################################"
echo
echo THE FIRST COMMAND TO BACKUP DOCUMENTS:
echo $RSYNC -ahv --stats --progress --log-file="$LOG1" "$S1" "$D1"

$RSYNC -ahv --stats --progress --log-file="$LOG1" "$S1" "$D1"

echo
echo "#######################################################"
echo
echo "#######################################################"
echo
echo "#######################################################"
echo
echo Starting the SECOND RSYNC backup job of Evernote Database
echo
echo "#######################################################"
echo
echo THE SECOND COMMAND TO BACKUP EVERNOTE DB:
echo $RSYNC -ahv --stats --progress --log-file="$LOG2" "$S2" "$D2"
echo


$RSYNC -ahv --stats --progress --log-file="$LOG2" "$S2" "$D2"


echo
echo "#######################################################"
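
The trailing-slash rule from the comments in backup.sh is easy to check on a throwaway folder. This is a minimal sketch using made-up temporary folders; it only runs the rsync part if rsync is actually installed.

```shell
#!/bin/bash
# Throwaway source and destinations; nothing real is touched.
demo="$(mktemp -d)"
mkdir -p "$demo/Documents" "$demo/with_slash" "$demo/no_slash"
echo "hello" > "$demo/Documents/a.txt"

if command -v rsync >/dev/null 2>&1; then
    rsync -a "$demo/Documents/" "$demo/with_slash/"   # copies the CONTENTS of Documents
    rsync -a "$demo/Documents"  "$demo/no_slash/"     # copies the Documents folder ITSELF
    find "$demo/with_slash" "$demo/no_slash" -type f
else
    echo "rsync is not installed here, skipping the demo"
fi
```

With the slash, a.txt lands directly in with_slash. Without it, the file ends up one level deeper in no_slash/Documents, which is the redundant child folder the comments warn about.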



koss@god8 /cygdrive/g/Backup

$ nano start.sh

#!/bin/bash
DT=$(date "+%D-%T")
# Swap the / and : characters for _ since they are not allowed in Windows file names
DATE=$(echo "$DT" | tr '/:' '__')
# THIS WORKS TOO, JUST ANOTHER WAY::: DATE=$(date "+%D-%T" | tr '/:' '__')
echo $DATE
LOG=/cygdrive/g/Backup/Full-Log-Both-Jobs-$DATE.Log
ERR=/cygdrive/g/Backup/Full-Error-Log-$DATE.log
echo "### $DT ###"
echo Starting the backing up of EVERNOTE and DOCUMENTS from C drive to G drive
echo This Log file will be made, appended with this date:
echo $LOG
echo This Error log will be made:
echo $ERR
nohup /cygdrive/g/Backup/backup.sh > "$LOG" 2> "$ERR" &
echo
echo ==WARNING:==
echo IF RUNNING IN WINDOWS THRU start-from-windows.bat THEN DO NOT CLOSE THE cmd WINDOW
echo OR ELSE JOB WONT FINISH. IF YOU DID ON ACCIDENT CLOSE IT, THEN JUST RESTART IT
echo IT WILL PICK UP WHERE IT LEFT OFF
echo
echo ==NOTE:==
echo YOU CAN SEE THE PROGRESS by \"tail -f $LOG\"
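
The date handling at the top of start.sh is the least obvious part, so here is a small sketch of just that step. The sample timestamp is made up; it simply has the same shape that date "+%D-%T" produces. The second stamp format is my own suggestion, not what the script above uses.

```shell
#!/bin/bash
# Sample value in the same shape that date "+%D-%T" produces.
DT="11/25/12-02:30:00"

# / and : are not allowed in Windows file names, so swap both for _
DATE="$(printf '%s' "$DT" | tr '/:' '__')"
echo "$DATE"
echo "Full-Log-Both-Jobs-$DATE.Log"

# An alternative (an assumption, not the script's choice): ask date for a safe, sortable stamp directly.
SORTABLE="$(date "+%Y-%m-%d_%H-%M-%S")"
echo "Full-Log-Both-Jobs-$SORTABLE.Log"
```

The first echo prints 11_25_12-02_30_00. The year-first variant has the side benefit that the log files sort by date in a folder listing.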



koss@god8 /cygdrive/g/Backup

$ nano start-from-windows.bat

C:\cygwin\bin\bash.exe -l -c "/cygdrive/g/Backup/start.sh"



NOTES:

backup.sh IS THE RSYNC SCRIPT. EACH RSYNC JOB WRITES ITS OWN LOG, AND EVERY RUN APPENDS TO THE SAME LOG FILE
start.sh IS THE LINUX START SCRIPT. NEW LOGS ARE MADE EACH TIME WITH THE DATE AND TIME IN THE NAME, AND backup.sh IS nohupped
start-from-windows.bat IS FOR STARTING FROM WINDOWS. A cmd WINDOW COMES UP, DONT CLOSE IT, it will autoclose when the process is done
NOTE: THE D1 and D2 FOLDERS HAD TO ALREADY BE MADE
TO SCHEDULE THE TASK IN WINDOWS THERE ARE TWO WAYS (see STEP 4): I used the WINDOWS TASK SCHEDULER instead of the CYGWIN cron, because cron didn't work for me


STEP 4) Start the tasks in a schedule. 

You have two options to start tasks in Windows on a schedule. I used the second way, without cron:

  1. Install cron as a windows service, using cygrunsrv:

    cygrunsrv -I cron -p /usr/sbin/cron -a -D

    net start cron

    Research cron jobs and write your own entry into a crontab. All you need to do is type "crontab -e" and then type the single-line entry which specifies the schedule and the command to run. You would point the cron job at start.sh

  2. Notice how we had to run the .sh scripts through bash by using a bat script. We will then run the bat script through the WINDOWS TASK SCHEDULER to make the job happen on a schedule. The bat script contains one line:

    C:\cygwin\bin\bash.exe -l -c "/full-path/to/script.sh"

    Note again: I know where bash is, because in the Cygwin terminal you can type "whereis bash" or "which bash" and it will tell you it's in /bin/, thus it's at C:\cygwin\bin\

    Our Windows Task Scheduler will point to the bat script, since Windows better understands bat scripts
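
For option 1, the single-line crontab entry could look like the sketch below. The 2:30 am schedule is only an example value, and since cron did not work for me this line is untested on my setup.

```shell
#!/bin/bash
# Crontab fields: minute hour day-of-month month day-of-week command
# The 2:30 am schedule is only an example value.
CRON_LINE='30 2 * * * /cygdrive/g/Backup/start.sh'

# Print the line so it can be pasted into the editor that "crontab -e" opens.
printf '%s\n' "$CRON_LINE"
```

The five schedule fields read as "minute 30, hour 2, every day of the month, every month, every day of the week", followed by the command to run.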


RUN THE WINDOWS TASK SCHEDULER and set up a task.

a) Click Run

b) Search for Task Scheduler. It should come up; if it doesn't, do a quick Google search for how to make it come up

c) Create a NEW TASK

d) Set the TRIGGER to be a SCHEDULE, like once a night (2:30 am or something) when you're not using the system. Either way, don't worry about performance; only the first copy takes a long time.

e) Set the ACTION to run a PROGRAM/SCRIPT and point it at our start-from-windows.bat script

NOTE ON THE TIME THIS PROCEDURE TAKES:

One example of a person doing it across the internet: 40 gigs took 8 hours for the first run, and after that it only takes a minute every time it runs.

For me, since my rsync job is all on the same computer, it copied 12 gigs in like 10 minutes or so, and then every update takes less than a minute.
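
If you prefer the command line over clicking through the Task Scheduler, Windows also ships a schtasks command that can create the same kind of task. This is a rough sketch that I did not use for my own setup: the task name RsyncBackup is made up, the time matches the 2:30 am example, and the script only prints the command when schtasks is not available.

```shell
#!/bin/bash
# Assumed values for this sketch: the task name is made up, the time matches the 2:30 am example.
TASK_NAME="RsyncBackup"
BAT_PATH='G:\Backup\start-from-windows.bat'
START_TIME="02:30"

# schtasks.exe ships with Windows; /SC DAILY + /ST set the trigger, /TR sets the action.
cmd=(schtasks /Create /SC DAILY /ST "$START_TIME" /TN "$TASK_NAME" /TR "$BAT_PATH")

if command -v schtasks >/dev/null 2>&1; then
    "${cmd[@]}"
else
    # Not on Windows (or schtasks is not on the path): just show the command.
    printf '%s ' "${cmd[@]}"; echo
fi
```

Either way, check the result in the Task Scheduler window afterwards to confirm the trigger and the action are what you expect.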




Update on 11-25-2012:
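This is optional, but I updated backup.sh so that it shows times and gives you an idea of how fast the whole process is. It saves the times into the backup log. I also added --delete to both rsync commands, which removes files from the backup that no longer exist in the source, so leave it out if you want the backup to keep files you have deleted. The other 2 script files, start.sh and start-from-windows.bat, stay the same.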

Here it is (Notice I highlighted the additions in light blue, the main commands that make it work are bold green):

###############################
###############################
backup.sh script changed to give time stats
===============================
###############################
###############################


#!/bin/bash
RSYNC=/bin/rsync
S1="/cygdrive/c/Users/koss Stuff/Documents/"
S2="/cygdrive/c/Users/koss Stuff/Evernote/"

### NOTE: IF S1 and S2 DIDN'T HAVE / at the end, rsync would have grabbed the Documents and Evernote folders themselves.
### Then we would have had redundant child folders like /Documents/Documents/ and /EvernoteDB/Evernote/
D1="/cygdrive/g/Backup/Documents/"
D2="/cygdrive/g/Backup/EvernoteDB/"
LOG1="/cygdrive/g/Backup/DocsBackup.log"
LOG2="/cygdrive/g/Backup/EverNote.log"

echo "#######################################################"
echo
DT1s=`date "+%D-%T"`
echo "$DT1s FIRST JOB START TIME"

echo "Starting the FIRST RSYNC backup job of Documents"
echo
echo "#######################################################"
echo
echo "THE FIRST COMMAND TO BACKUP DOCUMENTS:"
echo $RSYNC -ahv --stats --progress --delete --log-file="$LOG1" "$S1" "$D1"
$RSYNC -ahv --stats --progress --delete --log-file="$LOG1" "$S1" "$D1"
echo
echo "#######################################################"
echo
DT1e=`date "+%D-%T"`
echo "$DT1e FIRST JOB FINISH TIME"

echo
echo "#######################################################"
echo "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"
echo "#######################################################"
echo
DT2s=`date "+%D-%T"`
echo "$DT2s SECOND JOB START TIME"

echo "Starting the SECOND RSYNC backup job of Evernote Database"
echo

echo "#######################################################"
echo
echo "THE SECOND COMMAND TO BACKUP EVERNOTE DB:"
echo $RSYNC -ahv --stats --progress --delete  --log-file="$LOG2" "$S2" "$D2"
echo
$RSYNC -ahv --stats --progress --delete  --log-file="$LOG2" "$S2" "$D2"
echo
echo "#######################################################"
echo
DT2e=`date "+%D-%T"`
echo "$DT2e SECOND JOB FINISH TIME"

echo
echo "#######################################################"
echo
echo "Recap of Times:"
echo
echo "FIRST job Start  : $DT1s"
echo "FIRST job End    : $DT1e"
echo "------------------"
echo "SECOND job Start : $DT2s"
echo "SECOND job End   : $DT2e"

echo
echo "#######################################################"


