Backup Windows Directory

Here at Breaking Par, we make daily backups of all our customers' HTML files to a backup machine. The backup was done through a Windows batch file that used the XCOPY command to copy every file that had been modified. We did this for a long time without much thought. Recently, we took a look at the directories and noticed there were extra files in the backup directory that didn't exist in the production directory. It turned out that those files had been deleted from the production directory but never deleted from the backup. After a little research, we figured out how to get a true backup using only Windows batch files.

All these statements go into a Windows batch file (extension .bat) that is scheduled through the Windows scheduler (you could also use the Windows "at" command to schedule the task, but the scheduler provides a graphical interface to scheduled tasks). The batch file is scheduled on the backup server.

The first statement connects to the production server by mapping a network drive.

net use p: \\production_server\data_share_name my_password /USER:Administrator /PERSISTENT:No

p: indicates the drive letter that will be used on the backup server (where the batch file will be running).
production_server is the host name (an IP address can be used) of the production server.
data_share_name is the share name of the data directory (where the customer HTML files are stored) on the production server. This is the name given to the share - you shouldn't use the default shares (C$, etc.) that come with Windows. Instead, define your own share and specify the users that can access that share.
my_password is the password needed for the user to log in. VERY IMPORTANT - the password is listed in the batch file. This may or may not be a security issue for your environment.
/USER:Administrator is the user name that will be accessing the share. This should be an ID that has read access to the HTML directory on the production server. There is no need for the name to actually be "Administrator" - in fact, I'd say that you should NOT use the administrator user name for this process (especially since the password is in plain view to anyone looking at the batch file). If you use an ID that has read-only access to the HTML directory, then there isn't as much of an issue.
/PERSISTENT:No indicates that the drive should not be reconnected automatically on login. The drive is only needed for the duration of the batch file.
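One thing worth guarding against: if the drive mapping fails, every later "if NOT EXIST p:\..." test will be true, and the cleanup statements further down would delete the entire backup. Here is a minimal sketch of a check that could follow the net use statement. It assumes that net use reports failure through a non-zero errorlevel. The user name backup_reader, the log file e:\backup_log.txt and the labels are made-up names for illustration, not part of our original batch file.

```batch
net use p: \\production_server\data_share_name my_password /USER:backup_reader /PERSISTENT:No
rem Stop if the mapping failed (assumes net use sets a non-zero errorlevel)
if errorlevel 1 goto :nomap
rem Also stop if the html directory is not visible on the mapped drive
if not exist p:\html\ goto :nomap
goto :mapped

:nomap
rem Hypothetical log file; any writable location would do
echo %date% %time% could not reach p:\html, backup skipped >> e:\backup_log.txt
exit /b 1

:mapped
rem ... the XCOPY and cleanup statements would follow here ...
```

The second test is there because a mapped drive is not much use if the html directory itself is missing or unreadable.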

After the drive has been mapped, the next step is to copy all the files from the production HTML directory to the backup HTML directory.

XCOPY p:\html\*.* e:\html\*.* /E /D /C /Q /Y

p:\html\*.* is the source directory (the directory on the mapped drive). We will be copying all files.
e:\html\*.* is the destination directory.
/E indicates that we will be copying directories and subdirectories, including empty subdirectories.
/D copies only files changed on or after a given date. Since no date is specified, it copies only files whose source is newer than the copy in the destination, along with files that don't exist in the destination yet. This is exactly what we want for a backup - only copy the files that have been added or modified on the production server.
/C says to continue copying even if errors occur.
/Q indicates "quiet" mode - the file names will not be shown during copying. Since the batch file is running as a Windows scheduled task, there is no need to show the file names.
/Y tells the copy to not prompt when files are being overwritten.
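Before scheduling the copy, it can help to see what XCOPY would do. The /L switch makes XCOPY list the files it would copy without actually copying them, so a dry run from the backup server might look like this (same paths as above, with /Q dropped so the file names are shown):

```batch
rem Dry run: list the files that would be copied, but copy nothing
XCOPY p:\html\*.* e:\html\*.* /E /D /C /Y /L
```

If the list looks right, put /Q back and remove /L for the real run.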

Now comes the tricky part. We need to delete the files and directories from the backup server that are no longer on the production server. This is done in two phases - the files first and the directories second. Each phase has three steps:

dir e:\html\*.* /A:-D /B /O:N /S >> e:\filelist.txt

This is the first step. Make a directory listing of all the files in the backup HTML directory.
/A:-D says that directories are NOT to be included. So we are only including files.
/B lists only the file names (instead of a regular directory listing that shows file sizes and lots of other stuff).
/O:N sorts the file names in alphabetical order.
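/S traverses subdirectories.
>> e:\filelist.txt redirects the output to a file called e:\filelist.txt. Note that >> appends to the file if it already exists; the file is deleted at the end of each phase, so it normally starts out empty. A single > would instead overwrite any file left behind by an interrupted run.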

Here is the second step of the first phase:

for /F "tokens=2,* delims=\" %%e in (e:\filelist.txt) do if NOT EXIST "p:\html\%%f" del "e:\html\%%f"

This statement goes through all the entries listed in the file just created. For each entry, it checks whether the file exists on the production server. If it does not, the file is deleted from the backup server.
for /F is a special type of looping statement in batch programming. It goes through all the lines in a text file (the file name in parentheses).
tokens=2,* specifies which tokens (a "token" is explained in the next item) from each line will be read. 2 says to read the 2nd token, and * says to take everything after the 2nd token. Those 2 values (the 2nd token and everything after the 2nd token) are placed into variables.
delims=\ specifies the delimiter for defining the tokens. The entries from the directory listing will be something like e:\html\file1.htm or e:\html\subdir1\file2.htm. The "\" character is used to split the string up into tokens. So the 1st token will be e:, the 2nd token will be html, and everything after the 2nd token will be file1.htm in the first example and subdir1\file2.htm in the second example. Notice how in the 2nd example the "\", which previously was a delimiter, is now part of the string. That's because the * was used in the tokens option.
%%e specifies the variable name. Two % signs are used because it's inside a batch file. If you were testing this out in a command prompt window, you would use only one % sign. The first % sign is an "escape" character (just like "\" in Notes formula language). This letter names the first variable (the first part of "tokens=") and every other variable is named sequentially after it. So the 2nd token goes into the "%%e" variable and everything after the 2nd token goes into a variable called "%%f".
in (e:\filelist.txt) do are required parts of the for /F statement. The file name goes inside the parentheses.
if NOT EXIST checks, for each line of the text file, whether a file does not exist.
"p:\html\%%f" is the file we are looking for. Note how we're using the "%%f" variable explained above, again with "%%" because we are inside the batch file. The quotes keep the test working when a file or directory name contains spaces. So in the first example above, we will be checking for p:\html\file1.htm and in the second example, we will be checking for p:\html\subdir1\file2.htm. This is where the advantage of using the "*" in the tokens option really shows up. No matter how many subdirectories deep the file is located, all the backslashes will be included. If we specified exact tokens, it would get a lot more complicated with nested subdirectories.
del "e:\html\%%f" deletes the file from the backup server if it doesn't exist on the production server. Again, "%%f" is the variable holding the rest of the path, and the quotes protect names with spaces.
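To see the token splitting in isolation, here is a small test batch file. It is only a sketch for experimenting: instead of reading e:\filelist.txt, it hands for /F a single sample path in double quotes, which for /F treats as a literal string rather than a file name.

```batch
@echo off
rem Split one sample path the same way the cleanup statement does
for /F "tokens=2,* delims=\" %%e in ("e:\html\subdir1\file2.htm") do (
    echo 2nd token:  %%e
    echo remainder:  %%f
)
```

Running it should print html as the 2nd token and subdir1\file2.htm as the remainder, which is the part that gets appended to p:\html\ and e:\html\.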

For the final step of the first phase, we delete the temporary file now that it has been processed:

del e:\filelist.txt

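The next phase takes the same three steps but applies them to subdirectories. I'll just list the three statements; the parts that differ from the first phase are the /A:D switch and the rmdir command:

dir e:\html\*.* /A:D /B /O:N /S >> e:\filelist.txt
for /F "tokens=2,* delims=\" %%e in (e:\filelist.txt) do if NOT EXIST "p:\html\%%f" rmdir "e:\html\%%f"
del e:\filelist.txt

This time through, we make a text file listing of only subdirectories (/A:D). If a directory does not exist on the production server, it is removed from the backup server. Keep in mind that rmdir without any switches only removes empty directories. The files were already deleted in the first phase, but because the listing shows a parent directory before its subdirectories, a deleted directory that contained subdirectories can take more than one run to disappear completely. Adding /S /Q to rmdir would remove the whole tree in one pass, at the cost of deleting whatever is inside it without asking.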

Finally, after the files have been copied and the deleted files removed from the backup server, the last statement of the batch file will disconnect the mapped network drive:

net use p: /DELETE

The same drive letter that was specified earlier must be used, and the /DELETE switch says to disconnect the drive.

So that's our batch file. It gives the backup server a duplicate of the production HTML directory for disaster recovery purposes, refreshed daily (or more often, depending on how often you schedule the batch file to run).
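For reference, here is a sketch of how the statements fit together in one file. The server, share, password and user are the same placeholders used above. A few things in it are additions for illustration rather than part of the statements as listed: the @echo off line and the rem comments, the errorlevel check after net use (which assumes net use reports failure through a non-zero errorlevel), a single > when creating the file list, and the quotes around the paths.

```batch
@echo off
rem Sketch of the complete backup batch file; run it on the backup server.

rem 1. Map the production share for the duration of this run
net use p: \\production_server\data_share_name my_password /USER:Administrator /PERSISTENT:No
rem Assumption: net use sets a non-zero errorlevel when the mapping fails.
rem Without the drive, the cleanup below would delete the whole backup.
if errorlevel 1 exit /b 1

rem 2. Copy new and modified files to the backup directory
XCOPY p:\html\*.* e:\html\*.* /E /D /C /Q /Y

rem 3. Phase one: delete backup files that no longer exist in production
dir e:\html\*.* /A:-D /B /O:N /S > e:\filelist.txt
for /F "tokens=2,* delims=\" %%e in (e:\filelist.txt) do if NOT EXIST "p:\html\%%f" del "e:\html\%%f"
del e:\filelist.txt

rem 4. Phase two: remove backup directories that no longer exist in production
dir e:\html\*.* /A:D /B /O:N /S > e:\filelist.txt
for /F "tokens=2,* delims=\" %%e in (e:\filelist.txt) do if NOT EXIST "p:\html\%%f" rmdir "e:\html\%%f"
del e:\filelist.txt

rem 5. Disconnect the mapped drive
net use p: /DELETE
```

The file list is written to e:\ rather than e:\html so that it never shows up in its own directory listing.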

Breaking Par Consulting
