Remove all redundant files in a directory
I have directoryA that gets populated as a replica of directoryB, and some files are changed or added. I want to automate the process of deleting all files from directoryA that have redundant copies in directoryB.
Both directories have several layers of sub-directories, so the solution will likely have to be recursive.
My first thought is to write a batch script, but I'm new to the Microsoft command prompt, and it seems to be very different from bash scripting, with which I have some limited experience.
I am using Windows XP, but would like a solution that also works on Windows 7.
In your situation I would take the lazy man's way out, install mingw, and use
find directoryA directoryB -type f -exec md5sum '{}' ';' |
my-bash-script
to find every file in directoryA that has the same MD5 signature as a file in directoryB, then remove it.
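The answer does not show my-bash-script itself, so here is one rough guess at what it could look like, written as a single function instead of a pipeline stage. The function names remove_md5_duplicates and md5_of are made up for this sketch, and it assumes bash, find, md5sum, sort, grep and mktemp are available (as they should be in a mingw/MSYS shell). Note that it matches on content only, so a file in directoryA is deleted if any file anywhere in directoryB has the same MD5, regardless of name or location.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for "my-bash-script" (names are assumptions, not from the answer).
# Deletes every file under dir_a whose MD5 equals the MD5 of some file under dir_b.

# Print only the MD5 hash of one file (reading via stdin avoids filename escaping in the output).
md5_of() {
    md5sum < "$1" | cut -d' ' -f1
}

remove_md5_duplicates() {
    local dir_a=$1 dir_b=$2
    local hashes file sum
    hashes=$(mktemp) || return 1

    # Collect the hash of every regular file in dir_b, one per line.
    find "$dir_b" -type f -print0 |
    while IFS= read -r -d '' file; do
        md5_of "$file"
    done | sort -u > "$hashes"

    # Hash each file in dir_a and delete it if that hash was seen in dir_b.
    find "$dir_a" -type f -print0 |
    while IFS= read -r -d '' file; do
        sum=$(md5_of "$file")
        if grep -qxF "$sum" "$hashes"; then
            echo "delete redundant \"$file\""
            rm -- "$file"
        fi
    done

    rm -f "$hashes"
}
```

You would call it as remove_md5_duplicates directoryA directoryB. It is probably worth replacing the rm line with an echo for a first dry run.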
Or if you prefer a less lazy solution but one that does not require mingw, install Lua and the Lua POSIX library (which I think can be installed on Windows). You can google for the MD5 library and do the entire operation in Lua, and it will be portable. And unlike the mingw solution, it will be easy to deploy to anybody's Windows box; you can make a standalone binary.
If you want a solution that does not require third-party software to be installed, use the script below. It only uses built-in command-line tools.
The script first checks for some common error conditions. Then it iterates recursively through all the files in the cleanup directory. If it finds a file with the same relative path in the backup directory, it does a binary comparison to determine whether the file is redundant.
@echo off
rem delete files from a directory that have a redundant copy in a backup directory
setlocal enabledelayedexpansion

rem check arguments
if "%~2"=="" (
    echo.Usage: %~n0 cleanup_dir backup_dir
    echo.Delete files from cleanup_dir that have a redundant copy in backup_dir
    exit /b 1
)
set CLEANUP_DIR=%~f1
if not exist "%CLEANUP_DIR%" (
    echo."%CLEANUP_DIR%" does not exist.
    exit /b 1
)
set BACKUP_DIR=%~f2
if not exist "%BACKUP_DIR%" (
    echo."%BACKUP_DIR%" does not exist.
    exit /b 1
)

rem ensure that dirs are different
if "%CLEANUP_DIR%" == "%BACKUP_DIR%" (
    echo.backup directory must not be the same as cleanup directory.
    exit /b 1
)

rem ensure that backup_dir is not a sub dir of cleanup_dir
if not "!BACKUP_DIR:%CLEANUP_DIR%=!" == "%BACKUP_DIR%" (
    echo.backup directory must not be a sub directory of cleanup directory.
    exit /b 1
)

rem iterate recursively thru files in cleanup_dir
for /R "%CLEANUP_DIR%" %%F in (*) do (
    set FILE_PATH=%%F
    set BACKUP_FILE_PATH=!FILE_PATH:%CLEANUP_DIR%=%BACKUP_DIR%!
    if exist "!BACKUP_FILE_PATH!" (
        rem binary compare file to file in backup dir
        fc /B "!FILE_PATH!" "!BACKUP_FILE_PATH!" >NUL 2>&1
        if not errorlevel 1 (
            rem if files are identical delete file from cleanup_dir
            echo.delete redundant "!FILE_PATH!".
            del "!FILE_PATH!"
        ) else (
            echo.keep modified "!FILE_PATH!".
        )
    ) else (
        echo.keep added "!FILE_PATH!".
    )
)
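Since the question mentions some bash experience, the same logic (match by relative path, then compare byte for byte) can be sketched in bash for comparison. This is not part of the batch script above. The function name remove_redundant is made up, and it assumes find and cmp are available, for example in a mingw/MSYS shell. It only checks that the two directories are not the same one; the sub-directory check from the batch script is left out.

```shell
#!/usr/bin/env bash
# Rough bash counterpart of the batch script above (the function name is an assumption).
# Deletes files from cleanup_dir that are byte-identical to the file
# at the same relative path in backup_dir.

remove_redundant() {
    # Strip one trailing slash so relative paths can be derived cleanly.
    local cleanup_dir=${1%/} backup_dir=${2%/}
    local file rel

    if [ ! -d "$cleanup_dir" ] || [ ! -d "$backup_dir" ]; then
        echo "both arguments must be existing directories" >&2
        return 1
    fi
    # -ef is true when both paths refer to the same directory.
    if [ "$cleanup_dir" -ef "$backup_dir" ]; then
        echo "backup directory must not be the same as cleanup directory" >&2
        return 1
    fi

    find "$cleanup_dir" -type f -print0 |
    while IFS= read -r -d '' file; do
        rel=${file#"$cleanup_dir"/}          # path relative to cleanup_dir
        if [ ! -f "$backup_dir/$rel" ]; then
            echo "keep added \"$file\""
        elif cmp -s "$file" "$backup_dir/$rel"; then
            echo "delete redundant \"$file\""
            rm -- "$file"
        else
            echo "keep modified \"$file\""
        fi
    done
}
```

Called as remove_redundant directoryA directoryB, it prints the same three kinds of messages as the batch version.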
I give Windows a wide berth, but you'll likely find the powerful scripting capabilities you're looking for in Windows PowerShell (see also Microsoft's documentation).
PowerShell takes an object-oriented approach to entities in the file system and elsewhere. It should be easy to whip up a script to do what you need, but you'd need to learn PowerShell first, of course.
EDIT: Microsoft is offering a download for PowerShell for Windows XP and a few others, but I don't see one for Windows 7. Ah... Wikipedia says it's already integrated in Windows 7. So that should cover your requirements, and it's already on-board with the newest versions of Windows.