
How to optimize a bash script? (Find files, ignore those on whitelist, report rest)

I wrote this script to find all files and directories to which $WWWUSER has write permission. At first I stored the remaining, matching items in a temporary file. I knew there must be a way to do it without temporary files, so this is my "solution". It works, but it's pretty slow. Any tips?

Update: On a directory structure containing about 7k directories and 30k files (~8k whitelistings) the script takes about 15 minutes... (ext3 filesystem, UW320 SCSI harddisk).

#!/usr/bin/env bash
# Checks the webroot for files owned by www daemon and
# writable at the same time. This is only needed by some files
# So we'll check with a whitelist

WWWROOT=/var/www
WWWUSER=www-data
WHITELIST=(/wp-content/uploads
/wp-content/cache
/sitemap.xml
)
OLDIFS=$IFS
IFS=$'\n'

LIST=($(find $WWWROOT -perm /u+w -user $WWWUSER -o -perm /g+w -group $WWWUSER))
IFS=$OLDIFS

arraycount=-1
whitelist_matches=0

for matchedentry in "${LIST[@]}"; do
        arraycount=$(($arraycount+1))

        for whitelistedentry in "${WHITELIST[@]}"; do
                if [ $(echo $matchedentry | grep -c "$whitelistedentry") -gt 0 ]; then
                        unset LIST[$arraycount]
                        whitelist_matches=$(($whitelist_matches+1))
                fi
        done
LISTCOUNT=${#LIST[@]}
done

if [ $(echo $LISTCOUNT) -gt 0 ]; then
        for item in "${LIST[@]}"; do
                echo -e "$item\r"
        done
        echo "$LISTCOUNT items are writable by '$WWWUSER' ($whitelist_matches whitelisted)."
else
        echo "No writable items found ($whitelist_matches whitelisted)."
fi


(I don't have a setup handy to test this on, but it should work...)

#!/usr/bin/env bash
# Checks the webroot for files owned by www daemon and
# writable at the same time. This is only needed by some files
# So we'll check with a whitelist

WWWROOT=/var/www
WWWUSER=www-data
WHITELIST="(/wp-content/uploads|/wp-content/cache|/sitemap.xml)"

listcount=0
whitelist_matches=0

while IFS="" read -r matchedentry; do
    if [[ "$matchedentry" =~ $WHITELIST ]]; then
        ((whitelist_matches++))
    else
        echo -e "$matchedentry\r"
        ((listcount++))
    fi
done < <(find "$WWWROOT" -perm /u+w -user $WWWUSER -o -perm /g+w -group $WWWUSER)

if (( $listcount > 0 )); then
        echo "$listcount items are writable by '$WWWUSER' ($whitelist_matches whitelisted)."
else
        echo "No writable items found ($whitelist_matches whitelisted)."
fi
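
A side note on why the loop above is fed with "done < <(find ...)" rather than "find ... | while ...": in bash, with default settings, each part of a pipeline runs in a subshell, so counters changed inside a piped loop are lost when the loop ends. The following is only a small sketch with made-up input (the letters a, b, c stand in for find output) to show the difference:

```shell
#!/usr/bin/env bash
# Made-up input: three lines standing in for the output of find.

# Variant 1: loop on the right side of a pipe (runs in a subshell by default).
count_pipe=0
printf '%s\n' a b c | while IFS= read -r line; do
    count_pipe=$((count_pipe + 1))
done

# Variant 2: loop fed by process substitution (runs in the current shell).
count_procsub=0
while IFS= read -r line; do
    count_procsub=$((count_procsub + 1))
done < <(printf '%s\n' a b c)

# With default bash settings (lastpipe off) only the second counter survives.
echo "pipe: $count_pipe, process substitution: $count_procsub"
```

That is why listcount and whitelist_matches are still available to the final if/else in the script.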

Edit: I've incorporated Dennis Williamson's suggestions on the math; also, here's a way to build the WHITELIST pattern starting from an array:

WHITELIST_ARRAY=(/wp-content/uploads
/wp-content/cache
/sitemap.xml
)

WHITELIST=""
for entry in "${WHITELIST_ARRAY[@]}"; do
    WHITELIST+="|$entry"
done
WHITELIST="(${WHITELIST#|})"  # this removes the stray "|" from the front, and adds parens
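
If you'd rather not loop, the same pattern can probably be built in one step by letting bash join the array for you. This is just a sketch of an alternative, not something from the original script; it relies on "${array[*]}" joining elements with the first character of IFS:

```shell
#!/usr/bin/env bash
WHITELIST_ARRAY=(/wp-content/uploads
/wp-content/cache
/sitemap.xml
)

# "${WHITELIST_ARRAY[*]}" joins the elements using the first character of IFS.
# Setting IFS inside the command substitution keeps the change from leaking
# into the rest of the script.
WHITELIST="($(IFS='|'; printf '%s' "${WHITELIST_ARRAY[*]}"))"

printf '%s\n' "$WHITELIST"
```

This should print (/wp-content/uploads|/wp-content/cache|/sitemap.xml), the same string the loop produces.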

Edit2: Sorpigal's comment about eliminating new processes got me thinking. I suspect most of the speedup in this version comes from not running ~40 invocations of grep per scanned file, and only a little from removing the array manipulation. It also occurred to me that if you don't need the totals at the end, you could remove the main while loop and replace it with this:

find "$WWWROOT" -perm /u+w -user "$WWWUSER" -o -perm /g+w -group "$WWWUSER" | grep -Ev "$WHITELIST"

...which does run grep, but only once (and runs the entire file list through that single instance), and once it's started grep'll be able to scan the list of files faster than a bash loop...
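
To try the single-grep filter without touching a real webroot, here is a small sketch that runs it over made-up paths. The function names filter_whitelisted and sample_paths are hypothetical helpers, and the sample paths are invented. Note that the pattern uses "(", "|" and ")" as regex operators, so grep needs -E (extended regular expressions) for it to work:

```shell
#!/usr/bin/env bash
WHITELIST="(/wp-content/uploads|/wp-content/cache|/sitemap.xml)"

# Hypothetical helper: reads paths on stdin and prints only those that do
# NOT match the whitelist. -E enables extended regex, -v inverts the match.
filter_whitelisted() {
    grep -Ev "$WHITELIST"
}

# Hypothetical helper: invented paths standing in for the output of find.
sample_paths() {
    printf '%s\n' \
        /var/www/index.php \
        /var/www/wp-content/uploads/2010/photo.jpg \
        /var/www/sitemap.xml \
        /var/www/wp-config.php
}

sample_paths | filter_whitelisted
```

Only /var/www/index.php and /var/www/wp-config.php should come out the other end. In the real script you would pipe find into the filter instead of sample_paths.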


There is another possibility. If you change the whitelist to a regex pattern, you can use bash's =~ regex operator (version 3 and up) to test each found word against the whole list in a single match: if [[ $word =~ $pattern ]]; then ...; fi, where $pattern could be "^(whitelistentry1|whitelistentry2|whitelistentry3|...)$". Note that with the ^ and $ anchors a word only matches if it is exactly one of the entries.
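
A minimal sketch of that idea, assuming the whitelist entries from the question and a hypothetical helper named is_whitelisted. The test words are made up. Because the pattern is anchored at both ends, a file below a whitelisted directory does not match; drop the trailing $ if you want prefix matching instead:

```shell
#!/usr/bin/env bash
# Anchored pattern: only exact whitelist entries match.
# The dot in sitemap.xml is escaped so it matches a literal ".".
pattern='^(/wp-content/uploads|/wp-content/cache|/sitemap\.xml)$'

# Hypothetical helper: succeeds if its argument matches the whitelist pattern.
# $pattern must stay unquoted on the right of =~ to be treated as a regex.
is_whitelisted() {
    [[ $1 =~ $pattern ]]
}

for word in /sitemap.xml /wp-content/uploads /wp-content/uploads/a.jpg /index.php; do
    if is_whitelisted "$word"; then
        echo "whitelisted: $word"
    else
        echo "reported:    $word"
    fi
done
```

Here /sitemap.xml and /wp-content/uploads are whitelisted, while /wp-content/uploads/a.jpg and /index.php are reported.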
