Get filename from Unix "ls -la" command with regexp?

How can I write a regular expression pattern that returns the filename from any one of these lines? (I will search one line at a time.)

drwxrwxrwx  4 apache      apache       4096 Oct 14 09:40 .
drwxrwxrwx 11 apache      apache       4096 Oct 13 11:33 ..
-rwxrwxrwx  1 apache      apache      16507 Oct 17 10:16 .bash_history
-rwxrwxrwx  1 apache      apache         33 Sep  1 09:36 .bash_logout
-rwxrwxrwx  1 apache      apache        176 Sep  1 09:36 .bash_profile
-rwxrwxrwx  1 apache      apache        124 Sep  1 09:36 .bashrc
-rwxrwxrwx  1 apache      apache        515 Sep  1 09:36 .emacs
-rw-------  1 christoffer christoffer 11993 Sep 18 10:00 .mysql_history
drwxrwxrwx  3 apache      apache       4096 Sep  1 09:48 .subversion
-rwxrwxrwx  1 christoffer christoffer  9204 Oct 14 09:40 .viminfo
drwxrwxrwx 14 apache      apache       4096 Oct 12 07:39 www

The search is done using PHP, but I guess that doesn't really make a difference. :)

EDIT: The file listing is retrieved over an SSH connection, which is why I don't use a built-in PHP function. I need the full listing to see whether or not a file is actually a directory.


Try ls -a1F instead. That lists all entries (-a), one per line (-1), with an indicator of the file type appended to the name (-F).

You will then probably get something like this for your directory:

./
../
.bash_history
.bash_logout
.bash_profile
.bashrc
.emacs
.mysql_history
.subversion/
.viminfo
www/

Directories are marked with a trailing slash (/). Note that -F also appends * to executable files and @ to symbolic links.
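
As a rough sketch of how that output could be consumed, the snippet below keeps only the lines ending in a slash and strips the marker. The listing variable is a shortened copy of the sample output above, and list_dirs is a made-up helper name, not part of any tool.

```shell
# Shortened sample of `ls -a1F` output, copied from the listing above.
listing='./
../
.bash_history
.subversion/
www/'

# Hypothetical helper: print directory names only.
# sed keeps lines ending in "/" and strips the slash;
# grep then drops the "." and ".." entries.
list_dirs() {
    printf '%s\n' "$1" | sed -n 's|/$||p' | grep -v -x -e '\.' -e '\.\.'
}

dirs=$(list_dirs "$listing")
printf '%s\n' "$dirs"
```

Entries that carry other -F markers (such as * or @) simply fail the trailing-slash test, so they are treated as non-directories here.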


The main question is... Why? Use readdir and stat instead.

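<?php

$directory = './';
$dh = opendir($directory);

// readdir() returns false once every entry has been read
while (($file = readdir($dh)) !== false)
{
    // stat() reports size, mode, timestamps, etc. for each entry
    $stat = stat($directory.$file);
    echo '<b>'.$directory.$file.':</b><br/>';
    var_dump($stat);
}

closedir($dh);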


If you are looking for directories, rather than parsing ls output, just use find.

find . -maxdepth 1 -mindepth 1 -type d

This will list the directories like this:

./Documents
./.gnupg
./Download

You no longer have to parse the data to determine what is a directory and what isn't.

If you want the files rather than the directories, use -type f instead.

Your parsing of the ls output may very well break on symlinks...
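
To make the symlink point concrete, here is a small sketch that builds a throwaway directory and asks find for directories and for links separately. The directory layout (www, current, .bashrc) is invented for the demonstration, and -mindepth/-maxdepth are assumed to be available, as they are in GNU and BSD find.

```shell
# Scratch directory with one real directory, one symlink to it, and one file.
demo=$(mktemp -d)
mkdir "$demo/www"
ln -s www "$demo/current"
touch "$demo/.bashrc"

# -type d matches real directories only. By default find does not follow
# symlinks, so "current" is reported under -type l, not -type d.
dirs=$(cd "$demo" && find . -mindepth 1 -maxdepth 1 -type d)
links=$(cd "$demo" && find . -mindepth 1 -maxdepth 1 -type l)

printf 'directories: %s\n' "$dirs"
printf 'symlinks:    %s\n' "$links"
rm -rf "$demo"
```

In an ls -l listing the same link would appear as "current -> www", which is exactly the kind of line a last-field regex gets wrong.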


I wouldn't use regex

Given a line, you could explode and pop the last element from the array

if (preg_match('/^d/', $line)) {
    // array_pop() takes its argument by reference, so it needs a real variable
    $parts = explode(' ', $line);
    $name = array_pop($parts);
}

EDIT: none of your examples have embedded spaces but a later comment suggests that it IS important to find filenames


Adding to what Matthew said, there are plenty of reasons not to parse ls output. File names may contain spaces, or even delete characters. The date format is not uniform (older files show a year instead of a time), and very large file sizes can break the column layout.

If you must use regex, and you really have no spaces in file names, then just anchor to the end of the line and capture the non-space characters you find there:

(\S+)$
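
A quick way to try this pattern from a shell is sketched below. It uses the POSIX class [^[:space:]] in place of \S so it does not depend on GNU extensions, and it assumes a grep that supports -o (GNU and BSD grep both do). The helper name last_token and the "my file.txt" line are made up for illustration.

```shell
# Hypothetical helper: print the last run of non-whitespace characters
# on a line, the POSIX-class equivalent of (\S+)$.
last_token() {
    printf '%s\n' "$1" | grep -oE '[^[:space:]]+$'
}

# A line from the question, and an invented line whose name contains a space.
plain=$(last_token '-rwxrwxrwx  1 apache      apache        124 Sep  1 09:36 .bashrc')
spaced=$(last_token '-rw-r--r--  1 apache      apache         10 Oct 14 09:40 my file.txt')

printf '%s\n' "$plain"
printf '%s\n' "$spaced"
```

The second call returns only file.txt, which shows why the pattern is safe only when names contain no spaces.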


There's a nicer way to do this in PHP 5, using the SPL and DirectoryIterator:

$dir = '.';
foreach (new DirectoryIterator($dir) as $fileInfo) {
    echo $fileInfo->getFilename() . "<br>\n";
}


Given your constraint of using the full directory listing, I would do it this way:

ls -l | egrep '^d' | awk '{print $NF}'

The egrep command keeps only the lines that start with the letter "d". Awk uses whitespace as the field separator by default, and $NF gives you the last field. The only edge case I can think of where this breaks is a file name with spaces in it.
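
The same pipeline can be tried against a saved listing instead of a live ls call, which is closer to the situation where the text arrives over SSH. This is only a sketch: the listing variable holds a few lines copied from the question, and grep -E is used as the current spelling of egrep.

```shell
# A few lines from the listing in the question, stored as text.
listing='drwxrwxrwx  4 apache      apache       4096 Oct 14 09:40 .
drwxrwxrwx 11 apache      apache       4096 Oct 13 11:33 ..
-rwxrwxrwx  1 apache      apache        124 Sep  1 09:36 .bashrc
drwxrwxrwx  3 apache      apache       4096 Sep  1 09:48 .subversion
drwxrwxrwx 14 apache      apache       4096 Oct 12 07:39 www'

# Keep lines whose type flag is "d", then print the last field of each.
dirs=$(printf '%s\n' "$listing" | grep -E '^d' | awk '{ print $NF }')
printf '%s\n' "$dirs"
```

Note that "." and ".." are directories too, so they come through unless they are filtered out separately.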

I would suggest using the find command:

find . -maxdepth 1 -type d | awk -F '/' '{print $NF}'

The find command above returns only the directories directly inside your current directory, plus . itself (because of the -maxdepth 1 argument). The awk command splits each line on '/' and prints only the last token ($NF).

Because the awk command

awk -F '/' '{print $NF}'

gets you the last element, you can essentially use:

find . -maxdepth x -type d

where x is any number >= 1, and you'll still get what you want: the file name and/or the directory name.
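
Here is a small sketch of that combination with a depth of 2, run against a throwaway directory tree. The directory names (www, www/css, logs) are invented, and the output is sorted only to make it stable.

```shell
# Scratch tree: two top-level directories, one of them with a child.
demo=$(mktemp -d)
mkdir -p "$demo/www/css" "$demo/logs"

# find prints paths such as ./www/css; awk keeps the part after the last "/".
names=$(cd "$demo" && find . -maxdepth 2 -type d | awk -F '/' '{ print $NF }' | LC_ALL=C sort)
printf '%s\n' "$names"
rm -rf "$demo"
```

Two things are worth noticing in the result: the starting directory shows up as ".", and css appears without its parent, so names from different levels can no longer be told apart.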


\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+(\S+)

Each line consists of 9 fields separated by whitespace. You are looking for the 9th field.
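
The field-counting idea can be sketched in a shell with sed. This variant is a deliberate modification, not the pattern above: it is anchored to the start of the line and captures everything after the eighth field, so names with spaces survive. It assumes a sed that supports -E (GNU and BSD sed do); extract_name and the "my file.txt" line are made up for illustration.

```shell
# Hypothetical helper: skip eight whitespace-separated fields, print the rest.
# Lines with fewer than nine fields (such as "total 48") print nothing.
extract_name() {
    printf '%s\n' "$1" | sed -E -n 's/^([^[:space:]]+[[:space:]]+){8}(.*)$/\2/p'
}

dir_name=$(extract_name 'drwxrwxrwx 14 apache      apache       4096 Oct 12 07:39 www')
spaced=$(extract_name '-rw-r--r--  1 apache      apache         10 Oct 14 09:40 my file.txt')
header=$(extract_name 'total 48')

printf '%s\n' "$dir_name"
printf '%s\n' "$spaced"
```

For a symlink line the captured text would include the "-> target" part, so that case still needs separate handling.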


Use glob('*') instead?


Instead of trying to parse difficult output, how about generating some more helpful output in the first place. For example:

ssh user@machine 'cd /etc; for a in *; do [ -f "$a" ] && echo "$a"; done'

will generate a list of non-directory files in /etc on the remote machine. This should be much easier for you to parse.
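
Since the goal is to tell directories from files, the same loop could emit a type tag in front of each name. The sketch below runs locally on a throwaway directory so it can be tried without a remote host; classify is a made-up helper name and the one-letter tab-separated tag format is just one possible choice.

```shell
# Scratch directory so the sketch can run anywhere.
demo=$(mktemp -d)
mkdir "$demo/www" "$demo/.subversion"
touch "$demo/.bashrc" "$demo/my file.txt"

# Hypothetical helper: print "d<TAB>name" for directories and "f<TAB>name"
# for everything else. The three glob patterns cover normal names and
# dotfiles while skipping "." and "..".
classify() (
    cd "$1" || exit 1
    for entry in * .[!.]* ..?*; do
        # An unmatched pattern stays literal, so skip names that do not exist.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        if [ -d "$entry" ]; then
            printf 'd\t%s\n' "$entry"
        else
            printf 'f\t%s\n' "$entry"
        fi
    done
)

result=$(classify "$demo" | LC_ALL=C sort)
printf '%s\n' "$result"
rm -rf "$demo"
```

On the receiving side each line can be split at the first tab, which keeps names with spaces intact. The -d test follows symlinks, so a link to a directory is tagged as a directory.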


This displays hidden files too; try it if you don't believe me.

 glob('{,.}*', GLOB_BRACE);