Regex Verification of Line in /etc/passwd
I need to verify that an /etc/passwd file is valid, and thought a regex would be a good way to check the lines that are not comments. How would I verify a line like:
root:*:0:0:System Administrator:/var/root:/bin/sh
After some research: the 5th field (System Administrator) can contain other data such as an email and address, the second field can contain anything except a colon (:), and the last two fields are full paths.
Any clues how I would create a regular expression for this?
Without wishing to be facetious, Passwd::Unix is probably your best bet.
Do you need to use Perl? The normal way to inspect the password file is using awk as a database query language. For example:
awk -F: '$3 ~ /pattern/'
Of course, you could use perl -lane instead (add -F: so that the autosplit breaks on colons rather than whitespace). But if you’re using Perl, you should probably be using the standard User::pwent module.
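As a rough illustration of the User::pwent route, here is a minimal sketch. Note the assumptions: it walks the system account database through getpwent() rather than parsing an arbitrary file, and the helper name suspicious_accounts and its "absolute path" rule are made up for this example, not part of the module.

```perl
use 5.010;
use strict;
use warnings;
use User::pwent;    # core module; getpwent() now returns objects

# Hypothetical helper (not part of User::pwent): list accounts whose
# home directory or shell is set but is not an absolute path.
sub suspicious_accounts {
    my @found;
    while (my $pw = getpwent()) {
        my $dir   = $pw->dir   // '';
        my $shell = $pw->shell // '';
        my @problems;
        push @problems, "home '$dir'"    if length($dir)   && $dir   !~ m{^/};
        push @problems, "shell '$shell'" if length($shell) && $shell !~ m{^/};
        push @found, $pw->name . ': ' . join(', ', @problems) if @problems;
    }
    endpwent();    # close the account database
    return \@found;
}

# Print one line per questionable account (nothing on a clean system).
say for @{ suspicious_accounts() };
```

Because the fields arrive already split and named, no regex is needed to pick them apart; the trade-off is that this checks what the system resolves, not the raw text of the file.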
You want a regular expression? Ok fine then, I’ll give you a regular expression: it’s in the $is_valid_pwent_rx variable.
Enjoy.
IMPORTANT: This must not be misconstrued to be a semantic checker of a sane passwd file. It is a syntactic checker only.
Currently configured for OpenBSD.
#!/usr/bin/env perl
use 5.010;
use strict;
use warnings;
our $PASSWD = "/etc/passwd";
our $Errors = 0;
sub is_valid_pwent(_);
sub main();
#########################################################
main();
exit($Errors != 0);
#########################################################
sub main() {
    open(PASSWD) || die "can't open $PASSWD: $!";
    while (my $line = <PASSWD>) {
        chomp $line;
        ## NEXT LINE IS WRONG: NO "COMMENTS" ALLOWED!!!
        next if $line =~ /^#/;
        next if is_valid_pwent($line);
        say "$0: Invalid entry at $PASSWD $.: $line";
        $Errors++;
    }
    close(PASSWD) || die "can't close $PASSWD: $!";
    say "$0: $PASSWD appears ok." unless $Errors;
}
#########################################################
INIT {
state $is_valid_pwent_rx = qr{
^ (?&any_pwent) $
###############################################
(?(DEFINE)
(?<any_pwent> (?&yp_pwent) | (?&pwent) )
# The `+' token may also be alone in the name field, which causes all users
# from the passwd.byname and passwd.byuid YP maps to be included.
#
# If the entry contains non-empty uid or gid fields, the specified numbers
# will override the information retrieved from the YP maps. Additionally,
# if the gecos, dir, or shell entries contain text, it will override the
# information included via YP. On some systems, the passwd field may also
# be overridden. It is recommended that the standard way to enable YP
# passwd support in /etc/master.passwd is:
#
# +:*::::::::
(?<yp_pwent>
(?&PLUS) # substitute in YP
: (?: (?&EMPTY) | (?&pw_passwd) ) # user's encrypted password.
: (?: (?&EMPTY) | (?&pw_uid) ) # user's login user ID.
: (?: (?&EMPTY) | (?&pw_gid) ) # user's login group ID.
: (?: (?&EMPTY) | (?&pw_gecos) ) # Honeywell login info.
: (?: (?&EMPTY) | (?&pw_dir) ) # user's home directory.
: (?: (?&EMPTY) | (?&pw_shell) ) # user's login shell.
)
# A normal password entry
(?<pwent>
(?&pw_name) # user's login name.
: (?&pw_passwd) # user's encrypted password.
: (?&pw_uid) # user's login user ID.
: (?&pw_gid) # user's login group ID.
: (?&pw_gecos) # Honeywell login info.
: (?&pw_dir) # user's home directory.
: (?&pw_shell) # user's login shell.
)
# A master password entry
(?<master_pwent>
(?&pw_name) # user's login name.
: (?&pw_passwd) # user's encrypted password.
: (?&pw_uid) # user's login user ID.
: (?&pw_gid) # user's login group ID.
: (?&pw_class) # user's general classification (see login.conf(5))
: (?&pw_change) # password change time.
: (?&pw_expire) # account expiration time.
: (?&pw_gecos) # general information about the user.
: (?&pw_dir) # user's home directory.
: (?&pw_shell) # user's login shell.
)
# The name field is the login used to access the computer account, and the
# uid field is the number associated with it. They should both be unique
# across the system (and often across a group of systems) since they
# control file access.
#
# While it is possible to have multiple entries with identical login names
# and/or identical user IDs, it is usually a mistake to do so. Routines
# that manipulate these files will often return only one of the multiple
# entries, and that one by random selection.
#
# The login name may be up to 31 characters long. For compatibility with
# legacy software, a login name should start with a letter and consist
# solely of letters, numbers, dashes and underscores. The login name must
# never begin with a hyphen (`-'); also, it is strongly suggested that
# neither uppercase characters nor dots (`.') be part of the name, as this
# tends to confuse mailers. No field may contain a colon as this has been
# used historically to separate the fields in the user database.
(?<pw_name>
(?= (?&NON_COLON){1,31} )
(?: (?&UNDERSCORE)
| (?&LETTER)
)
(?: (?&LETTER)
| (?&number)
| (?&HYPHEN)
| (?&UNDERSCORE)
){0,30}
)
# The password field is the *encrypted* form of the password. If the
# password field is empty, no password will be required to gain access to
# the machine. This is almost invariably a mistake. By convention,
# accounts that are not intended to be logged in to (e.g. bin, daemon, sshd)
# have a star (`*') in the password field. Note that there is nothing
# special about `*', it is just one of many strings that is not a valid
# encrypted password (see crypt(3)). Because master.passwd contains the
# encrypted user passwords, it should not be readable by anyone without
# appropriate privileges.
#
# Which type of cipher is used to encrypt the password information depends
# on the configuration in login.conf(5). It can be different for local and
# YP passwords.
(?<pw_passwd>
(?&STAR)
| (?&NON_COLON) +
| (?&EMPTY) # should not allow this!
)
# The uid field is the numeric user ID assigned to this login name.
# It need not strictly be unique.
(?<pw_uid>
(?&number) +
)
# The group (gid) field is the group that the user will be placed in
# upon login. Since this system supports multiple groups (see groups(1))
# this field currently has little special meaning.
(?<pw_gid>
(?&number) +
)
(?<pw_class>
(?&EMPTY)
| (?&any_text)
)
(?<pw_change>
(?&EMPTY)
| (?&number)
)
(?<pw_expire>
(?&EMPTY)
| (?&number)
)
(?<pw_gecos>
# (?&EMPTY) | (?&gecos_fields)
(?&any_text)
)
# some have an extra field in them after hphone
(?<gecos_fields>
(?&gecos_name) # User's full name.
(?&COMMA)
(?&gecos_office) # User's office location.
(?&COMMA)
(?&gecos_wphone) # User's work phone number.
(?&COMMA)
(?&gecos_hphone) # User's home phone number.
)
(?<gecos_name> (?&gecos_text) )
(?<gecos_office> (?&gecos_text) )
(?<gecos_wphone> (?&gecos_text) )
(?<gecos_hphone> (?&gecos_text) )
(?<pw_dir>
(?&EMPTY) # bad idea
| (?&directory_name)
)
(?<pw_shell>
(?&EMPTY) # means "/bin/sh"
| (?&filename)
)
#########################
(?<directory_name> (?&pathname) )
(?<filename> (?&pathname) )
(?<pathname>
(?&SLASH)
(?&any_text)
)
(?<LETTER> [a-z] ) # \p{Ll} && \p{ASCII}
(?<DIGIT> [0-9] ) # \p{Nd} && \p{ASCII}
(?<ZERO> 0 )
(?<NON_ZERO> [1-9] )
(?<PLUS> \x2B ) # PLUS SIGN
(?<COMMA> \x2C ) # COMMA
(?<HYPHEN> \x2D ) # HYPHEN-MINUS
(?<SLASH> \x2F ) # SOLIDUS
(?<COLON> \x3A ) # COLON
(?<STAR> \x2A ) # ASTERISK
(?<UNDERSCORE> \x5F ) # LOW LINE
(?<NON_COLON> [^\x3A] )
(?<EMPTY> (?# this space intentionally left blank) )
(?<number>
(?&ZERO)
| (?&NON_ZERO) (?&DIGIT) *
)
(?<any_text>
(?&NON_COLON) *
)
(?<gecos_text>
(?:
(?! (?&COMMA) )
(?! (?&COLON) )
.
) *
)
)
}x;
    sub is_valid_pwent(_) {
        my $pwent = shift();
        return $pwent =~ $is_valid_pwent_rx;
    }
}
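Since the script above is a syntactic checker only, a semantic pass has to be written separately. As one possible example, the quoted manual text says duplicate login names and user IDs are usually a mistake, so a minimal sketch of that check might look like this. The helper name find_duplicates and the sample lines are assumptions for illustration, not part of the script above.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: report login names and UIDs that occur more than once.
# Takes passwd-style lines and returns a list of human-readable messages.
sub find_duplicates {
    my @lines = @_;
    my (%names, %uids, @messages);
    for my $line (@lines) {
        next if $line =~ /^#/;    # skip comment lines
        # A limit of -1 keeps trailing empty fields.
        my ($name, undef, $uid) = split /:/, $line, -1;
        next unless defined $name && defined $uid;    # too few fields
        push @messages, "duplicate login name: $name" if $names{$name}++;
        push @messages, "duplicate uid $uid: $name"   if $uids{$uid}++;
    }
    return @messages;
}

# Made-up sample data: two accounts share uid 0.
my @sample = (
    'root:*:0:0:System Administrator:/var/root:/bin/sh',
    'toor:*:0:0:Alternate Root:/var/root:/bin/sh',
    'daemon:*:1:1:System Services:/var/root:/usr/bin/false',
);
say for find_duplicates(@sample);    # prints "duplicate uid 0: toor"
```

Whether a shared UID is actually an error depends on the system (a second uid 0 account is sometimes deliberate), so treat the messages as warnings rather than failures.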
Something like this?
^(#.*|[a-z]*:[^:]*:[0-9]*:[0-9]*:[^:]*:/[^:]*:/[^:]*)$
(assuming that the username consists of lowercase letters)
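To try that pattern out, a small sketch along these lines may help. The pattern is copied unchanged; the helper name line_looks_valid and the test lines (other than the one from the question) are made up for the example.

```perl
use 5.010;
use strict;
use warnings;

# The pattern from the answer above, unchanged.
my $simple_rx = qr{^(#.*|[a-z]*:[^:]*:[0-9]*:[0-9]*:[^:]*:/[^:]*:/[^:]*)$};

# Hypothetical helper: 1 if the line is a comment or a seven-field entry
# acceptable to the pattern, 0 otherwise.
sub line_looks_valid {
    my ($line) = @_;
    return $line =~ $simple_rx ? 1 : 0;
}

for my $line (
    'root:*:0:0:System Administrator:/var/root:/bin/sh',    # accepted
    '# a comment line',                                     # accepted
    'root:*:0:0:System Administrator:/var/root',            # six fields: rejected
    'Root:*:0:0:System Administrator:/var/root:/bin/sh',    # uppercase: rejected
) {
    say line_looks_valid($line) ? 'ok      ' : 'invalid ', $line;
}
```

Note that [a-z]* and [0-9]* also accept empty name and ID fields, so a line such as :x::::/:/ passes; changing those * quantifiers to + would reject it.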