How do I parse people's first and last names in Python?
So basically I need to parse a name and find the following info:
First Name
First Initial (if employee has initials for a first name like D.J., use both initials)
Last Name (include if employee has a suffix such as Jr. or III.)
So here's the interface I'm working with:
Input:
names = ["D.J. Richies III", "John Doe", "A.J. Hardie Jr."]
for name in names:
    print(parse_name(name))
Expected Output:
{'FirstName': 'D.J.', 'FirstInitial': 'D.J.', 'LastName': 'Richies III' }
{'FirstName': 'John', 'FirstInitial': 'J.', 'LastName': 'Doe' }
{'FirstName': 'A.J.', 'FirstInitial': 'A.J.', 'LastName': 'Hardie Jr.' }
Not really good at Regex, and actually that's probably overkill for this. I'm just guessing:
if name[1] == ".":  # do we have a name like D.J.?
I found this library quite useful for parsing names. https://code.google.com/p/python-nameparser/
It can also deal with names that are formatted Lastname, Firstname.
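To give a rough idea of what handling the "Lastname, Firstname" form involves, here is a minimal standard-library sketch. It is not the library's code; `normalize_name_order` is a hypothetical helper, and it assumes at most one comma separates the last name from the first name.

```python
def normalize_name_order(name):
    """Return the name in 'Firstname Lastname' order.

    Hypothetical helper: assumes at most one comma separates the last
    name from the first name; any further commas are left untouched.
    """
    last, sep, first = name.partition(",")
    if not sep:
        # No comma: assume the name is already in 'Firstname Lastname' order.
        return name.strip()
    return "{} {}".format(first.strip(), last.strip())


for raw in ["Doe, John", "Richies III, D.J.", "A.J. Hardie Jr."]:
    print(normalize_name_order(raw))
```

The normalized string can then be passed to whichever `parse_name` you settle on. Names that put the suffix after a second comma, such as "Hardie, A.J., Jr.", would need extra handling.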
There is no general solution; the right approach depends on the constraints you impose. For the specs you have given, here is a simple solution that produces exactly what you want:
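def parse_name(name):
    # Split on whitespace: the first token is the first name,
    # everything after it (including any suffix) is the last name.
    fl = name.split()
    first_name = fl[0]
    last_name = ' '.join(fl[1:])
    # A dot in the first name means it is already a set of initials.
    if "." in first_name:
        first_initial = first_name
    else:
        first_initial = first_name[0] + "."
    return {'FirstName': first_name, 'FirstInitial': first_initial, 'LastName': last_name}

names = ["D.J. Richies III", "John Doe", "A.J. Hardie Jr."]
for name in names:
    print(parse_name(name))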
output:
{'LastName': 'Richies III', 'FirstInitial': 'D.J.', 'FirstName': 'D.J.'}
{'LastName': 'Doe', 'FirstInitial': 'J.', 'FirstName': 'John'}
{'LastName': 'Hardie Jr.', 'FirstInitial': 'A.J.', 'FirstName': 'A.J.'}
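One constraint worth spelling out: taking everything after the first token as the last name means a middle name ends up in `LastName` ("John Michael Doe" gives "Michael Doe"). If that matters for your data, a sketch along these lines keeps the suffix with the last name while separating middle names. The `SUFFIXES` set and the `split_name_parts` helper are assumptions for illustration, not part of the spec in the question.

```python
# Assumed suffix list for illustration only; extend it for real data.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def split_name_parts(name):
    """Split a full name into (first, middle, last), keeping a suffix with the last name."""
    tokens = name.split()
    if not tokens:
        return "", "", ""
    first, rest = tokens[0], tokens[1:]
    # Peel a trailing suffix such as 'Jr.' or 'III' off the end, but only
    # when something else remains to serve as the last name.
    suffix = []
    if len(rest) > 1 and rest[-1].rstrip(".").lower() in SUFFIXES:
        suffix = [rest.pop()]
    # Whatever precedes the final remaining token is treated as middle names.
    middle, last = rest[:-1], rest[-1:]
    return first, " ".join(middle), " ".join(last + suffix)


for full in ["D.J. Richies III", "John Doe", "John Michael Doe Jr."]:
    print(split_name_parts(full))
```

Multi-word surnames such as "van der Berg" would be misread as middle names here, which is another reason the right rules depend on your data.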
Well, for your simple example names, you can do something like this.
def parse_name(name):
    # This separates the first and last names
    parts = name.partition(" ")
    firstName = parts[0]
    # now figure out the first initial
    # we're assuming that if it has a dot it's an initialized name,
    # but this may not hold in general
    if "." in firstName:
        firstInitial = firstName
    else:
        firstInitial = firstName[0] + "."
    # everything after the first space, suffix included
    lastName = parts[2]
    return {"FirstName": firstName, "FirstInitial": firstInitial, "LastName": lastName}
I haven't tested it, but a function like that should do the job on the input example you provided.
This is basically the same solution as the one Anurag Uniyal provided, only a little more compact:
import re

def parse_name(name):
    first_name, last_name = name.split(' ', 1)
    first_initial = re.search("^[A-Z.]+", first_name).group()
    if not first_initial.endswith("."):
        first_initial += "."
    return {"FirstName": first_name,
            "FirstInitial": first_initial,
            "LastName": last_name}
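The regex version above raises on two kinds of input: a name with no space (the two-way unpacking of `split(' ', 1)` fails) and a first name that does not start with a capital letter or dot (`re.search` returns `None`). A more tolerant variant might look like the sketch below. `parse_name_safe` is a hypothetical name, and falling back to the upper-cased first character is an assumption about what you would want in that case.

```python
import re


def parse_name_safe(name):
    """Like parse_name, but tolerant of single-word and lowercase names."""
    # partition never raises, unlike unpacking the result of split(' ', 1).
    first_name, _, last_name = name.strip().partition(" ")
    # Leading run of capitals and dots: 'D.J.' for 'D.J.', 'J' for 'John'.
    match = re.match(r"[A-Z.]+", first_name)
    # Assumed fallback: use the upper-cased first character when the
    # first name does not start with a capital letter or a dot.
    first_initial = match.group() if match else first_name[:1].upper()
    if first_initial and not first_initial.endswith("."):
        first_initial += "."
    return {"FirstName": first_name,
            "FirstInitial": first_initial,
            "LastName": last_name.strip()}


for name in ["D.J. Richies III", "john doe", "Cher"]:
    print(parse_name_safe(name))
```

Like the original pattern, this treats an all-caps first name such as "JOHN" as a run of initials, so it would need a length check if your data contains such names.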