Splitting a URL into a list in Python
I am currently working on a project that involves splitting a URL. I have used the urlparse module to break up the URL, so now I am working with just the path segment.
The problem is that when I try to split() the string based on the delimiter "/" to separate the directories, I end up with empty strings in my list.
For example, when I do the following:
from urllib.parse import urlparse  # Python 3; in Python 2 this was "import urlparse"
url = "http://example/url/being/used/to/show/problem"
parsed = urlparse(url)
path = parsed.path  # the path element (same as parsed[2])
pathlist = path.split("/")
I get the list:
['', 'url', 'being', 'used', 'to', 'show', 'problem']
I do not want these empty strings. I realize that I can remove them by making a new list without them, but that seems sloppy. Is there a better way to remove the empty strings and slashes?
What? There's only one empty string and it's always first, by definition.
pathlist = path.split("/")[1:]
This is pretty common.
A trailing slash can mean an "empty" filename, in which case a default name may be implied (index.html, for example). So a trailing empty string may be meaningful.
"http://example/url/being/used/to/show/problem"
The filename is "problem"
"http://example/url/being/used/to/show/problem/"
The directory is "problem" and a default filename is implied by the empty string.
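The difference between the two URLs above shows up directly in the split output. Here is a small sketch, assuming Python 3's urllib.parse (the question uses the older Python 2 urlparse module); the helper name split_path is made up for illustration:

```python
from urllib.parse import urlparse

def split_path(url):
    """Split a URL's path on "/" and drop only the leading empty string."""
    return urlparse(url).path.split("/")[1:]

no_slash = split_path("http://example/url/being/used/to/show/problem")
with_slash = split_path("http://example/url/being/used/to/show/problem/")

print(no_slash)    # last item is 'problem' (the filename)
print(with_slash)  # last item is '' (the implied default filename)
```

Filtering out every empty string would make these two URLs indistinguishable, which is why slicing off only the first element can be the safer choice.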
I am not familiar with urllib and its output for path, but I think one way to build the new list is with a list comprehension:
[x for x in path.split("/") if x]
Or something like this if there is only a leading '/':
path.lstrip('/').split("/")
Or, if there may be a trailing '/' too:
path.strip('/').split("/")
And if the string in path always starts with a single '/', then the easiest way is:
path[1:].split('/')
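These four variants agree on a simple path but differ on edge cases. A rough comparison, where the function name split_variants and the sample paths are only illustrative:

```python
def split_variants(path):
    """Return the result of each approach above for one path string."""
    return {
        "comprehension": [x for x in path.split("/") if x],
        "lstrip": path.lstrip("/").split("/"),
        "strip": path.strip("/").split("/"),
        "slice": path[1:].split("/"),
    }

for sample in ["/url/being/used", "/url/being/used/", "/url//used", "/"]:
    print(repr(sample))
    for name, result in split_variants(sample).items():
        print("   ", name, result)
```

Only the list comprehension drops every empty string, including one produced by a doubled slash in the middle of the path. For the bare path "/", the comprehension gives an empty list while the other three give [''].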
pathlist = path.strip('/').split("/")
Remove the empty items?
pathlist.remove('')
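One caveat with this approach: list.remove deletes only the first matching item and raises ValueError when there is none. A short sketch with made-up sample paths:

```python
# Only the first '' is removed; later ones stay in the list.
pathlist = "/url//used/".split("/")
pathlist.remove("")
print(pathlist)

# After strip('/') there may be no '' left, and remove() then raises.
clean = "/url/being/used/".strip("/").split("/")
try:
    clean.remove("")
    raised = False
except ValueError:
    raised = True
print(clean, raised)
```

So remove('') is only reliable when the list is known to contain exactly one empty string.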
I added this as a comment to a comment, so just in case: Couldn't you use a list comprehension to exclude the empty elements returned from the split, i.e.
path_list = [p for p in path.split('/') if p]