Python script for URL split
I'm new to Python and learning the basics.
My query: I have multiple pages accessed as requests in a log file, like the ones below:
"GET /img/home/search-user-ico.jpg HTTP/1.1"
"GET /SpellCheck/am.tlx HTTP/1.1"
"GET /img/plan-comp-nav.jpg HTTP/1.1"
"GET /ie6.css HTTP/1.1"
"GET /img/portlet/portlet-content-bg.jpg HTTP/1.1"
"GET /SpellCheck/am100k2.clx HTTP/1.1"
"GET /SpellCheck/am.tlx HTTP/1.1"
My question is: I want only the file part of each page.
For example, given "GET /img/home/search-user-ico.jpg HTTP/1.1" and "GET /ie6.css HTTP/1.1", I want to extract search-user-ico.jpg HTTP and ie6.css HTTP.
Experts, please help me write a Python script to do this split.
Assuming that you don't have spaces in the filenames and that you don't want "HTTP" at the end, you can split the line by space:
parts = line.split(" ")
and then use the os module (import os) to get the filename from the path:
filename = os.path.basename(parts[1])
For example:
>>> import os
>>> line = "GET /img/home/search-user-ico.jpg HTTP/1.1"
>>> parts = line.split(" ")
>>> parts[1]
'/img/home/search-user-ico.jpg'
>>> os.path.basename(parts[1])
'search-user-ico.jpg'
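The steps above can be wrapped in a small helper and applied to every request line. This is only a minimal sketch, not part of the original answer: the name extract_filename and the sample list are hypothetical, and it assumes each line has the form METHOD PATH PROTOCOL with no spaces inside the path. It uses posixpath instead of os.path so that "/" is treated as the separator on any platform.

```python
import posixpath


def extract_filename(request_line):
    """Return the file name from a line like 'GET /a/b.jpg HTTP/1.1' (hypothetical helper)."""
    # Drop surrounding whitespace and the double quotes the log puts around each request.
    parts = request_line.strip().strip('"').split(" ")
    if len(parts) < 2:
        # Not enough fields to contain a path.
        return None
    # parts[1] is the requested path; basename keeps only the part after the last "/".
    return posixpath.basename(parts[1])


# Sample request lines taken from the question.
requests = [
    '"GET /img/home/search-user-ico.jpg HTTP/1.1"',
    '"GET /SpellCheck/am.tlx HTTP/1.1"',
    '"GET /ie6.css HTTP/1.1"',
]

filenames = [extract_filename(r) for r in requests]
print(filenames)
```

Returning None for lines without a path is one possible choice; raising an error would work as well if malformed lines should not be silently skipped.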
data = [
    "GET /img/home/search-user-ico.jpg HTTP/1.1",
    "GET /SpellCheck/am.tlx HTTP/1.1",
    "GET /img/plan-comp-nav.jpg HTTP/1.1",
    "GET /ie6.css HTTP/1.1",
    "GET /img/portlet/portlet-content-bg.jpg HTTP/1.1",
    "GET /SpellCheck/am100k2.clx HTTP/1.1",
    "GET /SpellCheck/am.tlx HTTP/1.1"
]
for url in data:
    # Take the path (second space-separated field), then its last "/"-separated segment.
    # Index [-1] is the file name; [-2] would give the parent directory instead.
    print(url.split(' ')[1].split('/')[-1])
If all of your links have a similar format, another solution would be:
request = "GET /img/home/search-user-ico.jpg HTTP/1.1"
parts = request.split("/")
parts[-2]  # returns 'search-user-ico.jpg HTTP'
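Splitting on "/" keeps the protocol name attached to the file name, which matches the output requested in the question. If the trailing "HTTP" is not wanted, it can be separated afterwards. The following is a sketch under the assumption that the protocol token contains exactly one "/" (as in HTTP/1.1) and that the file name itself has no spaces; the variable names segment, filename and protocol are illustrative only.

```python
request = "GET /img/home/search-user-ico.jpg HTTP/1.1"

# Splitting on "/" leaves the protocol name attached to the last path segment.
segment = request.split("/")[-2]

# Split once from the right on a space to separate the file name from "HTTP".
filename, protocol = segment.rsplit(" ", 1)

print(segment)
print(filename)
```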