Transforming URLs in Python
I have a text file which contains some URLs:
[http://igu.org.ru/ International Geographical Union - Russian National Committee]
[http://www.geografos.org Colegio de Geógrafos - España]
[http://www.geografs.org Col.legi de Geògrafs - Catalunya]
[http://www.geografs.org]
Now I want to transform these external links in the following way (in this fixed order):
Replace "[url any text]" with "any text", where "url" is a URL (e.g., starts with "http://").
Replace "[url]" with "url".
import codecs
import re

def openfile(filename):
    with codecs.open(filename, encoding="utf-8") as F:
        replace = F.read()
    replace = re.sub(r'\[http://.+ ...) # should replace "[url any text]" with "any text"
    replace = re.sub(...) # should replace "[url]" with "url"
Any suggestions?
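import codecs
import re

# "[url any text]": group 1 is the URL, group 2 is the link text
re1 = re.compile(r'\[(http[^\s\]]*)[ \t]+([^\]\n]*)\]')
# "[url]": group 1 is the URL
re2 = re.compile(r'\[(http[^\s\]]*)\]')
with codecs.open(filename, encoding='utf-8') as F:
    text = F.read()
pre_filter = re1.sub(r'\g<2>', text)
result = re2.sub(r'\g<1>', pre_filter)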
The code above processes your text in the order you asked for. For further background information, you can read: http://docs.python.org/howto/regex.html#search-and-replace
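To try the two substitutions without a file, here is a minimal sketch that works on a string and uses sample lines from the question. The function name strip_external_links and the pattern names are made up for illustration, and the patterns assume every URL starts with http:// or https:// and contains no whitespace or "]".

```python
import re

# Hypothetical names; assumes URLs start with http:// or https://
# and that neither the URL nor the link text contains "]".
LABELED_LINK = re.compile(r'\[https?://[^\s\]]+[ \t]+([^\]\n]+)\]')
BARE_LINK = re.compile(r'\[(https?://[^\s\]]+)\]')

def strip_external_links(text):
    """Replace "[url any text]" with "any text", then "[url]" with "url"."""
    # Step 1: keep only the link text of labeled links.
    text = LABELED_LINK.sub(r'\1', text)
    # Step 2: unwrap the remaining bare links.
    return BARE_LINK.sub(r'\1', text)

sample = (
    "[http://igu.org.ru/ International Geographical Union - Russian National Committee]\n"
    "[http://www.geografos.org Colegio de Geógrafos - España]\n"
    "[http://www.geografs.org]\n"
)
result = strip_external_links(sample)
print(result)
```

Excluding "]" and newlines from the URL and the link text keeps each match inside one pair of brackets, so a bare "[url]" followed by another link on the next line is not merged with it.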