Script to delete text after a keyword
I have a txt file with a bunch of URLs like:
http://url1.com/folder1/folder2
http://url3.com/folder1/folder2
http://url2.com/folder1/folder2
I'm looking for a script that will delete everything after ".com". It seems like there should be a simple AppleScript for this, but I can't find what I'm looking for. Any ideas?
sed -e 's,\.com/.*$,.com,' infile > outfile
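A quick way to check the sed approach against the sample URLs from the question is sketched below. The function name `strip_after_com` is made up for illustration. The dot is written as `\.` so that it matches only a literal period; an unescaped `.` matches any character, so a host such as `telecom` followed by a slash would also be rewritten.

```shell
# Strip everything after ".com" on each line read from stdin.
# strip_after_com is a hypothetical helper name, not part of the original answer.
strip_after_com() {
    # "," is used as the delimiter so the "/" in the pattern needs no escaping.
    # "\." matches a literal dot; ".*$" consumes the rest of the line.
    sed -e 's,\.com/.*$,.com,'
}

# Sample input taken from the question.
printf '%s\n' \
    'http://url1.com/folder1/folder2' \
    'http://url3.com/folder1/folder2' \
    'http://url2.com/folder1/folder2' | strip_after_com
```

This prints the three URLs cut back to `http://url1.com`, `http://url3.com` and `http://url2.com`. Lines that do not contain ".com/" pass through unchanged.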
Building on cbz's answer, you could run his command through AppleScript with a little manipulation:
-- choose a file (you could also set this to a variable)
set infile to choose file
-- find out where the file lives so we know where to save the output
tell application "Finder" to set outpath to POSIX path of (container of infile as text)
-- convert the path of the file to something the shell can understand
set infile to quoted form of POSIX path of infile
-- add the file name to outpath, quoting the whole path so that spaces are safe
set outfile to quoted form of (outpath & "parsed_text.txt")
-- put it all together and go; the dot is escaped and the results are redirected to a new file
do shell script "sed -e 's,\\.com/.*$,.com,' " & infile & " > " & outfile
Here's one in PHP. You could also use regular expressions, but I feel that parse_url lets you get on with it without having to worry about special cases popping up.
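<?php
// Read the file into an array of lines, dropping newlines and blank lines.
$lines = file('yourfile.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$sites = '';
foreach ($lines as $url) {
    $parsed = parse_url(trim($url));
    // Skip lines that are not absolute URLs with a scheme and a host.
    if (!is_array($parsed) || !isset($parsed['scheme'], $parsed['host'])) {
        continue;
    }
    // Keep only the scheme and host, which drops everything after ".com".
    $sites .= $parsed['scheme'] . '://' . $parsed['host'] . "\n";
}
file_put_contents('yournewfile.txt', $sites);
?>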
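For comparison, the same idea of keeping only the scheme and host, rather than keying on ".com", can be approximated in sed. This is a rough sketch, not a full URL parser: the function name `strip_to_host` is made up for illustration, and any user info or port in the URL is kept as part of the host.

```shell
# Keep only "scheme://host" on each line read from stdin.
# strip_to_host is a hypothetical helper name used for this sketch.
strip_to_host() {
    # -E enables extended regular expressions (supported by GNU and BSD sed).
    # Group 1 captures the scheme, "://", and everything up to the first
    # "/", "?" or "#"; the rest of the line is discarded.
    sed -E 's,^([A-Za-z][A-Za-z0-9+.-]*://[^/?#]+).*$,\1,'
}

# The first line is from the question; the second uses a non-.com host.
printf '%s\n' \
    'http://url1.com/folder1/folder2' \
    'https://example.org/a/b?c=d' | strip_to_host
```

This prints `http://url1.com` and `https://example.org`. Lines that do not start with a scheme followed by "://" are left as they are.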