How do I delete strings in a text file that do not match a user input string?
Let's say I have the user input "placeofjo.blogspot.com".
My code extracts links from this website and places them in a text file.
The text file now has these contents:
http://www.twitter.com/jozefinfin/
http://www.facebook.com/jozefinfin/
http://placeofjo.blogspot.com/2008_08_01_archive.html
http://placeofjo.blogspot.com/2008_09_01_archive.html
http://placeofjo.blogspot.com/2008_10_01_archive.html
http://placeofjo.blogspot.com/2008_11_01_archive.html
http://placeofjo.blogspot.com/2008_12_01_archive.html
http://placeofjo.blogspot.com/2009_01_01_archive.html
http://placeofjo.blogspot.com/2009_02_01_archive.html
http://placeofjo.blogspot.com/2009_03_01_archive.html
http://placeofjo.blogspot.com/2009_04_01_archive.html
http://placeofjo.blogspot.com/2009_05_01_archive.html
http://placeofjo.blogspot.com/2009_06_01_archive.html
http://placeofjo.blogspot.com/2009_07_01_archive.html
http://placeofjo.blogspot.com/2009_08_01_archive.html
http://placeofjo.blogspot.com/2009_09_01_archive.html
http://placeofjo.blogspot.com/2009_10_01_archive.html
http://placeofjo.blogspot.com/2009_11_01_archive.html
http://placeofjo.blogspot.com/2010_01_01_archive.html
http://placeofjo.blogspot.com/2010_02_01_archive.html
http://placeofjo.blogspot.com/2010_04_01_archive.html
http://placeofjo.blogspot.com/2010_06_01_archive.html
http://placeofjo.blogspot.com/2010_07_01_archive.html
http://placeofjo.blogspot.com/2010_08_01_archive.html
http://placeofjo.blogspot.com/2010_10_01_archive.html
http://placeofjo.blogspot.com/2010_11_01_archive.html
http://placeofjo.blogspot.com/2011_01_01_archive.html
http://placeofjo.blogspot.com/2011_02_01_archive.html
http://placeofjo.blogspot.com/2011_03_01_archive.html
http://endlessdance.blogspot.com
http://blogskins.com/me/aaaaaa
http://weheartit.com
I would like to delete
http://www.twitter.com/jozefinfin/
http://www.facebook.com/jozefinfin/
http://endlessdance.blogspot.com
http://blogskins.com/me/aaaaaa
http://weheartit.com
and keep only the strings that match the user's input. How do I do this?
Desired contents of the text file :
http://placeofjo.blogspot.com/2008_08_01_archive.html
http://placeofjo.blogspot.com/2008_09_01_archive.html
http://placeofjo.blogspot.com/2008_10_01_archive.html
...
- Read the file line by line
- Check whether each line contains the user input
- If so, write it to a new file
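The steps above could be sketched roughly as follows. This is only an illustration, not the asker's code: the class name `LinkFilter`, the method name `copyMatchingLines`, and the UTF-8 encoding are assumptions.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class LinkFilter {
    // Copies only the lines of `source` that contain `userInput` into `target`.
    // Names and the UTF-8 charset are assumptions made for this sketch.
    public static void copyMatchingLines(Path source, Path target, String userInput)
            throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            // Read one line at a time so the whole file never sits in memory.
            while ((line = reader.readLine()) != null) {
                if (line.contains(userInput)) {
                    writer.write(line);
                    writer.newLine();
                }
            }
        }
    }
}
```

Once the copy succeeds, the new file can replace the old one, for example with `Files.move` and `StandardCopyOption.REPLACE_EXISTING`.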
Assuming that you can hold the whole list of links in memory at the same time, which you likely can since it's the links from a single website:
- Read in the file, split on newlines, and generate a List of links.
- Filter the list to remove any non-matching links
- Write the resulting filtered list back to the file, replacing the old contents of the file
For the matching in the filter, my thought would be to use
string.indexOf(inputToMatch) >= 0 // it matches; indexOf returns -1 when there is no match, and 0 is a valid match position
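A minimal sketch of this in-memory approach, assuming the file is UTF-8 and small enough to load in one go. The class and method names (`InPlaceLinkFilter`, `keepMatching`, `filterFile`) are made up for the example, and `String.contains` is used as an equivalent of the `indexOf` check.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

public class InPlaceLinkFilter {
    // Returns only the links that contain inputToMatch (hypothetical helper).
    public static List<String> keepMatching(List<String> links, String inputToMatch) {
        return links.stream()
                .filter(link -> link.contains(inputToMatch))
                .collect(Collectors.toList());
    }

    // Reads every line, filters the list, and overwrites the same file.
    public static void filterFile(Path file, String inputToMatch) throws IOException {
        List<String> links = Files.readAllLines(file, StandardCharsets.UTF_8);
        // Files.write truncates the existing file by default.
        Files.write(file, keepMatching(links, inputToMatch), StandardCharsets.UTF_8);
    }
}
```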
Instead of building a text file and then filtering it, do the filtering when you parse the web page. Just look for links that match your criteria and write only the good links to the file.
Here is a regex way to solve this problem. However, you should not use this solution with big files, because it reads the whole file into memory as a single string.
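Since the link-extraction code is not shown in the question, the following is only a sketch of how filtering at parse time might look. `LinkSink` is a hypothetical class that the extractor would hand each link to. It compares the URL's host against the user input, which is a stricter criterion than a substring test and is an assumption of this example.

```java
import java.io.IOException;
import java.io.Writer;
import java.net.URI;

public class LinkSink {
    private final String site;  // e.g. "placeofjo.blogspot.com"
    private final Writer out;   // destination for accepted links

    public LinkSink(String site, Writer out) {
        this.site = site;
        this.out = out;
    }

    // True when the link parses as a URI whose host equals the requested site.
    public static boolean isOnSite(String link, String site) {
        try {
            String host = URI.create(link).getHost();
            return host != null && host.equalsIgnoreCase(site);
        } catch (IllegalArgumentException e) {
            return false; // malformed links are treated as non-matching
        }
    }

    // Call this for each link as it is extracted; only matching links are written.
    public boolean accept(String link) throws IOException {
        if (!isOnSite(link, site)) {
            return false;
        }
        out.write(link);
        out.write(System.lineSeparator());
        return true;
    }
}
```

Wrapping a `FileWriter` or `BufferedWriter` in a `LinkSink` means the unwanted links never reach the file, so no second pass is needed.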
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;
import org.apache.commons.io.FileUtils;

public class FileReplacer {
    public static void main(String[] args) {
        replaceFileContent();
    }

    public static void replaceFileContent() {
        try {
            // Read the whole file into one string (UTF-8 is assumed here).
            String allStr = FileUtils.readFileToString(new File("c:/temp/data.txt"), StandardCharsets.UTF_8);
            // Match every line that does NOT start with the wanted site, plus its
            // line ending (\r\n or \n) so that no blank lines are left behind.
            Pattern pattern = Pattern.compile("^(?!http://placeofjo\\.blogspot\\.com/.*$).+$(\\r?\\n)?", Pattern.MULTILINE);
            String newAllStr = pattern.matcher(allStr).replaceAll("");
            FileUtils.writeStringToFile(new File("c:/temp/newdata.txt"), newAllStr, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
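The pattern above hard-codes the site name. As a sketch of how it might be built from the user's input instead, `Pattern.quote` can escape the input so that characters such as `.` are matched literally. The class and method names here are hypothetical, and the `http://` prefix is assumed from the sample data.

```java
import java.util.regex.Pattern;

public class RegexLinkFilter {
    // Removes every line of `text` that does not start with http://<userInput>/.
    public static String removeNonMatching(String text, String userInput) {
        Pattern pattern = Pattern.compile(
                "^(?!http://" + Pattern.quote(userInput) + "/).+(?:\\r?\\n)?",
                Pattern.MULTILINE);
        return pattern.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "http://www.twitter.com/jozefinfin/\n"
                + "http://placeofjo.blogspot.com/2008_08_01_archive.html\n"
                + "http://weheartit.com";
        System.out.print(removeNonMatching(text, "placeofjo.blogspot.com"));
    }
}
```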