Moving, renaming huge amount of text files based on content and size
*Update July 4*
I ended up doing the following:
- Sort on date
- Check if last sentence is the same
- If Yes: If bigger -> this is the new message to be chosen. If smaller: remove. If no more of the same can be found, choose this one and move to another folder.
- If No: move on. Loop this again until all files with certain date have been checked.
Thanks all for the help!!
I'm bus开发者_如何转开发y with a big project where I have a huge number of emails that I have to filter, imported from gmail through thunderbird. There is a big problem though. Because gmail uses conversations, but thunderbird doesn't format them as such, what I have is a text file for each email, though the complete previous conversation as well. And so a whole new text file for each reply.To clarify, an example of a conversation:
Me:Hi, how are you?
You, replying: Good!
Me: Great!
In gmail this looks exactly as above, but for me this are now 3 files:
file 1:
Me, sent at 11:41: Hi, how are you?
file 2:
You, sent at 11:42: Good! Me, sent at 11:41: Hi how are you?
file 3:
Me, sent at 11:43: Great! You, sent at 11:42: Good! Me, sent at 11:41: Hi how are you?
As you can understand, this is no problem with 3 files: I just throw away file 1 and 2 and only use file 3. That's precisely what I want to do. But considering in total there are around 30k files, I would very much like to automate that.
It is unfortunately not possible to do this complete by file name, though partially it can. The files are named after their date. For instance: 20110102 for Jan 2, 2011. However as there are multiple email conversations on a day, I would lose a lot if I would just sort by date and only keep the largest.
I hope the problem is clear and you can help me with this. I work on Mac OSX 10.7. I've tried using Applescript, but either my script is not good or Applescript can't handle the amount of files. Maybe you have a recommendation for software or a script in some way? I'm open for all and not unfamiliar with programming.
Thanks in advance!
As your task is basically just text processing, any language you're familiar with, including AppleScript, PHP, bash, C, should be able to do the job. I think perhaps @inTide's breaking the problem down into discreet steps is what you need to do, building one portion at a time in the language of your choice.
Pick a language that you're familiar with and start writing one the code to the first step and make sure it's working as you expect, and then expand, adding a small bit of new functionality at each point and making sure that functionality works before moving on. Without an example of the code you've written or a better description of how AppleScript is failing for you, additional advice is difficult.
精彩评论