How to read .ARC files from the Heritrix crawler using Python?
I looked at the Heritrix documentation website, which lists a Python .ARC file reader. However, the link returned 404 Not Found when I clicked it. http://crawler.archive.org/articles/developer_manual/arcs.html
Does anyone know of a Heritrix ARC reader written in Python?
(I asked this question before, but closed it due to inaccuracy)
Nothing a little Googling can't find: http://archive-access.cvs.sourceforge.net/viewvc/archive-access/archive-access/projects/hedaern/
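If that project is hard to get running, here is a minimal sketch of reading ARC records by hand. It is not the reader linked above. It assumes the ARC version 1 layout: each record starts with a space-separated header line whose last field is the body length in bytes, the body follows immediately, and a blank line separates records. It also assumes that compressed files are plain gzip (typically named `.arc.gz`). The function names `iter_arc_records` and `open_arc` are made up for this example.

```python
import gzip
import io


def iter_arc_records(stream):
    """Yield (header_fields, body_bytes) from a binary stream.

    Assumes ARC v1 layout: a header line whose last field is the
    body length in bytes, then the body, then a blank separator line.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # blank separator line between records
        fields = line.decode("utf-8", errors="replace").split()
        length = int(fields[-1])  # assumed: last header field is the length
        body = stream.read(length)
        yield fields, body


def open_arc(path):
    """Open an ARC file for binary reading; gzip is assumed for *.gz names."""
    opener = gzip.open if path.endswith(".gz") else open
    return opener(path, "rb")


# Tiny in-memory example so the parser can be tried without a real crawl file.
sample = (
    b"filedesc://test.arc 0.0.0.0 20100101000000 text/plain 9\n"
    b"1 0 Test\n"
    b"\n"
    b"http://example.com/ 192.0.2.1 20100101000000 text/html 5\n"
    b"hello\n"
)
records = list(iter_arc_records(io.BytesIO(sample)))
```

For a real file you would do something like `with open_arc("crawl.arc.gz") as f:` and loop over `iter_arc_records(f)`. The first record is normally the `filedesc://` entry describing the file itself, so you will probably want to skip it. For HTTP captures the body holds the raw server response (status line, headers, then payload), which still has to be split separately.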