How to use Posterous API to 'scrape' my own site for contributor info, date info, and response info
As a college teacher in STEM, I like it when I can use technology to enhance the learning experience of my students, and I doubly like it when it allows me to introduce them to cool tech, too.
During the last year, I've had a couple of classes make posts to a Posterous site (http://spectrawiki.posterous.com) and post comments on the posts of others. This is required for the course, and I've been saddled with keeping track of class activity.
I'd hoped I could download site content in a way that gives me the data I need (who posted, when they posted, and if they posted an article or comment), but Posterous doesn't have this functionality. A very responsive Posterous Tech suggested I look at the API.
But I'm not a coder. I know enough about coding (HTML, PHP, matlab, python, R, Mathematica) to respect those who really know what they're doing.
So I ask the Stack Overflow community: how could I hack together something (e.g., a python script) with the API to get the data I'm looking for (listed above)? Are there any tutorials out there that would lead me through the steps of building a script? I've never used an API before, so I don't really know where to begin.
Thanks in advance for any pointers.
[Edit] For example, the Posterous API Reference has an example like this:
curl -X PUT --user you@example.com:password -d "api_token=<your token>" -d "post[title]=New Title" http://posterous.com/api/2/sites/12345/posts/6789
when it talks about the API being RESTful. How can I modify this curl command to return some type of information about my Posterous site? I can (probably) handle the authentication flags and that token flag. But using the other flags, that's where I could use a pointer.
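One way to think about the change, sketched in Python rather than curl: the PUT example modifies a post, while a read would normally be a GET against the same kind of URL with no `post[...]` fields. The sketch below only builds such a request and does not send it. The `/sites/<id>/posts` collection path and passing `api_token` as a query parameter are assumptions extrapolated from the PUT example, not confirmed against the API reference, and `build_read_request` is a hypothetical helper name.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

API_ROOT = "http://posterous.com/api/2"  # root taken from the PUT example


def build_read_request(site_id, api_token, email, password, resource="posts"):
    """Build (but do not send) a GET request for one resource of a site.

    Assumption: reads use the same /sites/<id>/<resource> layout as the
    PUT example and accept api_token as a query parameter.
    """
    query = urlencode({"api_token": api_token})
    url = "%s/sites/%s/%s?%s" % (API_ROOT, site_id, resource, query)
    request = Request(url, method="GET")
    # Equivalent of curl's --user flag: HTTP Basic authentication.
    raw = ("%s:%s" % (email, password)).encode("utf-8")
    request.add_header("Authorization", "Basic " + base64.b64encode(raw).decode("ascii"))
    return request


req = build_read_request("12345", "<your token>", "you@example.com", "password")
print(req.get_method(), req.full_url)
```

In curl terms this would correspond to dropping `-X PUT` and the `-d "post[title]=..."` flag and pointing at the collection URL, again assuming the endpoint behaves as guessed here.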
Follow-up, 5 December 2012: It looks like pyposterous no longer works. My scripts fail and pyposterous fails its own unit tests. Bummer.
====
Pyposterous did, indeed, give me the tools to answer my question, so I thought I'd share it here for others. Here's the script I wrote:
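# Python 2 syntax (print >> file)
import datetime
import pyposterous
from pyposterous import Cursor

# Log in with Posterous account credentials.
api = pyposterous.API(username='[username]', password='[passwd]')

# Timestamped report name, e.g. report-posts-12December05-1430.txt
d = datetime.datetime.today().strftime("%y%B%d-%H%M")
filename = 'report-posts-' + d + '.txt'
log = open(filename, 'w')

# Walk every page of posts on the site and log author, date, and title.
for post in Cursor(method=api.read_posts, start_page=1, parameters={'hostname': 'spectrawiki'}):
    try:
        print >> log, "--------------------"
        print >> log, "%s, %s, %s" % (post.author, post.date, post.title)
    except AttributeError:
        pass  # No comments
    except UnicodeEncodeError:
        pass  # Skip posts whose text cannot be written as ASCII

log.close()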
It's a crude script, but it gets the basic job done.
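For keeping track of class activity, the report can be tallied per author afterwards. This is a minimal sketch in Python 3, assuming each entry in the report is a line of the form "author, date, title" preceded by the dashed divider, and that author names contain no ", ". The helper name `count_posts_by_author` and the sample lines are made up for illustration; they are not real data from the site.

```python
from collections import Counter

SEPARATOR = "-" * 20  # the divider line the script writes before each post


def count_posts_by_author(lines):
    """Count report entries per author, skipping dividers and blank lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line == SEPARATOR:
            continue
        # The author is everything before the first ", " on the line.
        author = line.split(", ", 1)[0]
        counts[author] += 1
    return counts


# Made-up sample in the report's format (not real data).
sample = [
    SEPARATOR,
    "student_a, 2012-03-01, First spectrum",
    SEPARATOR,
    "student_b, 2012-03-02, Second spectrum",
    SEPARATOR,
    "student_a, 2012-03-05, Follow-up",
]
for author, n in count_posts_by_author(sample).most_common():
    print(author, n)
```

To run it on a real report, the list could be replaced by the lines of the file, for example `open(filename).readlines()`.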