Fetching plain text in Yahoo Pipes

2022-12-09 08:26 Q&A author:

I have a Yahoo pipe taking the Atom feed from a Google group, and I want to do some processing on each message's full text (running various regular expressions to extract data). I can get a message's text as plain text from Google using a URL like this:

http://groups.google.com/group/(group_name)/msg/(message_id)?dmode=source&output=gplain
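
As an illustration of the URL pattern above, here is a minimal Python sketch that fills in the two placeholders. The function name `message_source_url` and its `plain` flag are hypothetical, introduced only for this example; the path layout and query parameters are taken from the URL in the question.

```python
from urllib.parse import quote, urlencode


def message_source_url(group_name, message_id, plain=True):
    """Build a Google Groups message-source URL (hypothetical helper).

    plain=True appends output=gplain, as in the URL pattern above;
    plain=False leaves it off, so only dmode=source is sent.
    """
    params = {"dmode": "source"}
    if plain:
        params["output"] = "gplain"
    # Percent-encode the path segments so unusual names cannot break the URL.
    path = "/group/{}/msg/{}".format(
        quote(group_name, safe=""), quote(message_id, safe="")
    )
    return "http://groups.google.com{}?{}".format(path, urlencode(params))
```

For example, `message_source_url("haml", "0f78eda2f5ef802d")` yields the plain-text form of the URL for that group and message ID.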

However, I'm having trouble getting it inside Yahoo Pipes as a string value. Fetch Page rejects non-HTML pages. YQL using the html table seems to work, and wraps the plain text inside a p element, whose text I can extract like this:

select * from html where url="..." and xpath="//p"

However, if the message text contains html tags, YQL returns an HTML subtree instead of a string. Is there any way of flattening it back into its HTML source?
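
To make "flattening a subtree back into its source" concrete outside of YQL, here is a rough Python sketch using the standard library's `xml.etree.ElementTree`. It is not something Pipes or YQL offers; the helper name `inner_source` is hypothetical, and the sketch assumes the fragment is well-formed XML.

```python
import xml.etree.ElementTree as ET


def inner_source(element):
    """Return the markup between an element's open and close tags as a string.

    Hypothetical helper: joins the element's leading text with each child
    serialized back to markup (ET.tostring includes the child's tail text).
    """
    parts = [element.text or ""]
    for child in element:
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


# A p element whose content was parsed into a subtree instead of a string.
p = ET.fromstring("<p>Hello <b>world</b>!</p>")
flattened = inner_source(p)
```

Here `flattened` holds the inner markup of the p element as a single string rather than a tree.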

The trick is to drop the "output=gplain" parameter and grab the content from the pre element. Without that parameter the response is an HTML page, which the html table can parse.

select content from html 
where url="http://groups.google.com/group/haml/msg/0f78eda2f5ef802d?dmode=source" 
and xpath='//div[contains(@class,"maincontbox")]/pre'
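
For readers who want to try the same extraction outside YQL, the following Python sketch approximates the XPath above with the standard library's `html.parser`. It is an approximation under stated assumptions, not an exact XPath engine: it collects the text of any pre element that sits inside a div whose class attribute contains "maincontbox" with no further div in between. The class `PreTextExtractor` and the function `extract_pre_text` are hypothetical names, and the sample HTML is made up to resemble the structure the query implies.

```python
from html.parser import HTMLParser


class PreTextExtractor(HTMLParser):
    """Collect text inside <pre> within a div whose class contains 'maincontbox'."""

    def __init__(self):
        # convert_charrefs=True turns &lt; and friends back into characters.
        super().__init__(convert_charrefs=True)
        self._div_depth = 0    # div nesting depth, counted from the matching div
        self._in_pre = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = dict(attrs).get("class") or ""
            if self._div_depth or "maincontbox" in classes:
                self._div_depth += 1
        elif tag == "pre" and self._div_depth == 1:
            self._in_pre = True

    def handle_endtag(self, tag):
        if tag == "pre":
            self._in_pre = False
        elif tag == "div" and self._div_depth:
            self._div_depth -= 1

    def handle_data(self, data):
        if self._in_pre:
            self.chunks.append(data)


def extract_pre_text(html):
    """Return the message source found in the page as one string."""
    parser = PreTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# Made-up sample page; the real page layout is assumed, not verified.
sample = (
    '<html><body><div class="maincontbox">'
    "<pre>Subject: test\n\n&lt;b&gt;hi&lt;/b&gt;</pre>"
    "</div><pre>unrelated</pre></body></html>"
)
message_text = extract_pre_text(sample)
```

Because the message source is escaped inside the pre element, any HTML tags in the message come back as ordinary characters in the resulting string, which is the property the question was after.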

I have created a pipe with Google Group and Message ID as inputs to demonstrate:

http://pipes.yahoo.com/pipes/pipe.info?_id=3d345e162405e7dbd47d73b95c21f102
