Reading custom values in eBay RSS feed (XML::RSS module)

2023-01-01 18:39

I've spent way too long trying to figure this out. I'm using XML::RSS and Perl to read and parse an eBay RSS feed. Within the <item></item> area, I see these entries:

<rx:BuyItNowPrice xmlns:rx="urn:ebay:apis:eBLBaseComponents">1395</rx:BuyItNowPrice>
<rx:CurrentPrice xmlns:rx="urn:ebay:apis:eBLBaseComponents">1255</rx:CurrentPrice>

However, I can't figure out how to grab the details during the loop. I wrote a regex to grab them:

@current_price = $item  =~ m/\<rx\:CurrentPrice.*\>(\d+)\<\/rx\:CurrentPrice\>/g;

Which works if you place the above 'CurrentPrice' entry into a standalone string, but not while the script is reading through the RSS feed.

I can grab most of the information I want out of the item->description area (# bids, auction end time, BIN price, thumbnail image, etc.), but it would be nicer if I could grab the info from the feed without me having to deal with grabbing all that information manually.

How can I grab custom fields from an RSS feed (short of writing regexes to parse the entire feed without a module)?

Here's the code I'm working with:

$my_limit = 0;
use LWP::Simple;
use XML::RSS;

$rss = XML::RSS->new();
$data = get( $mylink );
$rss->parse( $data );

$channel = $rss->{channel};

$NumItems = 0;
foreach $item (@{$rss->{'items'}}) {
    if ($NumItems > $my_limit) {
        last;
    }

    @current_price = $item =~ m/\<rx\:CurrentPrice.*\>(\d+)\<\/rx\:CurrentPrice\>/g;

    print "$current_price[0]";
}

If you have the RSS/XML document and want specific data, you could use XPath:

Perl CPAN XPATH

XPath Introduction
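
A minimal sketch of the XPath route, assuming the XML::LibXML module from CPAN is installed (the answer above does not say which XPath module it had in mind). The sample feed is made up for illustration; only the two rx: elements and their namespace URI come from the question.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical sample standing in for the downloaded feed.
my $xml = <<'XML';
<rss version="2.0">
  <channel>
    <title>Sample</title>
    <item>
      <title>Widget</title>
      <rx:BuyItNowPrice xmlns:rx="urn:ebay:apis:eBLBaseComponents">1395</rx:BuyItNowPrice>
      <rx:CurrentPrice xmlns:rx="urn:ebay:apis:eBLBaseComponents">1255</rx:CurrentPrice>
    </item>
  </channel>
</rss>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Bind the rx prefix to the namespace URI so XPath expressions can use it.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(rx => 'urn:ebay:apis:eBLBaseComponents');

my @prices;
for my $item ($xpc->findnodes('//item')) {
    # Paths are evaluated relative to the current <item> node.
    my $title   = $xpc->findvalue('title', $item);
    my $current = $xpc->findvalue('rx:CurrentPrice', $item);
    my $bin     = $xpc->findvalue('rx:BuyItNowPrice', $item);
    push @prices, $current;
    print "$title: current=$current, buy-it-now=$bin\n";
}
```

The prefix used in the XPath expressions only has to match the one passed to registerNs; it is the namespace URI that must agree with the feed.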

What is the way in which "it doesn't work" from an RSS feed? Do you mean no matches when there should be matches? Or one match where there should be several matches?
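
If the answer is "no matches at all", one likely cause is worth checking: XML::RSS hands back each item as a hash reference, not as raw XML text, so a regex applied to $item runs against the reference's string form. The sketch below uses only core Perl; the hash is a made-up stand-in for a real item.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one entry of @{ $rss->{items} }.
my $item = { title => 'Widget', description => 'Current bid: 1255' };

# In string context a hash reference looks like "HASH(0x...)",
# and that string is what a regex applied to $item is matched against.
my $as_string = "$item";
print "Regex sees: $as_string\n";

# Matching the reference itself finds nothing...
my @from_ref = $item =~ m/Current bid: (\d+)/;

# ...while matching one of its string fields works.
my @from_field = $item->{description} =~ m/Current bid: (\d+)/;

print "from reference: ", scalar(@from_ref), " match(es)\n";
print "from field: $from_field[0]\n";
```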

One thing that jumps out at me about your regular expression is that you use .*, which can sometimes be greedier than you want. That is, if $item contained the expression

<rx:BuyItNowPrice xmlns:rx="urn:...nts">1395</rx:BuyItNowPrice>
<rx:CurrentPrice xmlns:rx="urn:...nts">1255</rx:CurrentPrice>
<rx:BuyItNowPrice xmlns:rx="urn:...nts">1395</rx:BuyItNowPrice>
<rx:SomeMoreStuff xmlns:rx="urn:...nts">zzz</rx:SomeMoreStuff>
<rx:CurrentPrice xmlns:rx="urn:...nts">1255</rx:CurrentPrice>

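then the first part of your regular expression (\<rx\:CurrentPrice.*\>) will wind up matching the rest of line 2, all of lines 3 and 4, plus the first part of line 5 (up to the >). This assumes . is allowed to cross line breaks, either because the /s modifier is in effect or because the feed arrives as one long line; without that, . stops at each newline. Instead, you might want to use the regular expression¹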

m/\<rx:CurrentPrice[^>]*>(\d+)\<\/rx:CurrentPrice\>/

which will only match up to the closing </rx:CurrentPrice> tag after a single instance of an opening <rx:CurrentPrice> tag.
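
The difference can be seen in a small core-Perl sketch. The sample string is made up: three elements placed on a single line so that . can run across them, with a second price of 1300 added so the two results are distinguishable.

```perl
use strict;
use warnings;

my $ns = 'urn:ebay:apis:eBLBaseComponents';

# Hypothetical input: three elements on one line, no newlines in between.
my $text = join '',
    qq{<rx:CurrentPrice xmlns:rx="$ns">1255</rx:CurrentPrice>},
    qq{<rx:BuyItNowPrice xmlns:rx="$ns">1395</rx:BuyItNowPrice>},
    qq{<rx:CurrentPrice xmlns:rx="$ns">1300</rx:CurrentPrice>};

# Greedy .* runs to the end of the line and backtracks to the LAST
# ">digits</rx:CurrentPrice>", swallowing the first price on the way.
my @greedy = $text =~ m/<rx:CurrentPrice.*>(\d+)<\/rx:CurrentPrice>/g;

# [^>]* cannot cross the end of the opening tag, so each element
# is matched on its own.
my @bounded = $text =~ m/<rx:CurrentPrice[^>]*>(\d+)<\/rx:CurrentPrice>/g;

print "greedy:  @greedy\n";    # greedy:  1300
print "bounded: @bounded\n";   # bounded: 1255 1300
```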

¹ The other obvious answer is that you really don't want to use a regular expression at all, that regular expressions are inferior tools for parsing XML compared to customized parsing modules, and that all the special cases you will have to deal with using regular expressions will eventually render you unconscious from having repeatedly beaten your head against your desk. See Salgar's answer, for example.
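
For the module-based route, a hedged sketch that stays within XML::RSS itself: as far as I know, the parser keeps elements from other namespaces on each item under the namespace URI, and also under a short prefix if one was registered with add_module before parsing. The sample feed is made up, and the lookup falls back to the URI key in case the prefix key is absent in your version of the module.

```perl
use strict;
use warnings;
use XML::RSS;

# Hypothetical sample standing in for the downloaded feed.
my $feed = <<'XML';
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample</title>
    <link>http://example.com/</link>
    <description>Sample feed</description>
    <item>
      <title>Widget</title>
      <link>http://example.com/widget</link>
      <rx:BuyItNowPrice xmlns:rx="urn:ebay:apis:eBLBaseComponents">1395</rx:BuyItNowPrice>
      <rx:CurrentPrice xmlns:rx="urn:ebay:apis:eBLBaseComponents">1255</rx:CurrentPrice>
    </item>
  </channel>
</rss>
XML

my $ns  = 'urn:ebay:apis:eBLBaseComponents';
my $rss = XML::RSS->new;

# Register the namespace before parsing so its elements get a short key.
$rss->add_module(prefix => 'rx', uri => $ns);
$rss->parse($feed);

my @current;
for my $item (@{ $rss->{items} }) {
    # Assumption: namespaced fields live under the prefix or the URI.
    my $ext = $item->{rx} || $item->{$ns} || {};
    push @current, $ext->{CurrentPrice};
    print "$item->{title}: current=$ext->{CurrentPrice}, ",
          "buy-it-now=$ext->{BuyItNowPrice}\n";
}
```

Dumping one item with Data::Dumper is a quick way to confirm which keys your installed version actually produces.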
