Getting a substring out of an item in Yahoo Pipes

2023-02-11 08:59 Q&A author:

I have the following situation:

item.
   content => "This is a 48593 test"
   title => "the title"

item.
   content => "This is a 48593 test 3255252"
   title => "the title"

item.
   content => "This 35542 is a 48593 test"
   title => "the title"

item.
   content => "i havent exactly 5 digits 34567654"
   title => "the title"

These are my current items in the console of Pipes.

Now I want to replace "content" with the last match of a number that has exactly 5 digits. Wanted result:

item.
   content => "48593"
   title => "the title"

item.
   content => "48593"
   title => "the title"

item.
   content => "48593"
   title => "the title"

item.
   content => ""
   title => "the title"

Is there a way to do this in Pipes 2?

Please comment if something is unclear.

Use the regex module like this:

In item.content replace (.*) with X $1

In item.content replace .*\b(\d{5})\b.* with $1

In item.content replace X .* with nothing (leave field empty)
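To see what the three rules do to each item, here is a rough Python sketch that imitates them with re.sub. It is not Pipes itself: the function name last_five_digit_number is made up for illustration, and count=1 reflects my assumption that each rule is applied once per item (no global flag).

```python
import re

def last_five_digit_number(content):
    # Hypothetical helper that mimics the three Regex-module rules.
    # count=1 assumes a non-global replacement for each rule.

    # Rule 1: put the marker "X " in front of every string.
    marked = re.sub(r"(.*)", r"X \1", content, count=1)

    # Rule 2: if a standalone 5-digit number exists, the whole string
    # (marker included) is replaced by the last such number.
    extracted = re.sub(r".*\b(\d{5})\b.*", r"\1", marked, count=1)

    # Rule 3: strings that still start with the marker did not match
    # rule 2, so they are emptied.
    return re.sub(r"X .*", "", extracted, count=1)

items = ("This is a 48593 test",
         "This is a 48593 test 3255252",
         "This 35542 is a 48593 test",
         "i havent exactly 5 digits 34567654")

for content in items:
    print(repr(last_five_digit_number(content)))
```

The first three items print '48593' and the last one prints ''.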

Here's an example pipe

Some Explanations

\d{5} matches exactly five digits.
\b matches a word boundary, so five digits inside a longer number are not matched.
The X at the beginning marks the strings where the regular expression doesn't match, so that they can be emptied afterwards.
Finding the last number rather than the first is the default behavior, because * is a greedy operator: the leading .* consumes as much as possible before the number is matched.
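The greediness and word-boundary points can be checked quickly in Python, whose regex syntax is close enough for this purpose. This is only a small sketch; the sample string and the variable names are mine, not from the pipe.

```python
import re

text = "This 35542 is a 48593 test 3255252"

# Greedy .* grabs as much as it can and then backtracks,
# so the LAST standalone 5-digit number is captured.
greedy = re.search(r".*\b(\d{5})\b", text).group(1)

# Lazy .*? grabs as little as it can,
# so the FIRST standalone 5-digit number is captured.
lazy = re.search(r".*?\b(\d{5})\b", text).group(1)

# Without \b, five digits from inside the longer number are accepted.
no_boundary = re.search(r".*(\d{5})", text).group(1)

print(greedy)       # 48593
print(lazy)         # 35542
print(no_boundary)  # 55252
```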

Sorry, I don't know any language other than Python.

But since your problem interested me, and since regexes are more or less the same in all languages, I propose my solution in Python:

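import re

# Capture the last run of exactly five digits (no digit before or after it).
# If there is none, \Z matches at the end of the string, so search() still
# succeeds and group 1 is simply left unset.
pat = re.compile(r"(?:.*((?<!\d)(?:\d{5})(?!\d))|\Z).*")

gh = ("This is a 48593 test",
      "This is a 48593 test 3255252",
      "This 35542 is a 48593 test",
      "i havent exactly 5 digits 34567654")

for x in gh:
    print(x)
    # groups("") returns "" for a group that did not take part in the match
    print('AAA' + pat.search(x).groups("")[0] + 'ZZZ')
    print()

Result: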

This is a 48593 test
AAA48593ZZZ

This is a 48593 test 3255252
AAA48593ZZZ

This 35542 is a 48593 test
AAA48593ZZZ

i havent exactly 5 digits 34567654
AAAZZZ

The 'AAA' and 'ZZZ' serve only to show that the 4th result is the empty string "".

The "" in groups("") is the default value returned for a group that did not take part in the match.

Without it, the 4th result would be None:

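import re

# Same pattern as above; group 1 stays unset when only \Z matches.
pat = re.compile(r"(?:.*((?<!\d)(?:\d{5})(?!\d))|\Z).*")

gh = ("This is a 48593 test",
      "This is a 48593 test 3255252",
      "This 35542 is a 48593 test",
      "i havent exactly 5 digits 34567654")

for x in gh:
    print(x)
    # groups() without a default returns None for an unset group
    print(pat.search(x).groups()[0])
    print()

which results in: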

This is a 48593 test
48593

This is a 48593 test 3255252
48593

This 35542 is a 48593 test
48593

i havent exactly 5 digits 34567654
None
