capture text, including tags from string, and then reorder tags with text

2023-01-03 16:14 问答作者：

I have the following text:

abcabcabcabc<2007-01-12><name1><2007-01-12>abcabcabcabc<name2><2007-01-11>abcabcabcabc<name3><2007-02-12>abcabcabcabc<name4>abcabcabcabc<2007-03-12><name5><date>abcabcabcabc<name6>

I need to use regular expressions in order to clean the above text:

The basic extraction rule is:

<2007-01-12>abcabcabcabc<name2>

I have no problem extracting this pattern. My issue is that within th text I have malformed sequences: If the text doesn't start with a date, and end with a name my extraction fails. For example, the text above may have several mal formed sequences, such as:

abcabcabcabc<2007-01-12><name1>

Should be:

<2007-01-12>abcabcabcabc<name1>

Is it possible to have a regular expression that would clean the above, pri开发者_如何学Pythonor to extracting my consistent pattern. In short, i need to find all mal formed patterns, and then take the date tag and put it in front of it, as provided in the example above.

Thanks.

Do you need something like this perhaps?

public class Extract {
    public static void main(String[] args) {
        String text =
            "abcabcabcabc<2007-01-12><name1>" +
            "<2007-01-12>abcabcabcxxx<name2>" +
            "<2007-01-11>abcabcabcyyy<name3>" +
            "<2007-02-12>abcabcabczzz<name4>" +
            "abcabcabc123<2007-03-12><name5>" +
            "<date>abcabcabc456<name6>";
        System.out.println(
            text.replaceAll(
                "(text)<(text)>(text)<(text)>"
                    .replace("text", "[^<]*"),
                "$1$3 - $2 - $4\n"
            )
        );
    }
}

This prints:

abcabcabcabc - 2007-01-12 - name1
abcabcabcxxx - 2007-01-12 - name2
abcabcabcyyy - 2007-01-11 - name3
abcabcabczzz - 2007-02-12 - name4
abcabcabc123 - 2007-03-12 - name5
abcabcabc456 - date - name6

Essentially, there are 3 parts:

The naked text is captured by \1 and \3 -- one of these should be an empty string
The date is \2
The name is \4

You can of course use a Matcher and extract individual group too.

References

regular-expressions.info/Grouping

继续阅读：regex

capture text, including tags from string, and then reorder tags with text

References

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？

References

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生 新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？