C# - Capture text inside start/end characters but ignore a doubled end character
I am trying to capture the text inside start/end characters ("<" and ">") with a regex, while ignoring a doubled end character inside the text (so ">>" should be included in the captured data).
I tried
<([^>]*)>
and
<(.*?)>(?!>)
but both fail in the following case:
input:
<test>>value>
expected output:
test>>value
but both regexes capture only part of the string.
The first one captures
test
and the second
test>
Sadly, I am out of ideas on how to approach the problem further. Does one of the regex gods have an idea how to solve this?
Edit:
Thanks for the answers. Sadly, they do not match a requirement I have (which I dropped to keep the question as short as possible, thinking it wouldn't matter... lesson learned).
input:
<test>>value><test>
expected output:
test>>value test
Using a zero-width negative lookahead assertion, so that only a > that is not followed by another > terminates the match, seems the simplest way:
<(.*)>(?!>)
This captures test>>more when matched against <test>>more>.
Note that your second regex, <(.*?)>(?!>), uses the minimal (lazy) matching modifier, so it stops at the first > that is not followed by another >.
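The difference between the greedy and the lazy form can be checked with a small sketch. The class name LookaheadDemo and the helper Capture are made up for illustration, and the third input is the one from the question's edit, included only to show where the greedy form over-captures.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Hypothetical helper: returns group 1 of the first match, or "" if there is none.
    public static string Capture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // Greedy: backtracks from the end of the input to the last > not followed by >.
        Console.WriteLine(Capture(@"<(.*)>(?!>)", "<test>>more>"));        // test>>more

        // Lazy: stops at the first > that is not followed by another >.
        Console.WriteLine(Capture(@"<(.*?)>(?!>)", "<test>>more>"));       // test>

        // Greedy with two bracketed groups: swallows both into one capture.
        Console.WriteLine(Capture(@"<(.*)>(?!>)", "<test>>value><test>")); // test>>value><test
    }
}
```

The last line shows why the greedy lookahead form is not enough once the input can contain more than one bracketed group.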
Edit:
With the additional information, <test>>more><another> should capture test>>more and another:
<([^>]*(?:>>[^>]*)*)>
Using Regex.Matches will make the above captures.
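A minimal sketch of that call follows. The class name AngleCapture and the helper Extract are assumptions made for illustration. Only Regex.Matches and the pattern come from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AngleCapture
{
    // Pattern from the answer: runs of non-> characters, optionally joined by doubled >>.
    private static readonly Regex Pattern = new Regex(@"<([^>]*(?:>>[^>]*)*)>");

    // Hypothetical helper: collects group 1 of every match in the input.
    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Prints "test>>more" and then "another", one per line.
        foreach (string s in Extract("<test>>more><another>"))
        {
            Console.WriteLine(s);
        }
    }
}
```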
Expanded:
<            # match <
(            # start capture
  [^>]*      # match any number of non-> characters
  (?:        # start non-capturing group
    >>       # match >>
    [^>]*    # match any number of non-> characters
  )*         # repeat 0 or more times
)            # end capture
>            # match >
I.e., it breaks the content of the angle brackets into >> and non-> blocks, and matches an indefinite number of them. It will also handle <>>> (captures >>).
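The expanded form can also be used directly in C# by passing RegexOptions.IgnorePatternWhitespace, which makes the engine skip unescaped whitespace in the pattern and treat # as the start of a comment. The sketch below assumes that option. The names VerbosePatternDemo and FirstCapture are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerbosePatternDemo
{
    // Same pattern as the one-line version, written with inline comments.
    public static readonly Regex Verbose = new Regex(@"
        <            # match <
        (            # start capture
          [^>]*      # match any number of non-> characters
          (?:        # start non-capturing group
            >>       # match >>
            [^>]*    # match any number of non-> characters
          )*         # repeat 0 or more times
        )            # end capture
        >            # match >
        ", RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: returns group 1 of the first match, or "" if there is none.
    public static string FirstCapture(string input)
    {
        Match m = Verbose.Match(input);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(FirstCapture("<>>>"));          // >>
        Console.WriteLine(FirstCapture("<test>>value>")); // test>>value
    }
}
```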