C# - Capture text inside start/end characters but ignore a doubled end character
I am trying to capture the text inside start/end characters ("<" and ">") with a regex, while ignoring a doubled end character inside the text (so ">>" should be included in the captured data).
I tried
<([^>]*)>
and
<(.*?)>(?!>)
but both fail in the following case:
Input:
<test>>value>
Expected output:
test>>value
Both regexes capture only part of the string.
The first one captures
test
and the second
test>
Sadly, I am out of ideas on how to approach the problem further. Do any of the regex gods have ideas on how to solve this?
Edit:
Thanks for the answers, but sadly they do not match a requirement I have (which I dropped to keep the question as short as possible, thinking it wouldn't matter... lesson learned).
Input:
<test>>value><test>
Expected output:
test>>value
test
Using a zero-width negative lookahead assertion, so that only a > that is not followed by another > terminates the match, seems the simplest way:
<(.*)>(?!>)
captures test>>more when matched against <test>>more>.
Note that your second regex (<(.*?)>(?!>)) uses the minimal (lazy) matching modifier, so it stops at the first > that is not followed by >.
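The behaviour of the three patterns discussed so far can be checked with a minimal C# sketch. The class name LookaheadDemo and the helper FirstCapture are made up for illustration and are not part of the original answer:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Hypothetical helper: returns capture group 1 of the first match,
    // or null when the pattern does not match at all.
    public static string FirstCapture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Stops at the very first '>'.
        Console.WriteLine(FirstCapture(@"<([^>]*)>", "<test>>value>"));    // test
        // Lazy: stops at the first '>' that is not followed by '>'.
        Console.WriteLine(FirstCapture(@"<(.*?)>(?!>)", "<test>>value>")); // test>
        // Greedy: runs to the last '>' that is not followed by '>'.
        Console.WriteLine(FirstCapture(@"<(.*)>(?!>)", "<test>>more>"));   // test>>more
    }
}
```

Because the greedy version runs to the last qualifying >, it captures test>>more><another from 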
Edit:
With the additional information that <test>>more><another> should capture test>>more and another:
<([^>]*(?:>>[^>]*)*)>
Using Regex.Matches with this pattern yields the captures above.
Expanded:
<             # match <
(             # start capture
  [^>]*       # match any number of non-> characters
  (?:         # start non-capturing group
    >>        # match >>
    [^>]*     # match any number of non-> characters
  )*          # repeat 0 or more times
)             # end capture
>             # match >
I.e. it breaks the content of the angle brackets into >> and non-> blocks, and matches an indefinite number of them. It also handles <>>> (capturing >>).
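As a rough illustration of the Regex.Matches usage mentioned above, the following C# sketch collects capture group 1 from every match. The class name BracketCapture and the helper Extract are hypothetical names chosen for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BracketCapture
{
    // Pattern from the answer: runs of non-'>' characters,
    // optionally separated by doubled ">>" sequences.
    private static readonly Regex Pattern = new Regex(@"<([^>]*(?:>>[^>]*)*)>");

    // Hypothetical helper: returns capture group 1 of every match, in order.
    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Prints "test>>more" and then "another".
        foreach (string s in Extract("<test>>more><another>"))
        {
            Console.WriteLine(s);
        }

        // Prints ">>".
        Console.WriteLine(string.Join(",", Extract("<>>>")));
    }
}
```

The same helper applied to the input from the question, 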