C# - Capture text inside start/end characters but ignore a doubled end character


I am trying to capture the text inside start/end characters ("<" and ">") with a regex, while ignoring a doubled end character inside the text (so ">>" should be included in the captured data).

I tried

<([^>]*)> 

and

<(.*?)>(?!>) 

but both fail in the following case:

input:

<test>>value> 

expected output:

test>>value 

But each regex captures only part of the string.

The first one captures

test 

and the second one captures

test>  
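For reference, the two attempts can be reproduced with a small console program. This is only a sketch: the class name FirstAttempts and the helper FirstCapture are made up for illustration and are not part of the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only to reproduce the two attempts above.
public static class FirstAttempts
{
    // Returns the text of capture group 1 for the first match,
    // or an empty string when the pattern does not match at all.
    public static string FirstCapture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        const string input = "<test>>value>";

        // [^>]* cannot cross any '>', so the capture ends at the first one.
        Console.WriteLine(FirstCapture(@"<([^>]*)>", input));    // test

        // Lazy .*? ends at the first '>' that is not followed by '>',
        // which is the second character of the doubled ">>".
        Console.WriteLine(FirstCapture(@"<(.*?)>(?!>)", input)); // test>
    }
}
```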

Sadly I am out of ideas on how to approach the problem further. Does one of the regex gods have an idea how to solve this?

edit:

Thanks for the answers, but sadly they do not match a requirement I have (which I dropped to keep the question as short as possible, thinking it wouldn't matter... lesson learned).

input:

<test>>value><test> 

expected output:

test>>value test 

Using a zero-width negative lookahead assertion to match a > that is not followed by another > to terminate the match seems the simplest way:

<(.*)>(?!>) 

This captures test>>more when matched against <test>>more>.

Note that your second regex (<(.*?)>(?!>)) uses the minimal matching (lazy) modifier, so it stops at the first > that is not followed by another >.
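The difference between the greedy and the lazy quantifier can be checked with a short sketch like the one below. The class name GreedyVsLazy and the helper Capture are hypothetical. The last call also shows why the greedy version is not enough once the input contains more than one bracketed group.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class comparing greedy and lazy matching with the lookahead.
public static class GreedyVsLazy
{
    // Returns capture group 1 of the first match, or an empty string if nothing matches.
    public static string Capture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // Greedy .* runs to the end of the input and backtracks
        // to the last '>' that is not followed by another '>'.
        Console.WriteLine(Capture(@"<(.*)>(?!>)", "<test>>more>"));          // test>>more

        // Lazy .*? stops at the first '>' not followed by '>',
        // which is the second character of ">>".
        Console.WriteLine(Capture(@"<(.*?)>(?!>)", "<test>>more>"));         // test>

        // With two groups in the input, greedy .* swallows both of them.
        Console.WriteLine(Capture(@"<(.*)>(?!>)", "<test>>more><another>")); // test>>more><another
    }
}
```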

edit:

With the additional information, <test>>more><another> should capture test>>more and another:

 <([^>]*(?:>>[^>]*)*)> 

Use Regex.Matches to make the above captures.

Expanded:
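A minimal sketch of how this could look with Regex.Matches follows. The class name BracketCaptures and the method Extract are illustrative names, not something from the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical wrapper that collects every bracketed value from an input string.
public static class BracketCaptures
{
    private static readonly Regex Pattern = new Regex(@"<([^>]*(?:>>[^>]*)*)>");

    // Returns the content of capture group 1 for every match, in input order.
    public static List<string> Extract(string input)
    {
        var result = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        foreach (string value in Extract("<test>>more><another>"))
        {
            Console.WriteLine(value);
        }
        // Prints:
        // test>>more
        // another
    }
}
```

The captured text still contains the doubled >> exactly as it appears in the input.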

 <            # match <
 (            # start capture
   [^>]*      #   match any number of non-> characters
   (?:        #   start non-capturing group
     >>       #     match >>
     [^>]*    #     match any number of non-> characters
   )*         #   repeat 0 or more times
 )            # end capture
 >            # match >

I.e. it breaks the content of the angle brackets into >> pairs and non-> blocks, and matches an indefinite number of them. It also handles <>>> (captures >>).
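If the commented form is preferred in source code, .NET can take it almost verbatim via RegexOptions.IgnorePatternWhitespace, which ignores unescaped whitespace in the pattern and treats # as the start of a comment. The sketch below assumes that option; the class name VerbosePattern and the method Capture are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class: the same pattern written in commented (verbose) form.
public static class VerbosePattern
{
    public static readonly Regex Pattern = new Regex(@"
        <            # match <
        (            # start capture
          [^>]*      #   any number of non-> characters
          (?:        #   start non-capturing group
            >>       #     match >>
            [^>]*    #     any number of non-> characters
          )*         #   repeat 0 or more times
        )            # end capture
        >            # match >
        ", RegexOptions.IgnorePatternWhitespace);

    // Returns capture group 1 of the first match, or an empty string if nothing matches.
    public static string Capture(string input)
    {
        Match m = Pattern.Match(input);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(Capture("<>>>"));          // >>
        Console.WriteLine(Capture("<test>>value>")); // test>>value
    }
}
```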

