Channel: SCN : Popular Discussions - SAP SQL Anywhere

Problem with REGEXP_SUBSTR


I am having a problem with REGEXP_SUBSTR(). I am trying to extract "captureThis" from the following string:

 

Token1 blah, blah, (Token1 blah Token2 ignoreThis blah) blah, blah Token2 captureThis blah blah

 

The rules are that it must be preceded by "Token1" followed by arbitrary text, terminated by "Token2". However, if the same pattern appears in parentheses within the arbitrary text, everything in the parentheses should be ignored.
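To make the rules concrete outside the database, here is a minimal Python sketch of the intended behavior. It is only an illustration of the rules, not SQL Anywhere code and not a fix for the REGEXP_SUBSTR() call: the function name capture_after_tokens is hypothetical, and the two-step approach (strip parenthesized text first, then match) is an assumption about one way to express the rules.

```python
import re

def capture_after_tokens(text):
    """Hypothetical helper: return the word following Token2, ignoring parenthesized text."""
    # Step 1: drop every parenthesized run, so a Token1 ... Token2 pair inside it is ignored.
    stripped = re.sub(r"\([^)]*\)", "", text)
    # Step 2: Token1, then arbitrary text up to the first Token2, then capture the next word.
    match = re.search(r"Token1\s.*?\sToken2\s+(\S+)", stripped, re.S)
    return match.group(1) if match else None

sample = ("Token1 blah, blah, (Token1 blah Token2 ignoreThis blah) "
          "blah, blah Token2 captureThis blah blah")
print(capture_after_tokens(sample))  # prints "captureThis"
```

Because the parenthesized text is removed before matching, the inner "Token2 ignoreThis" can never be selected, which is the outcome I am after.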

 

If I run the following REGEXP_SUBSTR() statement, I get a result that correctly ends in "captureThis"

select REGEXP_SUBSTR('Token1 blah, blah, (Token1 blah Token2 ignoreThis blah) blah, blah Token2 captureThis blah blah',
    'Token1\\s+((\\([^\\)]+\\))|(((?!Token2).)+))*\\s+Token2\\s+\\S*' ,
  1,1);

Result: Token1 blah, blah, (Token1 blah Token2 ignoreThis blah) blah, blah Token2 captureThis

 

 

However, if I use a positive lookbehind (a zero-width assertion) so that only the text after the tokens is returned, I get a different result:

select REGEXP_SUBSTR('Token1 blah, blah, (Token1 blah Token2 ignoreThis blah) blah, blah Token2 captureThis blah blah',
    '(?<=Token1\\s+((\\([^\\)]+\\))|(((?!Token2).)+))*\\s+Token2\\s+)\\S*',
  1,1);

 

Result: ignoreThis (I need it to be "captureThis")

 

I also found an interesting result by playing with the fourth parameter, occurrence-number. In the first statement above, occurrence-number 1 is a string ending in "captureThis", and occurrence-number 2 is a string ending in "ignoreThis".

However, in the second statement, with the lookbehind, the order is reversed: occurrence-number 1 is "ignoreThis", and occurrence-number 2 is "captureThis".

 

Is there any way to alter the regular expression in the second statement so occurrence-number 1 will be "captureThis"?

 

Thanks,

Eric

