Opened 11 years ago

Closed 10 years ago

#679 closed Support Requests (fixed)

regex - perl syntax affects what gets matched

Reported by: nobody
Owned by: John Maddock
Milestone:
Component: regex
Version: None
Severity: Problem
Keywords:
Cc:

Description (last modified by John Maddock)

Hi,

I'm trying to move from 1.32 to 1.33 (I need Unicode),
and it seems that 'what gets matched' has changed: a
Perl-syntax expression no longer finds the leftmost-longest
match, but the first match found by a depth-first search
of the expression. OK, that is documented. But how do I
get around it?

The only workaround I have found is
regex::no_perl_ex, but:
1. Its effect on 'what gets matched' is not documented
(or I didn't find it), so I'm not sure it
does what I think it does.
2. Lookahead is no longer available, and I need it.

Is there a way to tell a Perl regex to find the
leftmost-longest match? Or is there another workaround?

Thanks, Moddy.

moddyt@itemfield.com

Attachments (0)

Change History (4)

comment:1 Changed 11 years ago by nobody

If you use the Perl syntax then you now get Perl matching
rules.  You can pass match_posix to the matching algorithms
to force them to use leftmost-longest, but the behaviour is
very hard to specify for Perl-expressions.  Or you can
compile the expression as a POSIX regex using
regex::extended or regex::basic and then get well-defined
leftmost longest behaviour with POSIX compatible expressions.

The problem is that mixing Perl syntax with leftmost-longest
rules raises questions that have no obvious answer: for
example, what should a non-greedy repeat do?  There
are similar issues with other Perl extensions.

HTH, John.

comment:2 Changed 11 years ago by nobody

Thanks. I'll try that.

I'm still left with one question: 

Where are the flags "match_posix" and "no_perl_ex"
documented?

Thanks, Moddy.

comment:3 Changed 10 years ago by Daryle Walker

Component: None → regex
Severity: Problem

comment:4 Changed 10 years ago by John Maddock

Description: modified (diff)
Resolution: None → fixed
Status: assigned → closed

Sorry for the delay: no_perl_ex is intended for internal use and is intentionally not documented. I've just added the new match_flag_type options to the docs in SVN.
