[wellylug] Regex, replace stuff in command line help...

Grant McLean grant at mclean.net.nz
Thu Jun 29 11:37:02 NZST 2006


On Thu, 2006-06-29 at 11:01 +1200, David Antliff wrote:
> 
> On Thu, 29 Jun 2006, Grant McLean wrote:
> > Using '.*' in a Perl regex is frequently an error.  I added the trailing
> > '?' which causes it to find the first match rather than the longest.
> 
> While I left out any anchors because I don't know the full context, it 
> is my understanding that .*? will successfully match nothing. Not very 
> useful.

The '?' normally means the thing before it is optional, but since (as
you observed) that would not be very useful after .*, in that context it
has a different meaning.  It makes the immediately preceding quantifier
(the *) 'non-greedy'.  The regex engine then tries the shortest possible
match first and extends it only as far as needed for the rest of the
pattern to match, rather than the default behaviour of grabbing as much
as it can and backing off only when it has to.

An example may clarify:

  $ echo '<img src="x.gif" border="0">' | \
  > perl -nle '/src="(.*?)"/ && print $1'
  x.gif

Versus, without the '?':

  $ echo '<img src="x.gif" border="0">' | \
  > perl -nle '/src="(.*)"/ && print $1'
  x.gif" border="0

Cheers
Grant



