
I'm having a real scripter-block here.
(Monday 25th February 2002)

As a long-time Perl scripter, I'm used to doing string manipulation with regexes. I've read and understood Mastering Regular Expressions well enough to have written a rather nice XML parser in pure Perl.

But today I have a problem I can't solve with regex.

I have the following kind of tags in my project:
<span ptal:content="hello">
    some text
</span>

No great problem there:
"/<((\w+)[^>]*)\s+ptal:content=\"([^\"]+)\"([^>]*)>(.*?)<\/\\2>/s"
will match the whole thing (the trailing s modifier lets . match across the line breaks).
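A minimal sketch of that match, written in Python purely for illustration (the post itself does not use Python). The pattern is the one quoted above as a raw string. re.DOTALL, the counterpart of Perl's /s modifier, is set so that . can span the line breaks in the example. The variable names are made up for the demo.

```python
import re

# The pattern from the post as a Python raw string.
# Group 2 is the tag name, group 3 the ptal:content value, group 5 the body.
PATTERN = re.compile(
    r'<((\w+)[^>]*)\s+ptal:content="([^"]+)"([^>]*)>(.*?)</\2>',
    re.DOTALL,  # let "." match newlines too
)

snippet = '<span ptal:content="hello">\n    some text\n</span>'

m = PATTERN.search(snippet)
tag_name = m.group(2)       # name of the element
content_expr = m.group(3)   # value of the ptal:content attribute
inner = m.group(5)          # everything between the opening and closing tag

print(tag_name, content_expr, inner.strip())
```

The backreference \2 ties the closing tag to the opening tag's name, which is all the simple, un-nested case needs.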

But now comes the challenge:
<span ptal:define="say hello">
  <span ptal:content="say">
      some text
  </span>
</span>


The problem is that any regex of the kind above will match only the following part, stopping at the first closing tag:
<span ptal:define="say hello">
  <span ptal:content="say">
      some text
  </span>

The outer closing tag is left over:
</span>

The regex takes the first opening tag and matches until it finds a matching closing tag, disregarding any nesting.

And making the regex greedy is no solution either, as it would then match from the first <span> to the very last </span>...
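Both failure modes are easy to reproduce. The sketch below uses Python for illustration and a simplified, hypothetical ptal:define variant of the pattern (LAZY and GREEDY are names invented here); it differs from the original only in the attribute name and in (.*?) versus (.*).

```python
import re

# Hypothetical ptal:define variants of the pattern, lazy and greedy.
LAZY = re.compile(r'<(\w+)\s+ptal:define="([^"]+)"[^>]*>(.*?)</\1>', re.DOTALL)
GREEDY = re.compile(r'<(\w+)\s+ptal:define="([^"]+)"[^>]*>(.*)</\1>', re.DOTALL)

nested = (
    '<span ptal:define="say hello">\n'
    '  <span ptal:content="say">\n'
    '      some text\n'
    '  </span>\n'
    '</span>'
)

# Lazy: stops at the FIRST </span>, which belongs to the inner element.
lazy_match = LAZY.search(nested)
print(repr(lazy_match.group(0)))

# Greedy: with two sibling elements, runs from the first opening tag
# to the very LAST </span>, swallowing both elements in one match.
two_elements = nested + "\n" + nested
greedy_match = GREEDY.search(two_elements)
print(greedy_match.group(0) == two_elements)
```

Neither quantifier knows how many tags have been opened in between, so neither can pick the closing tag that actually balances the opening one.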

I guess some programmatic string parsing is called for here... Darn, if only I could do that...
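One possible shape for such programmatic parsing is a depth counter: find the opening tag, then walk over every later tag of the same name, adding one for each opening tag and subtracting one for each closing tag until the count returns to zero. The sketch below is in Python for illustration only; find_element and its return shape are hypothetical, and it deliberately ignores self-closing tags, comments, and > inside attribute values.

```python
import re

def find_element(text, attr):
    """Return (tag, attr_value, inner_text, end_offset) for the first
    element carrying the attribute `attr`, or None if there is none."""
    open_re = re.compile(
        r'<(\w+)\b[^>]*?\s%s="([^"]*)"[^>]*>' % re.escape(attr)
    )
    m = open_re.search(text)
    if m is None:
        return None
    tag, value = m.group(1), m.group(2)

    # Any opening or closing tag with the same name; group 1 is "/" for a close.
    tag_re = re.compile(r'<(/?)%s\b[^>]*>' % re.escape(tag))
    depth = 1  # we are inside the element we just opened
    for t in tag_re.finditer(text, m.end()):
        depth += -1 if t.group(1) else 1
        if depth == 0:
            # t is the closing tag that balances our opening tag.
            return tag, value, text[m.end():t.start()], t.end()
    return None  # ran out of input: tags are unbalanced

nested = (
    '<span ptal:define="say hello">\n'
    '  <span ptal:content="say">\n'
    '      some text\n'
    '  </span>\n'
    '</span>'
)

outer = find_element(nested, "ptal:define")
inner = find_element(outer[2], "ptal:content")
print(outer[1], "|", inner[1], "|", inner[2].strip())
```

Regexes still do the tokenizing here; only the pairing of opening and closing tags is handed over to ordinary code, which is the part a single pattern cannot count.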

Some pointers I found:

[ by Martin ]


Martin Spernau
© 1994-2003
