containsregex and concat

2006-11-28       - By George Bills

Thanks, the regular expression works now, which is progress.
Unfortunately I'm getting all of the concatenated text, not just the
matching text. If I use replace:
 <!--<tokenfilter><filetokenizer />-->
   <containsregex flags="isg"
     byline="false" <!-- implies filetokenizer -->
   <!-- </tokenfilter>-->

I end up getting something like:
[concat] <html>
[concat] <head>
[concat] <title>summary</title>
[concat] <link rel="stylesheet" href="summary.css" type="text/css">
[concat] </head>
[concat] <body>
[concat] <a name="overview"></a>
[concat] <center>
[concat] </center>
[concat] ...more HTML here...
[concat] </html>

I'm assuming it's because the file is just one big token - but if I use
a line tokenizer, will I be able to match regular expressions over
multiple lines?
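
To illustrate the question above: in general a regex substitution rewrites only the matched regions of its input and leaves everything else in place, which would explain getting the whole file back when the file is a single token. The following is a minimal sketch in plain Java using java.util.regex, not Ant's own filter classes. The class name, method names, and sample HTML are hypothetical, and the pattern is the one quoted later in this thread. It also shows that a line-by-line pass cannot match a table that spans several lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TableExtractDemo {
    // Pattern from the thread; CASE_INSENSITIVE and DOTALL stand in for the "i" and "s" flags.
    static final Pattern TABLE = Pattern.compile(
            "<table class=\"summary\".*?>.*?</table>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical sample input: one summary table spread over three lines.
    static final String SAMPLE =
            "<html>\n"
            + "<body>\n"
            + "<table class=\"summary\" border=\"1\">\n"
            + "<tr><td>a</td></tr>\n"
            + "</table>\n"
            + "<p>other</p>\n"
            + "</body>\n"
            + "</html>\n";

    // Substitution over the whole text: only matched regions change, the rest is kept.
    static String substitute(String wholeFile, String replacement) {
        return TABLE.matcher(wholeFile).replaceAll(replacement);
    }

    // Extraction: keep only the matched text and discard everything else.
    static List<String> extract(String wholeFile) {
        List<String> found = new ArrayList<>();
        Matcher m = TABLE.matcher(wholeFile);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Line-by-line matching: counts lines that contain a complete match on their own.
    static int countLineMatches(String wholeFile) {
        int count = 0;
        for (String line : wholeFile.split("\n")) {
            if (TABLE.matcher(line).find()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(substitute(SAMPLE, "X"));
        System.out.println(extract(SAMPLE));
        System.out.println(countLineMatches(SAMPLE));
    }
}
```

In this sketch, substitute returns the full document with the table swapped out, extract returns only the table, and the per-line count is zero because no single line holds both the opening and closing tag.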

Thanks for the help.

Rebhan, Gilbert wrote:
> Hi,
> <table[^>/]*>(.*?)</table>
> should match :
> <table class="summary">foobar</table>
> also with more than one attribute
> <table class="summary" foo="bar">foobar</table>
> foobar is \1 (group 1)
> Regards, Gilbert
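
The suggested pattern can be checked outside Ant. This is a small sketch using java.util.regex against the two sample tags from the message; the class and method names are hypothetical, and the DOTALL and CASE_INSENSITIVE flags are assumed to correspond to the "s" and "i" flags used elsewhere in the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TableBodyGroup {
    // [^>/]* accepts any run of attributes that contains neither '>' nor '/'.
    static final Pattern TABLE = Pattern.compile(
            "<table[^>/]*>(.*?)</table>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the text captured by group 1 of the first match, or null if nothing matches.
    static String body(String html) {
        Matcher m = TABLE.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(body("<table class=\"summary\">foobar</table>"));
        System.out.println(body("<table class=\"summary\" foo=\"bar\">foobar</table>"));
    }
}
```

One caveat with this pattern: because the character class excludes '/', an opening tag whose attribute value contains a slash (a URL, for example) will not match.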
> -----Original Message-----
> From: George Bills [mailto:gbills@(protected)]
> Sent: Monday, November 27, 2006 6:41 AM
> To: Ant Users List
> Subject: Re: containsregex and concat
> Hrm, it probably isn't since advanced regexes are still black magic to
> me. The "." was supposed to match any character, including a newline
> (with the s flag), the * to say match 0-n of them and the ? to say be
> lazy, match as little as possible (so that I don't pull in
> <table>...</table><table>...</table> in one match).
> I just tried [^<], but it doesn't seem to work - I think because of such
> things as "<table><tr>...</tr></table>" - the opening bracket of <tr>
> conflicts. I tried [.<>]*? to make sure that the "regex.body" part
> was matching the brackets, but that didn't work either.
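
The difference between the greedy, lazy, and [^<] variants described above can be seen in a short sketch. It uses java.util.regex with DOTALL assumed as the counterpart of the "s" flag; the class name, helper method, and sample input are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LazyVsGreedy {
    // Hypothetical input: two tables, each with a nested <tr>, separated by a newline.
    static final String TWO_TABLES =
            "<table><tr>1</tr></table>\n<table><tr>2</tr></table>";

    // Returns the first match of regex in input with '.' also matching newlines, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy: runs from the first <table> to the last </table>.
        System.out.println(firstMatch("<table>.*</table>", TWO_TABLES));
        // Lazy: stops at the first </table>.
        System.out.println(firstMatch("<table>.*?</table>", TWO_TABLES));
        // [^<]* cannot cross the '<' of the nested <tr>, so there is no match.
        System.out.println(firstMatch("<table>[^<]*</table>", TWO_TABLES));
    }
}
```

Inside a character class the dot is a literal '.', so a class such as [.<>] matches only those three characters and not arbitrary text.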
> Also, <table class="summary"> was wrong - <table class="summary"(.*?)>
> is a little better since the tables can have more than the class
> attribute (in fact, all of them do). But after changing that I'm
> matching the entire document - <html> through to </html>. That might
> just be because I'm using filetokenizer - if I make one match within
> filetokenizer, do I end up getting the entire document? If so, how do I
> get only the matching text?
> Regex is now: <table class="summary".*?>.*?</table>
> Thanks for the help, I appreciate it.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@(protected)
> For additional commands, e-mail: user-help@(protected)
