I love BBEdit on my Mac, but I was left scratching my head again today trying to remember how to make its regex engine match a pattern across multiple lines. I wanted to extract a list of initial articles from a page that had HTML like this:
<table>
<tr>
<td valign="top" colspan="34" align="left">am</td>
<td valign="top" colspan="10" align="left">Scottish Gaelic</td>
</tr>
</table>
<table>
<tr>
<td valign="top" colspan="34" align="left">an</td>
<td valign="top" colspan="10" align="left">English,</td>
<td valign="top" colspan="10" align="left">Irish,</td>
<td valign="top" colspan="10" align="left">Scots,</td>
<td valign="top" colspan="10" align="left">Scottish Gaelic,</td>
<td valign="top" colspan="10" align="left">Yiddish</td>
</tr>
</table>
<table>
<tr>
<td valign="top" colspan="34" align="left">an t-</td>
<td valign="top" colspan="10" align="left">Irish,</td>
<td valign="top" colspan="10" align="left">Scottish Gaelic</td>
</tr>
</table>
Indeed, it has well over 100 tables like that, and I was looking for the contents of the first TD in each. The following regex does it:
(?s)[^<]*<table>[^<]*<tr>[^<]*<td[^>]*>([^<]*)</td>.*?</table>
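The most significant part of this is the (?s) at the beginning, which tells BBEdit to let the dot match line breaks, so the lazy .*? near the end can run across lines to the closing table tag. The parentheses capture the contents of the first TD in each table. A more ninja-like regex assassin would probably be able to do it better, but this worked.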
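To check the pattern outside BBEdit, here is a minimal sketch in Python, whose `re` module also accepts the inline `(?s)` flag. The sample string is a trimmed copy of the tables above, and the variable names are my own for illustration; this is not how BBEdit runs the search internally.

```python
import re

# Trimmed copy of the sample tables from the post.
html = """<table>
<tr>
<td valign="top" colspan="34" align="left">am</td>
<td valign="top" colspan="10" align="left">Scottish Gaelic</td>
</tr>
</table>
<table>
<tr>
<td valign="top" colspan="34" align="left">an</td>
<td valign="top" colspan="10" align="left">English,</td>
<td valign="top" colspan="10" align="left">Irish,</td>
</tr>
</table>
<table>
<tr>
<td valign="top" colspan="34" align="left">an t-</td>
<td valign="top" colspan="10" align="left">Irish,</td>
</tr>
</table>
"""

# Same pattern as in the post: (?s) lets "." match line breaks.
with_dotall = re.compile(r"(?s)[^<]*<table>[^<]*<tr>[^<]*<td[^>]*>([^<]*)</td>.*?</table>")

# Identical pattern without (?s), for comparison.
without_dotall = re.compile(r"[^<]*<table>[^<]*<tr>[^<]*<td[^>]*>([^<]*)</td>.*?</table>")

# findall returns the captured group, i.e. the first TD of each table.
articles = with_dotall.findall(html)
print(articles)                      # ['am', 'an', 'an t-']

# Without (?s), ".*?" cannot cross the line break after the first </td>.
print(without_dotall.findall(html))  # []
```

The negated classes like `[^<]*` already match line breaks on their own; it is only the `.*?` before the closing table tag that needs `(?s)`.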
Posted February 9, 2009 by Casey Bisson
Categories: Technology. Tags: bbedit, grep, regex, regular expressions.
1 Comment(s)
On Windows, I use biterscripting to parse across multiple lines. Just read the contents of the entire file into a string variable.
For example, I have a file page.html. I want to extract a block starting at .
var str content ; cat page.html > $content
stex -r “^^” $content
The above will extract the block across multiple lines.
For example,
stex -r “^^” “a\n\nd”
will extract “” .
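The same idea of reading the whole file into one string and then matching across lines can be sketched in Python. This is not biterscripting, and the delimiters in the comment above were lost, so the `<table>` and `</table>` markers, the helper name `extract_block`, and the stand-in file contents are all assumptions for illustration.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional


def extract_block(text: str, start: str, end: str) -> Optional[str]:
    """Return the first span from `start` through `end`, even across line breaks."""
    # re.DOTALL has the same effect as the inline (?s) flag.
    pattern = re.escape(start) + r".*?" + re.escape(end)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(0) if match else None


# Stand-in for page.html so the sketch runs on its own (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "page.html"
    page.write_text(
        "<p>intro</p>\n<table>\n<tr><td>am</td></tr>\n</table>\n<p>end</p>\n",
        encoding="utf-8",
    )
    # Read the entire file into a single string, line breaks included.
    content = page.read_text(encoding="utf-8")

block = extract_block(content, "<table>", "</table>")
print(block)
```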
Have fun with regular expressions.
Patrick