Parse HTML And Traverse DOM In PHP?

I spoke of this the other day, but now I’ve learned of PHP’s DOM functions, including DOMDocument::loadHTML(). Use it in combination with simplexml_import_dom() like this:

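```
$dom = new DOMDocument;
// loadHTML() returns false on failure; the $dom object itself is always truthy.
$loaded = $dom->loadHTML('
<ul>
  <li>one</li>
  <li>two</li>
  <li>three
    <ul><li>sublist item</li></ul>
  </li>
</ul>
');
if ($loaded) {
    $xml = simplexml_import_dom($dom);
    print_r($xml);
}
```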
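To walk the result instead of just dumping it, something like the following sketch should work. It assumes the same sample list as input, and it relies on loadHTML() wrapping a bare fragment in html and body elements, which is why the path starts at body. The names $html and $items are mine, purely for illustration.

```php
<?php
// Sketch: collect the text of each top-level list item.
$html = '<ul><li>one</li><li>two</li><li>three<ul><li>sublist item</li></ul></li></ul>';

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$loaded = $dom->loadHTML($html);
libxml_clear_errors();

$items = [];
if ($loaded) {
    $xml = simplexml_import_dom($dom);
    // loadHTML() adds <html><body> around fragments, so the list sits at body->ul.
    foreach ($xml->body->ul->li as $li) {
        // The string cast yields the element's own text, not its children's.
        $items[] = trim((string) $li);
    }
}
print_r($items);
```

Casting a SimpleXML element to a string gives only its own text, so the nested list has to be reached separately through $li->ul->li.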

This IBM developerWorks article has some more useful info.

Here’s some code I prototyped to parse out the ISBNs and LCCN (or any data, really) from an average record in Scriblio:

```
$dom = new DOMDocument;
// Only build $xml when parsing succeeds; otherwise the loop would hit an undefined variable.
if ($dom->loadHTML($content)) {
    $xml = simplexml_import_dom($dom);
    foreach ($xml->body->ul->li as $thing) {
        // The ISBN and LCCN branches were identical, so one test covers both.
        if (in_array((string) $thing['class'], array('isbn', 'lccn'))) {
            foreach ($thing->ul->li as $stuff) {
                print_r($stuff);
            }
        }
    }
}
```
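Since the same loop works for any class name, here is a rough sketch that swaps the nested foreach for SimpleXMLElement::xpath(). The record markup and the identifier values in $content are made up for the example (real Scriblio output may well differ), extract_fields() is a hypothetical helper of mine, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical record markup; the real Scriblio output may differ.
$content = '<ul>'
    . '<li class="isbn">ISBN<ul><li>0123456789</li><li>9780123456786</li></ul></li>'
    . '<li class="lccn">LCCN<ul><li>2001012345</li></ul></li>'
    . '</ul>';

/**
 * Return the text of each nested <li> under every <li class="...">
 * for the requested classes (hypothetical helper, not part of Scriblio).
 */
function extract_fields(string $html, array $classes): array
{
    $dom = new DOMDocument;
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the old setting

    $fields = [];
    if (!$loaded) {
        return $fields;
    }
    $xml = simplexml_import_dom($dom);
    foreach ($classes as $class) {
        $fields[$class] = [];
        // Class names are fixed literals here; escape them if they ever come from user input.
        foreach ($xml->xpath("//li[@class='$class']/ul/li") as $li) {
            $fields[$class][] = trim((string) $li);
        }
    }
    return $fields;
}

print_r(extract_fields($content, ['isbn', 'lccn']));
```

The XPath version does not care how deep the list sits in the page, which may matter if the record markup is wrapped in other elements. The exact-match test on @class would miss elements carrying more than one class, so treat that as a known limitation of the sketch.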