PHP5’s SimpleXML Now Passes CDATA Content

I didn’t hear any big announcement of it, but deep in the docs (PHP >= 5.1.0) you’ll find a note about additional Libxml parameters. In there you’ll learn about “LIBXML_NOCDATA,” and it works like this:

simplexml_load_string($xmlraw, 'SimpleXMLElement', LIBXML_NOCDATA);

Without that option (and with all previous versions of PHP/SimpleXML), SimpleXML just ignores any <![CDATA[...]]> ‘escaped’ content, such as you’ll find in almost every blog feed.
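To make the one-liner above a bit more concrete, here is a minimal sketch of parsing a feed fragment with the flag set. The feed content and the variable names other than $xmlraw are made up for illustration; only simplexml_load_string() and LIBXML_NOCDATA come from the post itself.

```php
<?php
// Hypothetical feed fragment; the element text is invented for this example.
$xmlraw = <<<XML
<rss version="2.0">
  <channel>
    <item>
      <title><![CDATA[Tom & Jerry <b>return</b>]]></title>
      <description><![CDATA[<p>Hello, world</p>]]></description>
    </item>
  </channel>
</rss>
XML;

// LIBXML_NOCDATA asks libxml to merge CDATA sections into ordinary text nodes.
$feed = simplexml_load_string($xmlraw, 'SimpleXMLElement', LIBXML_NOCDATA);
if ($feed === false) {
    die("Could not parse the XML\n");
}

// Cast the elements to strings to get at their text content.
$item        = $feed->channel->item;
$title       = (string) $item->title;
$description = (string) $item->description;

echo $title, "\n";        // Tom & Jerry <b>return</b>
echo $description, "\n";  // <p>Hello, world</p>
```

The markup inside the CDATA sections comes back as plain text, so escape it again before echoing it into an HTML page.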

cdata, cdata in php, fixed, parsing rss, php, php5, rss, simplexml, xml

6 Comments

  1. Comment by Bjorn on April 14, 2006 1:11 am

    I wish more shared hosts would pick up PHP5. All of mine are still using PHP4 and I don’t have access to any of the new XML tools, and I’m afraid to write anything with PHP4 because they may upgrade and break my scripts. It’s nice to see what there is, even if I can’t use it.

    [tags]php5[/tags]

  2. Comment by Bjorn on April 14, 2006 1:17 am

    Your comment system has some PHP errors:

    WordPress database error: [Duplicate entry '11257-php5' for key 2]
    INSERT INTO wp_bsuite_tags (`post_id`,`comment_id`,`tag`,`tag_raw`) VALUES ('11257', 35263,'php5', 'php5')

    Warning: Cannot modify header information - headers already sent by (output started at /home/mais04/public_html/blog/wp-includes/wp-db.php:102) in /home/mais04/public_html/blog/wp-comments-post.php on line 55

    Warning: Cannot modify header information - headers already sent by (output started at /home/mais04/public_html/blog/wp-includes/wp-db.php:102) in /home/mais04/public_html/blog/wp-comments-post.php on line 56

    Warning: Cannot modify header information - headers already sent by (output started at /home/mais04/public_html/blog/wp-includes/wp-db.php:102) in /home/mais04/public_html/blog/wp-comments-post.php on line 57

    Warning: Cannot modify header information - headers already sent by (output started at /home/mais04/public_html/blog/wp-includes/wp-db.php:102) in /home/mais04/public_html/blog/wp-includes/pluggable-functions.php on line 247

    [tags]php errors[/tags]

  3. Comment by Morgan on December 12, 2006 4:37 pm

    Googled looking for a solution to simpleXml skipping text wrapped in CDATA nodes. Got your page, and tossed in the libXml2 parameter. Worked great! Thanks bud!

  4. Comment by nico on March 6, 2007 9:32 am

    thanx a lot for my nerves with this article !

  5. Comment by James Constable on June 12, 2007 5:05 am

    Thanks. This was doing my head in for a couple of hours yesterday. Thought my XML was at fault until I found this article. Glad they’ve fixed it. It’s a major oversight otherwise.

  6. Comment by Evan K on July 11, 2007 11:27 am

    For anyone who’s still stuck using PHP 5’s simplexml before v5.1.0 (like me), you can use a fairly simple regex to filter any cdata and collapse them into regular text nodes:

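    function cdata_escape_match($m) {
        // htmlspecialchars() only emits entities that XML predefines (&amp; &lt; &gt; &quot;)
        return htmlspecialchars($m[1]);
    }

    function cdata_to_text($text) {
        // 's' lets '.' span newlines; the callback replaces the unsafe 'e' modifier
        return preg_replace_callback('/<!\[CDATA\[(.*?)\]\]>/s', 'cdata_escape_match', $text);
    }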
