PHP5’s SimpleXML Now Passes CDATA Content

I didn’t hear any big announcement of it, but deep in the docs (PHP 5.1.0 and later) you’ll find a note about additional Libxml parameters. In there you’ll learn about “LIBXML_NOCDATA,” and it works like this:

simplexml_load_string($xmlraw, 'SimpleXMLElement', LIBXML_NOCDATA);

Without that option (and with all previous versions of PHP/SimpleXML), SimpleXML just ignores any <![CDATA[...]]> ‘escaped’ content, such as you’ll find in almost every blog feed.
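Here is a minimal sketch of the call in context. The feed fragment and its element names are made up for illustration, and the only thing assumed about your setup is PHP 5.1.0 or later with SimpleXML enabled. The flag folds CDATA sections into ordinary text nodes, which tends to be most visible when the parsed object is dumped or converted wholesale (for example with print_r).

```php
<?php
// Hypothetical feed fragment; element names are illustrative only.
$xmlraw = '<item><title><![CDATA[Hello <b>world</b> & friends]]></title></item>';

// The third argument takes libxml option constants.
// LIBXML_NOCDATA merges CDATA sections into regular text nodes.
$xml = simplexml_load_string($xmlraw, 'SimpleXMLElement', LIBXML_NOCDATA);

if ($xml === false) {
    // simplexml_load_string() returns false when the input is not well-formed.
    die("Could not parse XML\n");
}

// Cast the element to a string to get its text content.
$title = (string) $xml->title;
echo $title, "\n";
```

The same third argument is accepted by simplexml_load_file(), so the option can be used when loading a feed from a file or URL as well.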

7 Comments to “PHP5’s SimpleXML Now Passes CDATA Content”

  1. I wish more shared hosts would pick up PHP5. All of mine are still using PHP4, so I don’t have access to any of the new XML tools, and I’m afraid to write anything with PHP4 because they may upgrade and break my scripts. It’s nice to see what there is, even if I can’t use it.

  2. Your comment system has some PHP errors:

    WordPress database error: [Duplicate entry '11257-php5' for key 2]
    INSERT INTO wp_bsuite_tags (`post_id`,`comment_id`,`tag`,`tag_raw`) VALUES ('11257', 35263, 'php5', 'php5')

    Warning: Cannot modify header information - headers already sent by (output started at /home/mais04/public_html/blog/wp-includes/wp-db.php:102) in /home/mais04/public_html/blog/wp-comments-post.php on line 55

    Warning: Cannot modify header information - headers already sent by (output started at /home/mais04/public_html/blog/wp-includes/wp-db.php:102) in /home/mais04/public_html/blog/wp-comments-post.php on line 56

    Warning: Cannot modify header information - headers already sent by (output started at /home/mais04/public_html/blog/wp-includes/wp-db.php:102) in /home/mais04/public_html/blog/wp-comments-post.php on line 57

    Warning: Cannot modify header information - headers already sent by (output started at /home/mais04/public_html/blog/wp-includes/wp-db.php:102) in /home/mais04/public_html/blog/wp-includes/pluggable-functions.php on line 247

  3. Googled looking for a solution to simpleXml skipping text wrapped in CDATA nodes. Got your page, and tossed in the libXml2 parameter. Worked great! Thanks bud!

  4. Thanks a lot, this article saved my nerves!

  5. Thanks. This was doing my head in for a couple of hours yesterday. Thought my XML was at fault until I found this article. Glad they’ve fixed it. It’s a major oversight otherwise.

  6. For anyone who’s still stuck using PHP 5’s simplexml before v5.1.0 (like me), you can use a fairly simple regex to filter any cdata and collapse them into regular text nodes:

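    // Callback: escape one CDATA section's contents so they are safe as XML text.
    function cdata_escape($matches) {
        return htmlspecialchars($matches[1]);
    }

    // Replace every CDATA section with its escaped contents (the "s" flag lets "." span newlines).
    function cdata_to_text($text) {
        return preg_replace_callback('/<!\[CDATA\[(.*?)\]\]>/s', 'cdata_escape', $text);
    }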

  7. Thanks man, just what I needed. Owe you a beer :)
