WordPress Strips Classnames, And How To Fix It

WordPress 2.0 introduced some sophisticated HTML inspecting and de-linting courtesy of kses.

kses is an HTML/XHTML filter written in PHP. It removes all unwanted HTML elements and attributes, and it also does several checks on attribute values. kses can be used to avoid Cross-Site Scripting (XSS), Buffer Overflows and Denial of Service attacks.

It’s a good addition, but it was also removing the class names from some of the elements of my posts. The result is that the following structured XHTML was coming through without any structure.

`


  • Title



    • The Effects Of A Modified Ball In Developing The Volleyball Pass And Set For High School Students



  • ...`

    Without the semantic value of the classnames, the XHTML loses all the microformatting, making it not only less re-usable/remixable but also harder to style.

    `


    • Title



      • The Effects Of A Modified Ball In Developing The Volleyball Pass And Set For High School Students



    • ...`

      A WordPress form post pointed me to the includes/kses.php file, where the $allowedposttags array set the standards for the acceptable tags and attributes. It begins like this:

      $allowedposttags = array ('address' => array (), 'a' => array ('href' => array (), 'title' => array (), 'rel' => array ()...

      It’s a hack, but changing the entries for some of the tags got me through.

      'ul' => array ('class' => array())