An Example of How To Remove Empty HTML Tags with PHP
2014
In his latest blog post, Tom McFarlin gives us “An Example of How To Remove Empty HTML Tags”. Empty tags can be a real nightmare, since they can wreck the layout of an article. Take empty <p> tags, for example, which easily produce huge gaps between paragraphs. There is no reliable fix for this in CSS, and calling the designer is a wasted call. WYSIWYG editors in particular produce these problems easily, and a lot of WordPress users probably know what I am talking about.
So I was quite curious about this topic and read his blog post. I was a bit disappointed to see that he tackles the problem of empty HTML tags with a JavaScript solution:
    (function ($) {
        'use strict';

        $('.comment code').each(function () {
            if ('' === $.trim($(this).text())) {
                $(this).remove();
            }
        });
    }(jQuery));
This solution is neat, no question, at least for browsers with JavaScript enabled. But it has some disadvantages:
- The browser has to take care of the problem, which costs time: first you retrieve data you don’t need, and then you have to remove it again.
- Empty tags are something like silent conversations. What does <em></em> mean? I emphasize nothing?
- The browser needs to have JavaScript enabled.
So a server-side solution would be my preferred way. I searched a bit to see whether such a solution exists, and I found this blog post by CodeSnap:
    /**
     * @version 1.0
     * @param string $str String to remove tags.
     * @param string $repto Replace empty string with.
     * @return string Cleaned string.
     */
    function remove_empty_tags_recursive ($str, $repto = NULL)
    {
        //** Return if string not given or empty.
        if (!is_string ($str) || trim ($str) == '')
            return $str;

        //** Recursive empty HTML tags.
        return preg_replace (
            //** Pattern written by Junaid Atari.
            '/<([^<\/>]*)>([\s]*?|(?R))<\/\1>/imsU',

            //** Replace with nothing if string empty.
            !is_string ($repto) ? '' : $repto,

            //** Source string
            $str
        );
    }

    /*
    +=====================================
    | EXAMPLE
    +=====================================
    */
    $str = <<<EOF
    ...
    EOF;

    echo remove_empty_tags_recursive ($str);

    /*
    +=====================================
    | OUTPUT:
    +=====================================
    */
    /*Hello User,Welcome to our domain.*/
CodeSnap uses regular expressions to identify empty HTML tags. The advantage of this solution over DOM-based solutions is obvious: the input doesn’t need to be a valid DOM tree. For now, however, it only removes tags like <a></a>, but not <a href="" class="external link"></a>. The reason is the backreference \1: it expects the closing tag to repeat everything captured in the opening tag, attributes included.
So it’s time to play with the regular expression. To fix this, I would suggest the following pattern:
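The limitation is easy to reproduce. The following is only a minimal sketch: the pattern is copied unchanged from the CodeSnap function, while the two sample strings and the variable names are made up for illustration.

```php
<?php
// Pattern copied unchanged from the CodeSnap function.
$pattern = '/<([^<\/>]*)>([\s]*?|(?R))<\/\1>/imsU';

// Hypothetical sample inputs, made up for this illustration.
$bare           = 'Hello <a></a>World';
$withAttributes = 'Hello <a href="" class="external link"></a>World';

$bareResult      = preg_replace($pattern, '', $bare);
$attributeResult = preg_replace($pattern, '', $withAttributes);

// The bare empty tag is removed.
var_dump($bareResult === 'Hello World');

// The empty tag with attributes is left untouched, because \1 would need
// the closing tag to contain the attributes as well.
var_dump($attributeResult === $withAttributes);
```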
'/<([^<\/>]*)([^<\/>]*)>([\s]*?|(?R))<\/\1>/imsU'
This properly removes <p></p> as well as <p class="entry"></p>.
For me, the proposed JavaScript solution is like this: first you smash a glass, and then you present the broom to fix it. A server-side solution doesn’t smash the glass in the first place.
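To try the revised pattern, it can be wrapped in a small helper. This is a sketch under assumptions, not a hardened implementation: the function name remove_empty_tags_with_attributes and the sample strings are hypothetical, and the code needs PHP 7 or later because of the type declarations and the ?? operator.

```php
<?php
/**
 * Hypothetical wrapper (the name is an assumption) around the revised pattern.
 */
function remove_empty_tags_with_attributes(string $html): string
{
    // Revised pattern from this post: the second group absorbs the attributes,
    // so the backreference \1 only has to match the tag name.
    $pattern = '/<([^<\/>]*)([^<\/>]*)>([\s]*?|(?R))<\/\1>/imsU';

    // preg_replace() returns null on a PCRE error; fall back to the input then.
    return preg_replace($pattern, '', $html) ?? $html;
}

// Made-up sample inputs for illustration.
echo remove_empty_tags_with_attributes('<p>Keep me</p><p class="entry"></p>'), PHP_EOL;
echo remove_empty_tags_with_attributes('<div><p class="entry"> </p></div>Done'), PHP_EOL;

// Attribute values containing a slash are still skipped, because the
// character class [^<\/>] excludes "/".
echo remove_empty_tags_with_attributes('<a href="http://example.com/"></a>'), PHP_EOL;
```

Two caveats show up when playing with it. Tags whose attributes contain a slash, such as a real URL in href, are not removed. And since the first group may capture only a prefix of the tag name, a mismatched pair like <pre></p> is treated as an empty tag too. For untrusted or messy markup, the pattern probably needs further tightening.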