Convert HTML To ASCII With PHP

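The reverse of turning ASCII text into HTML is converting HTML into ASCII. Here is a small function that does this: it rewrites links as "text (URL)", turns br, hr, p and div tags into line breaks, marks bold text with asterisks and italic text with underscores, decodes entities, and strips any remaining tags.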

function html2ascii($s){
 // convert links to "link text (url)"
 $s = preg_replace('/<a\s+.*?href="?([^\" >]*)"?[^>]*>(.*?)<\/a>/is','$2 ($1)',$s);
 
 // convert br, hr, p and div tags to line breaks
 // (\b stops <p matching <pre> and similar longer tag names)
 $s = preg_replace('@<(b|h)r\b[^>]*>@i',"\n",$s);
 $s = preg_replace('@<p\b[^>]*>@i',"\n\n",$s);
 $s = preg_replace('@<div\b[^>]*>(.*?)</div>@is',"\n".'$1'."\n",$s);
 
 // convert bold and italic tags
 // (\b stops <b matching <body> and <i matching <img>)
 $s = preg_replace('@<b\b[^>]*>(.*?)</b>@is','*$1*',$s);
 $s = preg_replace('@<strong\b[^>]*>(.*?)</strong>@is','*$1*',$s);
 $s = preg_replace('@<i\b[^>]*>(.*?)</i>@is','_$1_',$s);
 $s = preg_replace('@<em\b[^>]*>(.*?)</em>@is','_$1_',$s);
 
 // strip any remaining HTML tags; this runs before decoding so that
 // decoded < and > characters are not mistaken for tags
 $s = strip_tags($s);
 
 // decode named entities
 $s = strtr($s,array_flip(get_html_translation_table(HTML_ENTITIES)));
 
 // decode numbered entities; the /e modifier was removed in PHP 7, so use
 // a callback. chr() only produces single bytes, so code points above 255
 // are not decoded correctly
 $s = preg_replace_callback('/&#(\d+);/',function($m){ return chr((int)$m[1]); },$s);
 
 // return the string
 return $s;
}

To use this function just pass it a string. Here is an example of it at work.

$htmlString = '<p>This is some <strong>XHTML</strong> markup that <em>will</em> be<br />turned <a href="http://www.hashbangcode.com/" title="#! code">into</a> an ascii string</p>';
 
echo html2ascii($htmlString);

This produces the following output.

This is some *XHTML* markup that _will_ be
turned into (http://www.hashbangcode.com/) an ascii string
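
The two entity-decoding steps in html2ascii() could also be handled by PHP's built-in html_entity_decode(), which decodes named and numbered entities in a single pass. The following is only a sketch of that alternative, not part of the function above: the helper name decodeEntities is made up for illustration, and UTF-8 output is an assumption, so characters outside ASCII come back as multi-byte sequences.

```php
<?php
// Hypothetical helper (not in the original function): decode named and
// numbered HTML entities in one pass with the built-in html_entity_decode().
function decodeEntities($s){
 // ENT_QUOTES decodes both double and single quote entities;
 // 'UTF-8' is the assumed output encoding
 return html_entity_decode($s, ENT_QUOTES, 'UTF-8');
}

echo decodeEntities('5 &#62; 3 &amp; 2 &lt; 4');
```

Because everything is decoded in one pass, a string such as &amp;lt; ends up as the text &lt; rather than being decoded twice into a < character.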
Comments

I got an error on line 19 --> $s = preg_replace('//e','chr(\\1)',$s); Warning: Wrong parameter count for chr() in C:\PHP-test\xxxxx.php (??) : regexp code on line 1
