Using The e Modifier In PHP preg_replace

The PHP function preg_replace() is powerful in its own right, but the e modifier adds extra depth. Take the following bit of code, which replaces each run of letters in a string with the letter X.

$something = 'df1gdf2gdf3sgdfg';
$something = preg_replace("/([a-z]*)/", "X", $something);
echo $something; // XX1XX2XX3XX
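
The doubled X in that output appears to come from the * quantifier, which also matches the empty string just before each digit and at the end of the input. As a small sketch (the variable names $star and $plus are only for illustration), swapping * for + should give one X per run of letters:

```php
<?php
$something = 'df1gdf2gdf3sgdfg';

// * allows zero-length matches, so an extra X is inserted before each digit
// and at the very end of the string.
$star = preg_replace('/([a-z]*)/', 'X', $something);

// + requires at least one letter, so each run of letters becomes a single X.
$plus = preg_replace('/([a-z]+)/', 'X', $something);

echo $star, "\n"; // XX1XX2XX3XX
echo $plus, "\n"; // X1X2X3X
```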

This is simple enough, but the e modifier lets us call PHP functions within the replacement parameter. (The modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0, so these examples need PHP 5.) The following bit of code upper-cases every letter in the string by using the strtoupper() PHP function.

$something = 'df1gdf2gdf3sgdfg';
$something = preg_replace("/([a-z]*)/e", "strtoupper('\\1')", $something);
echo $something; // DF1GDF2GDF3SGDFG
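
Since the e modifier was deprecated in PHP 5.5.0 and removed in PHP 7.0.0, the snippet above only runs on older installations. A rough equivalent for current PHP, swapping in preg_replace_callback() rather than the article's own technique, might look like this (the variable name $upper is an assumption for illustration):

```php
<?php
$something = 'df1gdf2gdf3sgdfg';

// The callback receives the match array; $m[0] is the full matched text.
$upper = preg_replace_callback(
    '/[a-z]+/',
    function (array $m) {
        return strtoupper($m[0]);
    },
    $something
);

echo $upper; // DF1GDF2GDF3SGDFG
```

The callback is ordinary PHP code rather than a string to be evaluated, which is the main reason this form replaced the e modifier.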

Here is another example, but in this case the original string is appended after the upper-cased copy.

$something = 'df1gdf2gdf3sgdfg';
$something = preg_replace("/([a-z0-9]*)/e", "strtoupper('\\1').'\\1'", $something);
echo $something; // DF1GDF2GDF3SGDFGdf1gdf2gdf3sgdfg

Notice that when using the e modifier it is important to quote the replacement correctly, with single quotes nested inside the double-quoted string. The replacement as a whole is evaluated as PHP code, so if you don't put single quotes around the backreferences, PHP sees bare words and complains about undefined constants.
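
With preg_replace_callback(), which current PHP offers in place of the e modifier, the matched text reaches the callback as a normal string instead of being spliced into PHP source, so the quoting concern should not arise. The following sketch assumes PHP 5.3 or later for the anonymous function, and uses a made-up input containing single quotes to show they pass through untouched:

```php
<?php
// Hypothetical input containing single quotes.
$something = "o'brien1o'neill";

$result = preg_replace_callback(
    "/[a-z']+/",
    function (array $m) {
        // Upper-cased copy followed by the original match, as in the example above.
        return strtoupper($m[0]) . $m[0];
    },
    $something
);

echo $result; // O'BRIENo'brien1O'NEILLo'neill
```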

For a more complex example I modified the createTextLinks() function that I wrote about recently on the site. The function originally found any URL strings within a larger string and turned them into links. The modified function returns the same thing, except that the link text is shortened using the shortenurl() function.

$longurl = "there is the new site http://www.google.co.uk/search?aq=f&num=100&hl=en&client=firefox-a&channel=s&rls=org.mozilla%3Aen-US%3Aofficial";
 
function createShortTextLinks($str='') {
 
 // bail out early unless the string contains something link-like (ftp included)
 if($str=='' or !preg_match('/(http|ftp|www\.|@)/im', $str)){
  return $str;
 }
 
 // replace links:
 $str = preg_replace("/([ \t]|^)www\./im", "\\1http://www.", $str);
 $str = preg_replace("/([ \t]|^)ftp\./im", "\\1ftp://ftp.", $str);
 
 $str = preg_replace("/(https?:\/\/[^ )\r\n!]+)/eim", "'<a href=\"\\1\" title=\"\\1\">'.shortenurl('\\1').'</a>'", $str);
 
 $str = preg_replace("/(ftp:\/\/[^ )\r\n!]+)/eim", "'<a href=\"\\1\" title=\"\\1\">'.shortenurl('\\1').'</a>'", $str);
 
 $str = preg_replace("/([-a-z0-9_]+(\.[_a-z0-9-]+)*@([a-z0-9-]+(\.[a-z0-9-]+)+))/eim", "'<a href=\"mailto:\\1\" title=\"Email \\1\">'.shortenurl('\\1').'</a>'", $str);
 
 $str = preg_replace("/(\&)/im","\\1amp;", $str);
 
 return $str;
}
 
function shortenurl($url){
 if(strlen($url) > 45){
  return substr($url, 0, 30)."[...]".substr($url, -15);
 }else{
  return $url;
 }
}
 
echo createShortTextLinks($longurl);
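
On PHP 7 and later the e modifier is no longer available, so the function above will not run as written. The sketch below shows one way the URL part could be done with preg_replace_callback() instead. It is not the author's implementation: the names shortenForDisplay() and createShortTextLinksCallback() are hypothetical, it covers only http, https and ftp URLs (not the www. prefixing or the email step), and it escapes output with htmlspecialchars() rather than the final ampersand replacement.

```php
<?php
// Hypothetical helper with the same logic as shortenurl():
// keep the first 30 and the last 15 characters of long URLs.
function shortenForDisplay($url) {
    if (strlen($url) > 45) {
        return substr($url, 0, 30) . '[...]' . substr($url, -15);
    }
    return $url;
}

// Hypothetical callback-based variant covering only the URL step.
function createShortTextLinksCallback($str = '') {
    return preg_replace_callback(
        '/(?:https?|ftp):\/\/[^ )\r\n!]+/i',
        function (array $m) {
            // Escape the URL for use in attributes and in the link text.
            $href = htmlspecialchars($m[0], ENT_QUOTES);
            $text = htmlspecialchars(shortenForDisplay($m[0]), ENT_QUOTES);
            return '<a href="' . $href . '" title="' . $href . '">' . $text . '</a>';
        },
        $str
    );
}

$longurl = "there is the new site http://www.google.co.uk/search?aq=f&num=100&hl=en&client=firefox-a&channel=s&rls=org.mozilla%3Aen-US%3Aofficial";
$html = createShortTextLinksCallback($longurl);
echo $html;
```

The link text here should come out as http://www.google.co.uk/search[...]n-US%3Aofficial, while the href and title keep the full URL with its ampersands escaped.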
Comments

if I change your suggestion to: $str = preg_replace("/(?:>)(https?:\/\/[^ )<\r\n!]+)(?:<)/","'.shortenurl('\\1').'", $str); it seems to work - but I guess the "" should not be taken out by the regexp and then added manually again .. Chris
philipnorton42:

You want to replace the regular expression so that it matches any string that looks like a URL and is in between a < and a >. This ought to work: (?:>)(https?:\/\/[^ )<\r\n!]+)(?:<) This can be plugged into the e modifier like this: $str = preg_replace("/(?:>)(https?:\/\/[^ )<\r\n!]+)(?:<)/eim", "'.shortenurl('\\1').'", $str); Let me know how you get on!

Any chance someone can tell how to shorten the URL if the link is already inserted? So if I have a text like "some text <a href="http://VERY_LONG_URL">http://VERY_LONG_URL</a>" some more text" and I want to shorten the URL only in the visible part to something like: "some text <a href="http://VERY_LONG_URL">http://SHORT_URL</a>". I just cannot get my brain to understand these regular expressions good enough to do this - please help! Chris

Actually not quite - I had to add a greater than before the shortenurl and a less than behind - cause the regexp did take them out from the a href and end a tags ... I just have one issue with the above statement - I found that if I have something like http://this.is.a.very.long.url the script will shorten the URL which results in a loss of the original information since there is no link on it that stays untouched. So would you be able to modify the statement that it only matches something like: <a href="SOME_URL">SOME_URL</a> To make sure I only catch links with the URL as the text and not just URLs that don't have a link associated with them? Thanks so much Chris

hmm Wordpress again messed up my text - so one more try: when I have something like "less than" "p" "greater than" URL "less than" "p" "greater than" - the statement will shorten the URL resulting in a loss of information since there is no a href at all. I would like to only match something like "less than" "a href=" URL SOME_ADDITIONAL_PARAMETERS "greater than" URL "less than" "a" "greater than" would that be possible in order to make sure I only shorten URLs that actually have the same URL as link associated with it? Chris
philipnorton42:

Despite the best efforts of Wordpress I get what you mean. You just need to add a rule at the start of the pattern to spot an opening a tag, no matter what it contains. Try giving this a go. (?:<[^\\]a.*?>)(https?:\/\/[^ )<\r\n!]+)(?:<) If you are interested I use a tool called rework to test my regular expressions. Take a look - http://osteele.com/tools/rework/. In my experience, the easy part of writing regular expressions is matching the things you want; the difficult part is stopping them matching things you don't want.

Hmm - just tried your last version on the page you posted - I always get no match. I should really try to dig in these regular expressions to understand why it is not working ... Chris
philipnorton42:

Try removing that bit at the start, and using \\2 instead of \\1. The ?: means "match this, but don't do anything with it", and can lead to some problems on some systems due to lack of support. Like this... (<a.*?>)(https?:\/\/[^ )<\r\n!]+)(<) If you want to learn regular expressions quickly I can recommend getting Ben Forta's book Regular Expressions In 10 Minutes - ISBN 0672325667. I read that book and it all became clear, and it isn't as heavy going as some other books. I now use regular expressions every day and they don't scare me as much!

Wow - that looks good - I am about there - I now used the following PHP command: $str = preg_replace("/(<a>)(https?:\/\/[^ )<\r\n!]+)()/eim", "'\\1'.shortenurl('\\2').'\\3'", $str); and apart from some escape "\" in front of every " in the output - the result is perfect. Thanks so much for all your help Chris
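
For readers on current PHP, where the e modifier is gone, the approach worked out in this thread could be sketched with preg_replace_callback(). The backslashes in front of double quotes mentioned above are a side effect of the e modifier, which escapes quotes in backreferences before evaluating the replacement. A callback receives the matches unescaped, so they should not appear. The sample markup and the variable names below are assumptions for illustration only.

```php
<?php
// Hypothetical input: one link whose visible text is its own long URL.
$html = 'some text <a href="http://example.com/a/very/long/path/that/keeps/going/and/going/index.html">'
      . 'http://example.com/a/very/long/path/that/keeps/going/and/going/index.html</a> more text';

$result = preg_replace_callback(
    // Group 1: opening a tag, group 2: visible URL text, group 3: closing tag.
    '/(<a\b[^>]*>)(https?:\/\/[^ )<\r\n!]+)(<\/a>)/i',
    function (array $m) {
        $text = $m[2];
        // Same shortening rule as shortenurl() in the article.
        if (strlen($text) > 45) {
            $text = substr($text, 0, 30) . '[...]' . substr($text, -15);
        }
        // Put the untouched tags back around the shortened text.
        return $m[1] . $text . $m[3];
    },
    $html
);

echo $result;
```

Because the pattern requires an opening a tag before the URL and a closing tag after it, bare URLs that are not already links are left alone, which is the behaviour asked for earlier in the thread.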
