Category: PHP Strings

Format A List Of Items In PHP

8 January, 2010 | PHP Strings | No comments

It is usual when writing a list of items to separate each item with a comma, except the last two items, which are separated with the word "and". I recently needed to implement a function that took a string and converted it into a list of this type so I thought I would expand on it and post it here.

The function takes a single parameter, which can either be an array or a comma separated string. If an array is passed to the function then it is converted into a comma separated string and then passed onto the next part in the function. The function then removes any trailing commas, any commas that have nothing in between them and then makes sure that each comma has a single space after it. The final step is to replace the last comma with the word "and". Once the manipulation is complete then the resulting string is returned. If the string (after removing any trailing commas) doesn't contain any commas then it is simply returned.

Here is the function in full, with comments for each step.

/**
 * This function will take a string in the format of a single item or
 * multiple items in the format 1,2,3,4,5 or an array of items.
 * The output will be a readable set of items with the last two items
 * separated by " and ".
 *
 * @param  string|array $numbers The list of items as a string or array.
 * @return string                The formatted items.
 */
function formatItems($numbers) {
    if (is_array($numbers)) {
        // If numbers is an array then implode it into a comma separated string.
        $numbers = implode(',', $numbers);
    }

    if (is_string($numbers)) {
        // Make sure all commas have a single space character after them.
        $numbers = preg_replace("/(\s*?,\s*)/", ", ", $numbers);
        // Remove any spare commas.
        $numbers = preg_replace("/(,\s)+/", ", ", $numbers);
        // Remove any leading or trailing commas and spaces.
        $numbers = trim($numbers, ', ');

        if (strpos($numbers, ',') !== false) {
            // The string contains commas, find the last comma in the string.
            $lastCommaPos = strrpos($numbers, ',') - strlen($numbers);
            // Replace the last occurrence of a comma with " and".
            $numbers = substr($numbers, 0, $lastCommaPos) . str_replace(',', ' and', substr($numbers, $lastCommaPos));
        }
    }
    return $numbers;
}

Here are a few examples of this function in action.

echo formatItems('1') . "\n";
echo formatItems('1,2') . "\n";
echo formatItems('1,2,3,4,5,6,7') . "\n";
echo formatItems(range(1,6)) . "\n";
echo formatItems(range('a','g')) . "\n";
echo formatItems('sdfgdf g,sdf, g,dfg,df,g ,df g,df,g ,d fg') . "\n";
echo formatItems('1.45/76,5/,85/6.,45./6,456') . "\n";
echo formatItems('sdfah,      ,,, ,, ,,,, ,  ,  ,  568776ythU~O@)_}_:{>9l,65653224,253,4,236,56,98./,978/59') . "\n";
echo formatItems('4575,8 56,456,36,45656      ,,    , 4, 56, 546, 546, , 6, 456, , ') . "\n";

This code produces the following output.

1
1 and 2
1, 2, 3, 4, 5, 6 and 7
1, 2, 3, 4, 5 and 6
a, b, c, d, e, f and g
sdfgdf g, sdf, g, dfg, df, g, df g, df, g and d fg
1.45/76, 5/, 85/6., 45./6 and 456
sdfah, 568776ythU~O@)_}_:{>9l, 65653224, 253, 4, 236, 56, 98./ and 978/59
4575, 8 56, 456, 36, 45656, 4, 56, 546, 546, 6 and 456
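When the items are already in an array, a similar result can be reached without regular expressions. The following is only a sketch of that idea, not a replacement for the function above: the name formatItemList() is made up for illustration, and it assumes that empty items should simply be dropped.

```php
<?php
/**
 * Hypothetical array-based variant of formatItems().
 * Assumes that items which are empty after trimming should be dropped.
 */
function formatItemList(array $items)
{
    // Convert each item to a trimmed string.
    $items = array_map(function ($item) {
        return trim((string) $item);
    }, $items);

    // Discard empty strings and reindex the array.
    $items = array_values(array_filter($items, 'strlen'));

    if (count($items) < 2) {
        // Zero or one item: there is nothing to join.
        return implode('', $items);
    }

    // Take the final item off, join the rest with commas, then add " and ".
    $last = array_pop($items);
    return implode(', ', $items) . ' and ' . $last;
}

echo formatItemList(range(1, 6)), "\n";               // 1, 2, 3, 4, 5 and 6
echo formatItemList(explode(',', '1,2, ,,3')), "\n";  // 1, 2 and 3
```

Working on the array means the last item can be removed with array_pop() instead of searching for the last comma in a string.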

Written by Philip Norton.

Extract Keywords From A Text String With PHP

29 July, 2009 | PHP Strings | No comments

A common issue I have come across in the past is that I have a CMS, or an old copy of Wordpress, and I need to create a set of keywords to be used in the meta keywords field. To solve this I put together a simple function that runs through a string and returns the most commonly used words in it as an array. The number of words returned is currently set to 10, but you can change that quite easily.

The first thing the function defines is a list of "stop" words. This is a list of words that occur quite a bit in English text and would therefore interfere with the outcome of the function. The function also uses a variant of the slug function to remove any odd characters that might be in the text.

function extractCommonWords($string){
    $stopWords = array('i','a','about','an','and','are','as','at','be','by','com','de','en','for','from','how','in','is','it','la','of','on','or','that','the','this','to','was','what','when','where','who','will','with','und','www');

    $string = preg_replace('/\s+/', ' ', $string); // collapse any run of white space into a single space
    $string = trim($string); // trim the string
    $string = preg_replace('/[^a-zA-Z0-9 -]/', '', $string); // only take alphanumerical characters, but keep the spaces and dashes too
    $string = strtolower($string); // make it lowercase

    // Split the string at word boundaries. This also captures the gaps between words,
    // which are filtered out below.
    preg_match_all('/\b.*?\b/i', $string, $matchWords);
    $matchWords = $matchWords[0];

    // Remove empty matches, stop words and anything of three characters or fewer.
    foreach ( $matchWords as $key=>$item ) {
        if ( $item == '' || in_array(strtolower($item), $stopWords) || strlen($item) <= 3 ) {
            unset($matchWords[$key]);
        }
    }

    // Count how many times each remaining word occurs.
    $wordCountArr = array();
    if ( is_array($matchWords) ) {
        foreach ( $matchWords as $key => $val ) {
            $val = strtolower($val);
            if ( isset($wordCountArr[$val]) ) {
                $wordCountArr[$val]++;
            } else {
                $wordCountArr[$val] = 1;
            }
        }
    }

    // Sort by count, highest first, and keep the top 10.
    arsort($wordCountArr);
    $wordCountArr = array_slice($wordCountArr, 0, 10);
    return $wordCountArr;
}

The function returns the 10 most commonly occurring words as an array, with the word as the key and the number of times it occurs as the value. To extract the words just use the implode() function in conjunction with the array_keys() function. To change the number of words returned just alter the third parameter of the array_slice() function near the return statement, currently set to 10. Here is an example of the function in action.

$text = "This is some text. This is some text. Vending Machines are great.";
$words = extractCommonWords($text);
echo implode(',', array_keys($words));

On PHP 8 this produces the following output. Older versions of PHP may order words that have the same count differently.

some,text,vending,machines,great
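The counting loop can also be expressed with PHP's built-in array_count_values() function. The following is a minimal sketch of that idea and not a drop-in replacement for the function above: the name countWords() and the $limit parameter are made up for the example, the stop word list is deliberately shortened, and the text is split into runs of letters and digits rather than at word boundaries.

```php
<?php
// Hypothetical helper, for illustration only.
function countWords($text, $limit = 10)
{
    // Shortened stop word list; a real list would be much longer.
    $stopWords = array('this', 'that', 'with', 'from', 'what', 'when', 'where', 'will');

    // Lowercase the text and pull out runs of letters and digits.
    preg_match_all('/[a-z0-9]+/', strtolower($text), $matches);

    // Keep words longer than three characters that are not stop words.
    $words = array_filter($matches[0], function ($word) use ($stopWords) {
        return strlen($word) > 3 && !in_array($word, $stopWords, true);
    });

    // Count occurrences, sort by count (highest first) and keep the top $limit.
    $counts = array_count_values($words);
    arsort($counts);
    return array_slice($counts, 0, $limit, true);
}

$counts = countWords("This is some text. This is some text. Vending Machines are great.");
print_r($counts);
```

Passing true as the fourth argument of array_slice() preserves the keys, which matters if any of the words are purely numeric.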

Written by Philip Norton.

Using PHP implode() To Construct Strings

4 June, 2009 | PHP Strings | No comments

If you are constructing a simple string from a set of variables contained in an array then you can use the implode() function to convert the array into a string. The implode() function takes two parameters. The first is the glue that is used to join the items in the array together and the second is the array to use. Here is a trivial example of implode() in action.

$array = array(1, 2, 3, 4, 5, 6);
echo implode(',', $array);

This will print out the following:

1,2,3,4,5,6

The good thing about the implode() function is that it doesn't add stray commas to the start and end of the string so there is no need to alter the string after the function is used.
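A quick way to see this is to try the edge cases. The short example below uses made-up arrays with zero, one and two items; the glue only ever appears between items.

```php
<?php
// No items: the result is an empty string, so only the newline is printed.
echo implode(',', array()), "\n";

// One item: there is nothing to join, so no glue is added.
echo implode(',', array('one')), "\n";      // one

// Two items: the glue appears once, between them.
echo implode(', ', array('a', 'b')), "\n";  // a, b
```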

So how can this function be used in an application? If you are creating an SQL statement then you can use implode() to construct it from an array. The following code will take two variables called $column1 and $column2 and use them to create the WHERE clause of an SQL statement. The two variables might be created through GET or POST requests, and the string for each clause is added to an array called $clauses. At the end of this process, if the array contains at least one clause, the implode() function is used to add the full WHERE clause to the SELECT statement.

$sql = 'SELECT * FROM table';
$clauses = array();

// These two variables might be created via a form request.
$column1 = 'one';
$column2 = 'two';

if ( isset($column1) ) {
    $clauses[] = 'column1 = "'.$column1.'"';
}
if ( isset($column2) ) {
    $clauses[] = 'column2 = "'.$column2.'"';
}

if ( count($clauses) > 0 ) {
    $sql .= ' WHERE '.implode(' AND ', $clauses).';';
}
echo $sql;

This will print out the following:

SELECT * FROM table WHERE column1 = "one" AND column2 = "two";

This might not be applicable for all applications, but I think it makes SQL statement creation slightly more readable than constructing them purely as strings.
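One thing to be careful of is that placing request values straight into the SQL string leaves the query open to SQL injection. A common way around this is to put placeholders in the clauses and collect the values separately for a prepared statement. The sketch below assumes PDO-style named placeholders; the $input array stands in for GET or POST data and is invented for the example, as is the $params array.

```php
<?php
$sql = 'SELECT * FROM table';
$clauses = array();
$params = array();

// Hypothetical request data, standing in for $_GET or $_POST.
$input = array('column1' => 'one', 'column2' => 'two');

// Only column names from this fixed list are ever placed in the SQL.
foreach (array('column1', 'column2') as $column) {
    if (isset($input[$column])) {
        $clauses[] = $column . ' = :' . $column;
        $params[':' . $column] = $input[$column];
    }
}

if (count($clauses) > 0) {
    $sql .= ' WHERE ' . implode(' AND ', $clauses);
}

echo $sql, "\n";
// SELECT * FROM table WHERE column1 = :column1 AND column2 = :column2

// With a PDO connection in $pdo (not created here), the query could then be run as:
// $statement = $pdo->prepare($sql);
// $statement->execute($params);
```

The implode() call is unchanged; only the contents of the clauses differ, so the values never become part of the SQL text.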

Written by Philip Norton.

Disemvoweling PHP Function

7 April, 2009 | PHP Strings | 1 comment

Disemvoweling is a technique used on blogs and forums to censor any post or comment that contains spam or other unwanted text. It involves simply removing the vowels from the text so that it is almost, but not entirely, unreadable.

Use the following function to disemvowel a string of text.

function disemvowel($string) {
    return str_replace(array('a', 'e', 'i', 'o', 'u', 'A', 'E', 'I', 'O', 'U'), '', $string);
}

As an example, the first sentence of this post:

Disemvoweling is a technique used on blogs and forums to censor any post or comment that contains spam or other unwanted text.

would appear like this:

Dsmvwlng s tchnq sd n blgs nd frms t cnsr ny pst r cmmnt tht cntns spm r thr nwntd txt.

Which doesn't make a lot of sense, but is still kind of readable. This technique kills unwanted comments without removing the text entirely.
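The same thing can be done with a regular expression instead of a list of letters. This is just an alternative sketch; the function name disemvowelRegex() is made up for the example.

```php
<?php
function disemvowelRegex($string) {
    // The i modifier makes the character class match upper and lower case vowels.
    return preg_replace('/[aeiou]/i', '', $string);
}

echo disemvowelRegex('Hello World'), "\n"; // Hll Wrld
```

Like the str_replace() version, this only handles unaccented vowels.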

Check out the Wikipedia page on Disemvoweling for more information on the origins of this method.

Written by Philip Norton.

Preparing HTML And PHP Code For Publishing On Websites

1 April, 2009 | PHP Strings | No comments

I talked a while ago about Adding Code To Wordpress Blogs And Comments, but I decided that it needed a bit of code to do this automatically.

So here it is, prepared by the text processor.

<form method="post" action="http://www.hashbangcode.com/examples/text-process/text.php">
    <textarea name="text" rows="10" cols="80" wrap="off"></textarea>
    <input type="submit" value="Process" />
</form>

<?php
if ( isset($_POST["text"]) ) {
    $text   = $_POST["text"];
    $text   = stripslashes( $text );
    $input  = array ( "/&/", "/'/", "/\"/", "/</", "/>/", "/\t/", "/(?<=\s)\x20|\x20(?=\s)/", "/^\s$/m", "/&/", "/\r\n/" );
    $output = array ( "&amp;", "&#39;", "&quot;", "&lt;", "&gt;", "&nbsp;&nbsp;&nbsp;&nbsp;", "&nbsp;", "&nbsp;<br />", "&amp;", "<br />" );
    $temp = preg_replace($input, $output, $text);
    echo '<div style="border:1px solid grey;">'.$temp.'</div>';
}
?>

There seems to be rather a lot going on here, but the process is quite simple. The preg_replace() function can take an array as an argument for the input and output parameters. When you do this the arrays are matched up by position, so that the first item in the input array is replaced by the first item in the output array, the second by the second, and so on. The replacements are applied one after another, in array order.

So here is a list of the things I am matching for and what they are replaced with.

  • /&/ This matches for any ampersand, we replace this with the encoded variant of &amp;.
  • /'/ Find single quotes and encode them with &#39;.
  • /\"/ Find double quotes and encode them with &quot;.
  • /</ This matches all < and replaces them with &lt;.
  • />/ Same as above but the other way around, in this case the equivalent is &gt;.
  • /\t/ Next we start matching for white space, the first is to find all tab characters and replace them with four &nbsp; characters, like this &nbsp;&nbsp;&nbsp;&nbsp;
  • /(?<=\s)\x20|\x20(?=\s)/ Next we look for any space character that has a white space character before or after it and replace it with a non-breaking space, &nbsp;.
  • /^\s$/m This matches for any line with nothing on it. These must be replaced with a single &nbsp; character, but in order to keep the code as it was posted we add a <br /> tag, the final output would be &nbsp;<br />.
  • /&/ Now that we have all of our tags encoded we need to re-encode all of the & characters, so that when the script prints out the content to an HTML page every & is translated to &amp;.
  • /\r\n/ Finally, we find all of the new line characters and convert them to <br /> tags. You might want to change this to just \n if your text uses Unix line endings.
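Because preg_replace() works through the pattern array in order, each pattern sees the result of the ones before it, which is why the position of /&/ in the list matters. The following sketch shows the effect with a made-up input string and a cut-down set of patterns.

```php
<?php
$text = '<a href="x">Tom & Jerry</a>';

// Ampersands first, then the angle brackets and double quotes.
$first = preg_replace(
    array('/&/', '/</', '/>/', '/"/'),
    array('&amp;', '&lt;', '&gt;', '&quot;'),
    $text
);

// Ampersands last: the & inside &lt;, &gt; and &quot; is encoded a second time.
$last = preg_replace(
    array('/</', '/>/', '/"/', '/&/'),
    array('&lt;', '&gt;', '&quot;', '&amp;'),
    $text
);

echo $first, "\n"; // &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
echo $last, "\n";  // &amp;lt;a href=&amp;quot;x&amp;quot;&amp;gt;Tom &amp; Jerry&amp;lt;/a&amp;gt;
```

In the processor above the second /&/ pass is deliberate, since the result is shown on a page and is meant to be copied from there as encoded source.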

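Before we do any of this we pass the text through the stripslashes() function. This is because sending the text over POST might add slashes to the " and ' characters, which happens when PHP's magic quotes setting is turned on. This call just removes them. Magic quotes were removed in PHP 5.4, so on newer versions this call is not needed and would strip any genuine backslashes from the text.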
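For the basic character encoding, PHP's built-in htmlspecialchars() and nl2br() functions cover much of the same ground as the first few patterns, although they do nothing about tabs or runs of spaces. The following is a rough sketch of a single encoding pass using them, with a made-up input string; it is not the code behind the processor.

```php
<?php
$text = "<p class=\"a\">Tom & 'Jerry'</p>\nSecond line";

// ENT_QUOTES encodes single quotes as well as double quotes.
$encoded = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// nl2br() inserts a <br /> tag before each new line.
$html = nl2br($encoded);

echo $html, "\n";
```

Calling htmlspecialchars() a second time on $encoded would give the double encoding that the processor produces with its second /&/ pattern.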

You can try out the processor if you want by copying some code into the following text box.

This will output to the text process example page. You can also visit this page directly and play around with the tool.

Written by Philip Norton.