PHP Paragraph Regular Expression

Wednesday, June 9, 2010 - 14:13

I often need to extract a section of text from the beginning of a blog post (or similar) to use as an excerpt. I normally use a function that counts off a set number of whole words and returns the string containing those words.
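As a rough sketch of that word-counting approach (the function name excerpt_words, the default limit and the trailing "..." are made up for illustration and are not the code used on this site):

```php
<?php
// Hypothetical helper: return the first $limit whole words of $text.
function excerpt_words(string $text, int $limit = 30): string
{
    // Remove markup and collapse runs of whitespace before counting words.
    $plain = trim(preg_replace('/\s+/', ' ', strip_tags($text)));
    if ($plain === '') {
        return '';
    }

    $words = explode(' ', $plain);
    if (count($words) <= $limit) {
        // Short enough already, so return the whole text untouched.
        return $plain;
    }

    // Keep only whole words and mark the cut with an ellipsis.
    return implode(' ', array_slice($words, 0, $limit)) . '...';
}

echo excerpt_words('<p>The quick brown fox jumps over the lazy dog.</p>', 4);
// prints "The quick brown fox..."
```

Splitting on whitespace is the simplest definition of a "word"; text with unusual punctuation or non-breaking spaces may need something more careful.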

A good alternative to this, although only applicable if the original post is in HTML, is to use a regular expression to extract the contents. The following code will take a string and extract just the first paragraph of text.

$intro = '';
preg_match("/<p.*?>(.*?)<\/p>/is", $string, $matches);
if (isset($matches[1])) {
    $intro = trim(strip_tags($matches[1]));
}

If the regular expression finds a paragraph then the code strips out any HTML inside it and trims the string, so that the final output doesn't have any formatting or leading and trailing whitespace. The i modifier makes the matching case insensitive and the s modifier makes the "." match all characters, including newlines. Without the s modifier the pattern would not match a paragraph whose text contains a newline character.
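To see the effect of the s modifier, here is a small sketch that runs the same pattern with and without it. The sample string is invented for the example and contains a single paragraph that spans two lines.

```php
<?php
// Made-up input: one paragraph, upper-case tags, text split over two lines.
$string = "<P class=\"intro\">First line\nsecond line</P>";

// Without the s modifier "." cannot cross the newline, so nothing matches.
$withoutS = preg_match("/<p.*?>(.*?)<\/p>/i", $string, $m1);

// With the s modifier "." matches the newline as well.
$withS = preg_match("/<p.*?>(.*?)<\/p>/is", $string, $m2);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)

// The captured text still contains the newline; only the ends are trimmed.
$intro = trim(strip_tags($m2[1]));
```

Note that if the string held a second, single-line paragraph, the version without the s modifier would match that one instead of returning nothing.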

This sort of excerpt extraction can be used in systems that store posts as HTML, like WordPress or Drupal.


Philip Norton

Phil is the founder and administrator of #! code and is an IT professional working in the North West of the UK.



Hi, I'm having trouble with a PHP paragraph replace. I have the following paragraphs:

{tab=Sed facilisis consequat libero}Fusce eu ligula purus, eu ultricies nisl. Fusce eu ligula purus, eu ultricies nisl. Curabitur sed leo felis. {/tab}

{tab=Curabitur ultricies sapien}Sed facilisis consequat libero, at tincidunt neque tristique vitae. {/tab}

My aim is to replace {tab=name}something{/tab} with <div class="tab">something</div>.

The regular expression I'm using is: /({tab.+?})(.*\S)({\/tab})/s

It works fine with a single line or paragraph, but not with multiple lines. Can anybody suggest how to fix it? Thanks in advance.
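One possible fix, offered as a sketch rather than a tested answer for the commenter's real data: a likely cause is that the greedy (.*\S) group, combined with the s modifier, runs from the first {tab=...} all the way to the last {/tab}. Making the body group lazy keeps each block separate. This assumes tab blocks are never nested, and the sample text below is placeholder content.

```php
<?php
// Placeholder input modelled on the comment above: two tab blocks,
// the first of which spans more than one line.
$text = "{tab=First}Line one.\nLine two. {/tab}\n\n{tab=Second}Other text. {/tab}";

// [^}]* consumes the tab name, (.*?) lazily captures the body up to the
// nearest {/tab}, \s* drops whitespace before the closing marker, and the
// s modifier lets "." cross newlines.
$pattern = '/\{tab=[^}]*\}(.*?)\s*\{\/tab\}/s';

$html = preg_replace($pattern, '<div class="tab">$1</div>', $text);

echo $html;
```

Each {tab=name}...{/tab} pair becomes its own div, and the tab name is discarded because the target markup in the question does not use it.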


I was searching for a PHP tutorial and came across your website. Thanks for the valuable info.

Valuable info, thanks for sharing.


I found it useful to use preg_match_all() to extract all of the text paragraphs into an array (RegExp: "/(.*?)<\/p>|^(.*?)<\/p>$/is").
This way I can display any paragraph I want, and I know how many there are in the text. Thanks ;)
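A minimal sketch of that idea, using the pattern from the post rather than the commenter's expression (the sample string and variable names are made up for illustration):

```php
<?php
// Made-up input: two paragraphs, the second with an attribute and a newline.
$string = "<p>First paragraph.</p>\n<p class=\"note\">Second\nparagraph.</p>";

// preg_match_all() collects every match; $matches[1] holds the captured
// contents of each paragraph in document order.
preg_match_all("/<p.*?>(.*?)<\/p>/is", $string, $matches);

// Strip inline markup and surrounding whitespace from each paragraph.
$paragraphs = array_map(function ($p) {
    return trim(strip_tags($p));
}, $matches[1]);

echo count($paragraphs); // prints 2
echo "\n" . $paragraphs[0]; // prints "First paragraph."
```

With the paragraphs in an array, count() gives the total and any individual paragraph can be picked out by index.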

I simply wanted to write a quick word to say thanks for the wonderful tips and hints you are sharing on this site.
