Tutorial: PHP - BBCode parser - Parse both BBCode link tags and untagged links


I need to do this:

When a user inserts a BBCode tag, I transform it with preg_replace and a regex.


    function forumBBCode($str){
        $format_search=array(
            '#\[url=(.*?)\](.*?)\[/url\]#i'
        );

        $format_replace=array(
            '<a class="lforum" target="_blank" href="$1">$2</a>'
        );

        $str=preg_replace($format_search, $format_replace, $str);
        $str=nl2br($str);
        return $str;
    }

Now I also want this: when a user inserts plain text containing a link, that link must be transformed too. I can't do this through preg_replace alone, because if I write code such as

    $format_search = '#(www\..*?)#i';
    $format_replace = '<a class="lforum" target="_blank" href="$1">$1</a>';

it will convert the link twice (once inside the [url] tag and once as a link without the tag).

So I came up with this function:

    function checkLinks($string) {
        $arrelab="";
        $arr=split(' |\r\n', $string);
        for($i=0; $i<sizeof($arr); $i++) {
            echo $i." - ".$arr[$i]."<br/>";
            if ((strpos($arr[$i], 'www.')!==false) or (strpos($arr[$i], 'http://')!==false) or (strpos($arr[$i], 'ftp://')!==false)) {
                if (strpos($arr[$i], '[url=')===false) {
                    $arr[$i]='<a class="lforum" target="_blank" href="'.$arr[$i].'">'.$arr[$i].'</a>';
                }
            }

            $arrelab=$arrelab." ".$arr[$i];
        }
        return $arrelab;
    }

The problem is that I need to split on newlines as well as on spaces. Any help would be appreciated.

P.S. Sorry for my bad English :)



It's easy to work around with a lookbehind assertion.

    $str = preg_replace('#(?<![>/"])((http://)?www.........)#im', '<a href="$1">$1</a>', $str);

Thus the regex will skip any URL that directly follows a " or a > or a /, which covers URLs already placed inside an href attribute or used as link text.
It's a workaround, not a solution.

PS: target="_blank" is user pestering. Cut it out.
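To make the lookbehind workaround above concrete, here is a minimal sketch. The function name linkifyAfterBBCode and the character class used for the rest of the URL are my own assumptions (the regex above leaves that part as dots). It converts the [url] tags first, then links bare www. addresses that are not preceded by ", > or /.

```php
<?php
// Hypothetical helper; the name and the URL character class are assumptions.
function linkifyAfterBBCode(string $str): string
{
    // Step 1: convert [url=...]...[/url] tags, as in the question.
    $str = preg_replace(
        '#\[url=(.*?)\](.*?)\[/url\]#i',
        '<a href="$1">$2</a>',
        $str
    );

    // Step 2: link bare www. addresses, unless the character just before
    // them is >, / or " (meaning they already sit inside an <a> element).
    return preg_replace(
        '#(?<![>/"])((?:http://)?www\.[a-z0-9\-_.]+)#i',
        '<a href="$1">$1</a>',
        $str
    );
}

echo linkifyAfterBBCode('[url=http://www.a.com]A[/url] and www.b.com');
```

Note that a bare www.b.com ends up as href="www.b.com", which a browser treats as a relative URL, so you would probably want to prepend a scheme in that case.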


The easiest option would be to parse the plain-text URLs first and ensure they don't come immediately after an equals sign.

Update from Marios:

    $str = preg_replace('#(?<![>/"])(((http|https|ftp)://)?www[a-zA-Z0-9\-_\.]+)#im', '<a href="$1">$1</a>', $str);
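A rough sketch of the "plain-text URLs first" order could look like this. The function name linkifyBeforeBBCode is made up, and the lookbehind set is my assumption: besides = it also excludes / (otherwise the www. part after http:// inside the tag would still match, since the scheme is optional) and ] (so that a URL used as the link text of a tag is left alone).

```php
<?php
// Hypothetical helper; the lookbehind set [=/\]] is an assumption.
function linkifyBeforeBBCode(string $str): string
{
    // Step 1: link bare URLs that do not directly follow "=", "/" or "]",
    // so anything inside [url=...]...[/url] is left untouched.
    $str = preg_replace(
        '#(?<![=/\]])((?:(?:https?|ftp)://)?www\.[a-zA-Z0-9\-_.]+)#i',
        '<a href="$1">$1</a>',
        $str
    );

    // Step 2: convert the BBCode tags as in the question.
    return preg_replace(
        '#\[url=(.*?)\](.*?)\[/url\]#i',
        '<a href="$1">$2</a>',
        $str
    );
}

echo linkifyBeforeBBCode('[url=http://www.a.com]www.a.com[/url] and https://www.b.com');
```

Like the regex above, this only recognises addresses that contain www., so something like http://example.com would not be linked.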


Your problem can be identified by reading your title: parsing in combination with regex.

You can't 'parse' html or bb-code with a regular expression because they are not regular languages.

You should write (or find) a bb-code parser instead of using regular expressions.

Google's first result for a BB-code parser is NBBC: The New BBCode Parser. But I've never used it so I can't comment on the quality.
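Short of a full parser, one step in that direction is to split the input into tag chunks and plain-text chunks and handle each kind separately, so a URL inside a tag can never be linked twice. This is only a sketch of the idea, not a real BBCode parser (it does not handle nesting or other tags), and the function name renderPost is made up for the example.

```php
<?php
// Hypothetical helper: tokenizes the post into [url] tags and plain text.
function renderPost(string $str): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // plain text, tag, plain text, tag, ..., plain text.
    $parts = preg_split(
        '#(\[url=.*?\].*?\[/url\])#is',
        $str,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // Odd indexes hold the captured [url=...]...[/url] chunks.
            $out .= preg_replace(
                '#\[url=(.*?)\](.*?)\[/url\]#is',
                '<a href="$1">$2</a>',
                $part
            );
        } else {
            // Even indexes are plain text: link bare www. addresses here.
            $out .= preg_replace(
                '#((?:(?:https?|ftp)://)?www\.[a-zA-Z0-9\-_.]+)#i',
                '<a href="$1">$1</a>',
                $part
            );
        }
    }
    return $out;
}

echo renderPost('www.b.com [url=http://www.a.com]www.a.com[/url]');
```

Because the plain-text regex never sees the tag chunks, no lookbehind is needed here.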


There is an easier way to do this. I have written a walkthrough on the RedBonzai Developers blog: http://www.redbonzai.com/blog/web-development/how-to-create-a-bb-codes-function-in-php/

Let me know if you have any questions.

