PHP regex to detect text inside brackets, ignoring nested brackets


I'm trying to make a PHP regex that parses the text in brackets out of a string while ignoring possible nested brackets.

Let's say I want

lorem ipsum [1. dolor sit amet, [consectetuer adipiscing] elit.]. aenean commodo ligula eget dolor.[2. dolor, [consectetuer adipiscing] elit.] aenean massa[3. lorem ipsum] dolor. 

to return

[1] => "dolor sit amet, [consectetuer adipiscing] elit."
[2] => "dolor, [consectetuer adipiscing] elit."
[3] => "lorem ipsum"

So far I've got

'/\[([0-9]+)\.\s([^\]]+)\]/gi' 

but it breaks when nested brackets occur. See the demo.

How can I ignore the inner brackets in the detection? Thanks in advance!

You can use this pattern, which captures the item number and the following text in two different groups. If you are sure that the item numbers are unique, you can build the associative array described in your question with a simple array_combine:

$pattern = '~\[ (?:(\d+)\.\s)? ( [^][]*+ (?:(?R) [^][]*)*+ ) ]~x';

if (preg_match_all($pattern, $text, $matches))
    $result = array_combine($matches[1], $matches[2]);
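To try the pattern against the sample string from the question, here is a minimal self-contained sketch. The function name extractNumberedItems is a hypothetical wrapper chosen for illustration, not something from the answer, and the recursion token is written as (?R) with an uppercase R, which is how PCRE spells it.

```php
<?php
// Hypothetical helper name, introduced only for this sketch.
function extractNumberedItems(string $text): array
{
    // Group 1: item number (optional). Group 2: text, nested brackets included.
    $pattern = '~\[ (?:(\d+)\.\s)? ( [^][]*+ (?:(?R) [^][]*)*+ ) ]~x';

    if (!preg_match_all($pattern, $text, $matches)) {
        return [];
    }

    // Assumes the item numbers are unique; duplicates would overwrite each other.
    return array_combine($matches[1], $matches[2]);
}

// Sample input taken from the question.
$text = 'lorem ipsum [1. dolor sit amet, [consectetuer adipiscing] elit.]. '
      . 'aenean commodo ligula eget dolor.[2. dolor, [consectetuer adipiscing] elit.] '
      . 'aenean massa[3. lorem ipsum] dolor.';

print_r(extractNumberedItems($text));
```

The inner [consectetuer adipiscing] parts stay inside the captured text because they are consumed by the recursion rather than matched as separate items.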

Pattern details:

~                  # pattern delimiter
\[                 # literal opening square bracket
(?:(\d+)\.\s)?     # optional item number (*)
(                  # capture group 2
   [^][]*+         # all that is not a square bracket (possessive quantifier)
   (?:             # non-capturing group
       (?R)        # recursion: (?R) is an alias for the whole pattern
       [^][]*      # all that is not a square bracket
   )*+             # repeat zero or more times (possessive quantifier)
)
]                  # literal closing square bracket
~x                 # free-spacing mode

(*) Note that the item number part must be optional if you want to be able to use the recursion with (?R) (for example, [consectetuer adipiscing] doesn't have an item number). This can be problematic if you want to avoid matching square brackets without an item number. In that case you can build a more robust pattern by changing the optional group (?:(\d+)\.\s)? to a conditional statement: (?(R)|(\d+)\.\s)

Conditional statement:

(?(R)        # IF you are in a recursion
             # THEN match this (nothing in our case)
  |          # ELSE
  (\d+)\.\s  #   match the item number
)

This way the item number becomes mandatory for top-level brackets, while nested brackets can still omit it.
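The difference between the optional group and the conditional can be seen side by side in the sketch below. The helper collectItems and the input string are made up for illustration; the input has one top-level bracket without an item number.

```php
<?php
// Both patterns follow the answer; only the item-number part differs.
$optional    = '~\[ (?:(\d+)\.\s)?   ( [^][]*+ (?:(?R) [^][]*)*+ ) ]~x';
$conditional = '~\[ (?(R)|(\d+)\.\s) ( [^][]*+ (?:(?R) [^][]*)*+ ) ]~x';

// Hypothetical helper: maps item number => text for every top-level match.
function collectItems(string $pattern, string $text): array
{
    if (!preg_match_all($pattern, $text, $matches)) {
        return [];
    }
    return array_combine($matches[1], $matches[2]);
}

// Made-up input with an unnumbered top-level bracket.
$sample = 'foo [no number here] bar [1. alpha [nested] beta] baz [2. gamma]';

// The optional version also matches "[no number here]", under an empty key.
print_r(collectItems($optional, $sample));

// The conditional version skips it and keeps only the numbered items.
print_r(collectItems($conditional, $sample));
```

With the conditional, [nested] is still accepted inside item 1 because the number is only required outside a recursion.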

