PHP regex to detect text inside brackets ignoring nested brackets
I'm trying to make a PHP regex that parses a string for text in brackets while ignoring possible nested brackets.
Let's say I want
lorem ipsum [1. dolor sit amet, [consectetuer adipiscing] elit.]. aenean commodo ligula eget dolor.[2. dolor, [consectetuer adipiscing] elit.] aenean massa[3. lorem ipsum] dolor.
to return
[1] => "dolor sit amet, [consectetuer adipiscing] elit." [2] => "dolor, [consectetuer adipiscing] elit." [3] => "lorem ipsum"
So far I've got
'/\[([0-9]+)\.\s([^\]]+)\]/gi'
but it breaks when nested brackets occur. See demo.
How can I make the detection ignore the inner brackets? Thanks in advance!
You can use a pattern that captures the item number and the following text in two different groups. If you are sure the item numbers are unique, you can build the associative array described in the question with a simple array_combine:
$pattern = '~\[ (?:(\d+)\.\s)? ( [^][]*+ (?:(?R) [^][]*)*+ ) ]~x';
if (preg_match_all($pattern, $text, $matches))
    $result = array_combine($matches[1], $matches[2]);
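To check the snippet against the question's input, a minimal runnable sketch might look like the following. The sample string is copied from the question; initializing `$result` to an empty array is an addition for the no-match case and is not part of the original answer.

```php
<?php
// Sample input copied from the question.
$text = 'lorem ipsum [1. dolor sit amet, [consectetuer adipiscing] elit.]. '
      . 'aenean commodo ligula eget dolor.[2. dolor, [consectetuer adipiscing] elit.] '
      . 'aenean massa[3. lorem ipsum] dolor.';

// Group 1: item number, group 2: bracket content (nested brackets included).
$pattern = '~\[ (?:(\d+)\.\s)? ( [^][]*+ (?:(?R) [^][]*)*+ ) ]~x';

$result = []; // assumption: fall back to an empty array when nothing matches
if (preg_match_all($pattern, $text, $matches)) {
    // Item numbers become the keys, bracket contents the values.
    $result = array_combine($matches[1], $matches[2]);
}

print_r($result);
```

The nested `[consectetuer adipiscing]` parts are consumed by the recursion, so they stay inside the value of the enclosing item instead of producing matches of their own.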
Pattern details:
~                    # pattern delimiter
\[                   # literal opening square bracket
(?:(\d+)\.\s)?       # optional item number (*)
(                    # capture group 2
    [^][]*+          # all that is not a square bracket (possessive quantifier)
    (?:              #
        (?R)         # recursion: (?R) is an alias for the whole pattern
        [^][]*       # all that is not a square bracket
    )*+              # repeat 0 or more times (possessive quantifier)
)
]                    # literal closing square bracket
~x                   # free spacing mode
(*) Note that the item number part must be optional if you want to be able to use the recursion with (?R) (for example, [consectetuer adipiscing] doesn't have an item number). This can be a problem if you want to skip square brackets that have no item number. In that case you can build a more robust pattern by changing the optional group (?:(\d+)\.\s)? to a conditional statement: (?(R)|(\d+)\.\s)
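Conditional statement:
(?(R)            # IF you are in a recursion
                 # THEN match this (nothing in our case)
  |              # ELSE
  (\d+)\.\s      # match the item number
)
In this way the item number becomes mandatory at the top level, while nested brackets can still be matched without one.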
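The difference between the two variants can be seen with a short comparison. This is only a sketch: the input string and the variable names `$optional` and `$mandatory` are made up for illustration and do not come from the question.

```php
<?php
// Hypothetical input: one unnumbered top-level bracket and one numbered item.
$text = 'see [note] and [1. item [nested] text] end';

// Variant with the optional item number.
$optional  = '~\[ (?:(\d+)\.\s)? ( [^][]*+ (?:(?R) [^][]*)*+ ) ]~x';
// Variant with the conditional: the number is required unless we are in a recursion.
$mandatory = '~\[ (?(R)|(\d+)\.\s) ( [^][]*+ (?:(?R) [^][]*)*+ ) ]~x';

preg_match_all($optional, $text, $a);
preg_match_all($mandatory, $text, $b);

print_r($a[2]); // contents of "[note]" and of the numbered item
print_r($b[2]); // contents of the numbered item only
```

With the conditional, `[note]` is rejected at the top level because no item number follows the opening bracket, while `[nested]` is still accepted because it is only reached through the recursion.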