=head1 NAME

perlreref - Perl Regular Expressions Reference

=head1 DESCRIPTION

This is a quick reference to Perl's regular expressions.
For full information see L<perlre> and L<perlop>, as well
as the L</"SEE ALSO"> section in this document.

=head2 OPERATORS

C<=~> determines to which variable the regex is applied.
In its absence, $_ is used.

    $var =~ /foo/;

C<!~> determines to which variable the regex is applied,
and negates the result of the match; it returns
false if the match succeeds, and true if it fails.

    $var !~ /foo/;

C<m/pattern/msixpogc> searches a string for a pattern match,
applying the given options.

    m  Multiline mode - ^ and $ match internal lines
    s  match as a Single line - . matches \n
    i  case-Insensitive
    x  eXtended legibility - free whitespace and comments
    p  Preserve a copy of the matched string -
       ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} will be defined.
    o  compile pattern Once
    g  Global - all occurrences
    c  don't reset pos on failed matches when using /g

If 'pattern' is an empty string, the last I<successfully> matched
regex is used. Delimiters other than '/' may be used for both this
operator and the following ones. The leading C<m> can be omitted
if the delimiter is '/'.

C<qr/pattern/msixpo> lets you store a regex in a variable,
or pass one around. Modifiers as for C<m//>, and are stored
within the regex.

C<s/pattern/replacement/msixpogce> substitutes matches of
'pattern' with 'replacement'. Modifiers as for C<m//>,
with one addition:

    e  Evaluate 'replacement' as an expression

'e' may be specified multiple times. 'replacement' is interpreted
as a double quoted string unless a single-quote (C<'>) is the delimiter.

C<?pattern?> is like C<m/pattern/> but matches only once. No alternate
delimiters can be used.  Must be reset with reset().
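
As a minimal sketch of how these operators combine (the variable
names and the sample string are made up for illustration):

    use strict;
    use warnings;

    my $text = "2015-03-17 and 2016-04-18";

    # m// with /g in list context returns every captured match.
    my @years = $text =~ /(\d{4})-\d\d-\d\d/g;

    # !~ is true when the pattern does NOT match.
    print "no x, y or z\n" if $text !~ /[x-z]/;

    # qr// stores a compiled regex, modifiers included.
    my $date = qr/(\d{4})-(\d\d)-(\d\d)/;

    # s///e evaluates the replacement as Perl code; /g repeats it.
    (my $bumped = $text) =~ s/$date/($1 + 1) . "-$2-$3"/ge;
    print "$bumped\n";    # 2016-03-17 and 2017-04-18

Copying C<$text> into C<$bumped> before substituting leaves the
original string untouched.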

=head2 SYNTAX

    \       Escapes the character immediately following it
    .       Matches any single character except a newline
            (unless /s is used)
    ^       Matches at the beginning of the string (or line, if /m is used)
    $       Matches at the end of the string (or line, if /m is used)
    *       Matches the preceding element 0 or more times
    +       Matches the preceding element 1 or more times
    ?       Matches the preceding element 0 or 1 times
    {...}   Specifies a range of occurrences for the element preceding it
    [...]   Matches any one of the characters contained within the brackets
    (...)   Groups subexpressions for capturing to $1, $2...
    (?:...) Groups subexpressions without capturing (cluster)
    |       Matches either the subexpression preceding or following it
    \1, \2, \3 ...            Matches the text from the Nth group
    \g1 or \g{1}, \g2 ...     Matches the text from the Nth group
    \g-1 or \g{-1}, \g-2 ...  Matches the text from the Nth previous group
    \g{name}                  Named backreference
    \k<name>                  Named backreference
    \k'name'                  Named backreference
    (?P=name)                 Named backreference (python syntax)

=head2 ESCAPE SEQUENCES

These work as in normal strings.

    \a        Alarm (beep)
    \e        Escape
    \f        Formfeed
    \n        Newline
    \r        Carriage return
    \t        Tab
    \037      Any octal ASCII value
    \x7f      Any hexadecimal ASCII value
    \x{263a}  A wide hexadecimal value
    \cx       Control-x
    \N{name}  A named character

    \l  Lowercase next character
    \u  Titlecase next character
    \L  Lowercase until \E
    \U  Uppercase until \E
    \Q  Disable pattern metacharacters until \E
    \E  End modification

For Titlecase, see L</Titlecase>.

This one works differently from normal strings:

    \b  An assertion, not backspace, except in a character class

=head2 CHARACTER CLASSES

    [amy]    Match 'a', 'm' or 'y'
    [f-j]    Dash specifies "range"
    [f-j-]   Dash escaped or at start or end means 'dash'
    [^f-j]   Caret indicates "match any character _except_ these"

The following sequences work within or without a character class.
The first six are locale aware, all are Unicode aware. See L<perllocale>
and L<perlunicode> for details.

    \d      A digit
    \D      A nondigit
    \w      A word character
    \W      A non-word character
    \s      A whitespace character
    \S      A non-whitespace character
    \h      A horizontal whitespace character
    \H      A non-horizontal whitespace character
    \v      A vertical whitespace character
    \V      A non-vertical whitespace character
    \R      A generic newline           (?>\v|\x0D\x0A)

    \C      Match a byte (with Unicode, '.' matches a character)
    \pP     Match P-named (Unicode) property
    \p{...} Match Unicode property with long name
    \PP     Match non-P
    \P{...} Match lack of Unicode property with long name
    \X      Match extended Unicode combining character sequence

POSIX character classes and their Unicode and Perl equivalents:

    alnum   IsAlnum                    Alphanumeric
    alpha   IsAlpha                    Alphabetic
    ascii   IsASCII                    Any ASCII char
    blank   IsSpace      [ \t]         Horizontal whitespace (GNU extension)
    cntrl   IsCntrl                    Control characters
    digit   IsDigit      \d            Digits
    graph   IsGraph                    Alphanumeric and punctuation
    lower   IsLower                    Lowercase chars (locale and Unicode aware)
    print   IsPrint                    Alphanumeric, punct, and space
    punct   IsPunct                    Punctuation
    space   IsSpace      [\s\ck]       Whitespace
            IsSpacePerl  \s            Perl's whitespace definition
    upper   IsUpper                    Uppercase chars (locale and Unicode aware)
    word    IsWord       \w            Alphanumeric plus _ (Perl extension)
    xdigit  IsXDigit     [0-9A-Fa-f]   Hexadecimal digit

Within a character class:

    POSIX       traditional   Unicode
    [:digit:]       \d        \p{IsDigit}
    [:^digit:]      \D        \P{IsDigit}

=head2 ANCHORS

All are zero-width assertions.

    ^  Match string start (or line, if /m is used)
    $  Match string end (or line, if /m is used) or before newline
    \b Match word boundary (between \w and \W)
    \B Match except at word boundary (between \w and \w or \W and \W)
    \A Match string start (regardless of /m)
    \Z Match string end (before optional newline)
    \z Match absolute string end
    \G Match where previous m//g left off

    \K Keep the stuff left of the \K, don't include it in $&

=head2 QUANTIFIERS

Quantifiers are greedy by default -- match the B<longest> leftmost.

    Maximal Minimal Possessive Allowed range
    ------- ------- ---------- -------------
    {n,m}   {n,m}?  {n,m}+     Must occur at least n times
                               but no more than m times
    {n,}    {n,}?   {n,}+      Must occur at least n times
    {n}     {n}?    {n}+       Must occur exactly n times
    *       *?      *+         0 or more times (same as {0,})
    +       +?      ++         1 or more times (same as {1,})
    ?       ??      ?+         0 or 1 time (same as {0,1})

The possessive forms (new in Perl 5.10) prevent backtracking: what gets
matched by a pattern with a possessive quantifier will not be backtracked
into, even if that causes the whole match to fail.

There is no quantifier {,n} -- that gets understood as a literal string.

=head2 EXTENDED CONSTRUCTS

    (?#text)          A comment
    (?:...)           Groups subexpressions without capturing (cluster)
    (?pimsx-imsx:...) Enable/disable option (as per m// modifiers)
    (?=...)           Zero-width positive lookahead assertion
    (?!...)           Zero-width negative lookahead assertion
    (?<=...)          Zero-width positive lookbehind assertion
    (?<!...)          Zero-width negative lookbehind assertion
    (?>...)           Grab what we can, prohibit backtracking
    (?|...)           Branch reset
    (?<name>...)      Named capture
    (?'name'...)      Named capture
    (?P<name>...)     Named capture (python syntax)
    (?{ code })       Embedded code, return value becomes $^R
    (??{ code })      Dynamic regex, return value used as regex
    (?N)              Recurse into subpattern number N
    (?-N), (?+N)      Recurse into Nth previous/next subpattern
    (?R), (?0)        Recurse at the beginning of the whole pattern
    (?&name)          Recurse into a named subpattern
    (?P>name)         Recurse into a named subpattern (python syntax)
    (?(cond)yes|no)
    (?(cond)yes)      Conditional expression, where "cond" can be:
                      (N)       subpattern N has matched something
                      (<name>)  named subpattern has matched something
                      ('name')  named subpattern has matched something
                      (?{code}) code condition
                      (R)       true if recursing
                      (RN)      true if recursing into Nth subpattern
                      (R&name)  true if recursing into named subpattern
                      (DEFINE)  always false, no no-pattern allowed

=head2 VARIABLES

    $_    Default variable for operators to use

    $`    Everything prior to matched string
    $&    Entire matched string
    $'    Everything after matched string

    ${^PREMATCH}   Everything prior to matched string
    ${^MATCH}      Entire matched string
    ${^POSTMATCH}  Everything after matched string

The use of C<$`>, C<$&> or C<$'> will slow down B<all> regex use
within your program. Consult L<perlvar> for C<@->
to see equivalent expressions that won't cause a slowdown.
See also L<Devel::SawAmpersand>. Starting with Perl 5.10, you
can also use the equivalent variables C<${^PREMATCH}>, C<${^MATCH}>
and C<${^POSTMATCH}>, but for them to be defined, you have to
specify the C</p> (preserve) modifier on your regular expression.

    $1, $2 ...  hold the Xth captured expr
    $+    Last parenthesized pattern match
    $^N   Holds the most recently closed capture
    $^R   Holds the result of the last (?{...}) expr
    @-    Offsets of starts of groups. $-[0] holds start of whole match
    @+    Offsets of ends of groups. $+[0] holds end of whole match
    %+    Named capture buffers
    %-    Named capture buffers, as array refs

Captured groups are numbered according to their I<opening> paren.

=head2 FUNCTIONS

    lc          Lowercase a string
    lcfirst     Lowercase first char of a string
    uc          Uppercase a string
    ucfirst     Titlecase first char of a string

    pos         Return or set current match position
    quotemeta   Quote metacharacters
    reset       Reset ?pattern? status
    study       Analyze string for optimizing matching

    split       Use a regex to split a string into parts

The first four of these are like the escape sequences C<\L>, C<\l>,
C<\U>, and C<\u>.  For Titlecase, see L</Titlecase>.

=head2 TERMINOLOGY

=head3 Titlecase

Unicode concept which most often is equal to uppercase, but for
certain characters like the German "sharp s" there is a difference.

=head1 AUTHOR

Iain Truskett. Updated by the Perl 5 Porters.

This document may be distributed under the same terms as Perl itself.

=head1 SEE ALSO

=over 4

=item *

L<perlretut> for a tutorial on regular expressions.

=item *

L<perlrequick> for a rapid tutorial.

=item *

L<perlre> for more details.

=item *

L<perlvar> for details on the variables.

=item *

L<perlop> for details on the operators.

=item *

L<perlfunc> for details on the functions.

=item *

L<perlfaq6> for FAQs on regular expressions.

=item *

L<perlrebackslash> for a reference on backslash sequences.

=item *

L<perlrecharclass> for a reference on character classes.

=item *

The L<re> module to alter behaviour and aid
debugging.

=item *

L<perldebug/"Debugging regular expressions">

=item *

L<perluniintro>, L<perlunicode>, L<charnames> and L<perllocale>
for details on regexes and internationalisation.

=item *

I<Mastering Regular Expressions> by Jeffrey Friedl
(F<http://regex.info/>) for a thorough grounding and
reference on the topic.

=back

=head1 THANKS

David P.C. Wollmann,
Richard Soderberg,
Sean M. Burke,
Tom Christiansen,
Jim Cromie,
and
Jeffrey Goff
for useful advice.

=cut