webguru Site Admin
Joined: 07 Aug 2005 Posts: 19 Location: New York
Subject: Regular Expressions Reference Posted: Wed October 08, 2008
Code:
=====================================
Regular Expressions - notes gathered by Dave Nguyen
=====================================
. matches any one character (except newline)
$ matches end of the line (or before newline at the end)
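\ escapes the metacharacter to its right so it is matched literally
? matches the preceding character or group zero or one time
* matches the preceding character or group zero or more times
+ matches the preceding character or group one or more times
| alternation (matches the expression on either side)
^ matches start of the line; as the first character inside [], negates the class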
() grouping (of operators)
[] character class
{} numeric quantifiers
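
The metacharacters above can be tried out quickly. This is a minimal sketch assuming Python's re module (other regex flavors differ in small details); the cases list and the whole_match helper are made up for illustration.

```python
import re

# Each entry: (pattern, text, whether the WHOLE text should match).
cases = [
    (r"3.14", "3x14", True),      # . matches any one character
    (r"3\.14", "3x14", False),    # \. matches only a literal dot
    (r"comm?a", "coma", True),    # ? makes the preceding m optional
    (r"ab*c", "ac", True),        # * allows zero b's
    (r"ab+c", "ac", False),       # + requires at least one b
    (r"cat|dog", "dog", True),    # | alternation
    (r"[^0-9]", "7", False),      # ^ inside [] negates the class
    (r"(ab){2}", "abab", True),   # () groups, {} counts repetitions
]

def whole_match(pattern, text):
    """Return True if pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

results = [whole_match(p, t) for p, t, _ in cases]
for (pattern, text, expected), got in zip(cases, results):
    print(f"{pattern!r:12} vs {text!r:8} -> {got}")

# Outside a class, ^ and $ anchor to the start and end of the line.
starts_with_x = re.search(r"^x", "xyz") is not None
x_not_at_start = re.search(r"^x", "axe") is None
ends_with_e = re.search(r"e$", "axe") is not None
```

re.fullmatch is used so that each pattern has to cover the whole string; re.search would also report a match anywhere inside a longer string.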
Character Class Match
--------------- = -----
\d = matches a digit, same as [0-9]
\D = matches a non-digit, same as [^0-9]
\s = matches a whitespace character (space, tab, newline, etc.)
\S = matches a non-whitespace character
\w = matches a word character (letter, digit, or underscore)
\W = matches a non-word character
\b = matches a word-boundary (NOTE: within a class, matches a backspace)
\B = matches a non-word-boundary
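
A short sketch of the shorthand classes in action, again assuming Python's re module. The sample text is invented for the example.

```python
import re

text = "Order 66 shipped to bay_7 at 10:45"

digits = re.findall(r"\d+", text)   # runs of digits
words = re.findall(r"\w+", text)    # runs of letters, digits, and underscores
spaces = re.findall(r"\s", text)    # individual whitespace characters

# \b needs a word character on one side and a non-word character (or the
# string edge) on the other. In "bay_7" the y is followed by "_", which is
# a word character, so there is no boundary after "bay".
bay_as_word = re.search(r"\bbay\b", text)
at_as_word = re.search(r"\bat\b", text)

# Inside a character class, \b means backspace (0x08) instead.
backspace_in_class = re.fullmatch(r"[\b]", "\x08") is not None

print(digits)
print(words)
```

Note that the colon in 10:45 is not a word character, so \w+ splits it into 10 and 45, while the underscore keeps bay_7 together.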
POSIX Class Match
----------- -----
[:alnum:] = alphabetic and numeric characters
[:alpha:] = alphabetic characters
[:blank:] = space and tab
[:cntrl:] = control characters
[:digit:] = digits
[:graph:] = visible characters (printable characters except space)
[:lower:] = lowercase alphabetic characters
[:print:] = printable characters, including space
[:punct:] = punctuation characters
[:space:] = all whitespace characters (space, tab, newline, carriage return, form feed, vertical tab)
[:upper:] = uppercase alphabetic characters
[:xdigit:] = digits allowed in a hexadecimal number (i.e. 0-9, a-f, A-F)
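
POSIX classes are written inside a bracket expression, e.g. [[:alpha:]], and not every engine supports them. Python's re module does not, so the sketch below swaps a few of them for plain ASCII ranges. The POSIX_ASCII table and the expand_posix helper are hypothetical names for illustration only, and the ranges ignore locale and Unicode.

```python
import re

# Hypothetical lookup table: ASCII-only stand-ins for a few POSIX classes.
POSIX_ASCII = {
    "alnum": "A-Za-z0-9",
    "alpha": "A-Za-z",
    "blank": r" \t",
    "digit": "0-9",
    "lower": "a-z",
    "space": r" \t\n\r\f\v",
    "upper": "A-Z",
    "xdigit": "0-9A-Fa-f",
}

def expand_posix(pattern):
    """Replace each [:name:] token with the ASCII range from the table."""
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_ASCII[m.group(1)], pattern)

identifier = expand_posix("[[:alpha:]][[:alnum:]]*")
print(identifier)  # [A-Za-z][A-Za-z0-9]*

is_hex = re.fullmatch(expand_posix("[[:xdigit:]]+"), "c0ffee") is not None
is_upper = re.fullmatch(expand_posix("[[:upper:]]+"), "abc") is not None
```

A class name that is missing from the table raises KeyError, which is deliberate here: silently passing an unknown class through would hide the mistake.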
Examples
-------------------------------------
Pattern Matches
------- -------
3.14 = 3.14, 3514, 3f14, 3_14, ...
3\.14 = only 3.14
m?ethane = only ethane, methane
comm?a = only coma, comma
ab*c = ac, abc, abbc, abbbc, abbbbc, ...
ab+c = abc, abbc, abbbc, abbbbc, ...
cyclo.*ane = cycloane, cycloxane, cyclowerqnln!0389ane, cyclones are insane, ...
a\.*z = az, a.z, a..z, a...z, a....z, ...
a.\*z = a_*z, aX*z, a0*z, any four-character string starting with a and ending with *z
a\++z = a+z, a++z, a+++z, a++++z, ...
a+\+z = aa+z, aaa+z, aaaa+z, aaaaa+z, ...
a\+\+z = only a++z
a.?e = ae, aae, abe, ace, aZe, ... (a, then at most one of any character, then e)
a\.?e = only ae, a.e
a.\?e = aa?e, ab?e, aX?e, any four-character string starting with a and ending with ?e
a\.\?e = only a.?e
^x = x at the start of a line
[^x] = any one character except x
^[dave] = d, a, v, or e at the start of a line
[^dave] = any one character except d, a, v, and e
x{3} = only xxx
x{3,} = at least 3 occurrences of x, e.g. xxx, xxxx, xxxxx
x{3,5} = only xxx, xxxx, or xxxxx
[0-9] = a single digit
[a-z] = a single lowercase letter
[A-Za-z0-9] = a single uppercase letter, a single lowercase letter, or a single digit
[0-35-8] = any digit except 4 and 9 (a comma inside [] is literal, so [0-3,5-8] also matches ",")
www\.([a-z]+)\.com = www.anylowercasecharacters.com
\d{1} or [0-9] = 0 - 9
[1-9]\d{1} or [1-9][0-9] = 10 - 99
1\d{2} or 1[0-9][0-9] = 100 - 199
2[0-4]\d{1} or 2[0-4][0-9] = 200 - 249
25[0-5] = 250 - 255
25[0-5]|2[0-4][0-9]|1\d{2}|[1-9][0-9]|[0-9] = 255 - 0
((\d{1,3})\.){3}(\d{1,3}) = 0.0.0.0 - 999.999.999.999
(\d+\.){3}\d+ = 0.0.0.0 - infinity.infinity.infinity.infinity
((25[0-5]|2[0-4][0-9]|1\d{2}|[1-9][0-9]|[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1\d{2}|[1-9][0-9]|[0-9]) = 255.255.255.255 - 0.0.0.0
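
The last pattern above can be turned into a small validator. This is a sketch assuming Python's re module; OCTET, IPV4, and is_ipv4 are names chosen for the example. It uses [0-9] instead of \d because in Python 3 \d also matches non-ASCII digits, and non-capturing groups (?:...) because the captured pieces are not needed.

```python
import re

# One octet, 0-255, built from the same alternatives as the notes above.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"

# Three "octet." groups followed by a final octet.
IPV4 = re.compile(rf"(?:{OCTET}\.){{3}}{OCTET}")

def is_ipv4(text):
    """Return True only if the whole string is a dotted-quad address."""
    return IPV4.fullmatch(text) is not None

for candidate in ["192.168.0.1", "255.255.255.255", "256.1.1.1", "1.2.3", "01.2.3.4"]:
    print(candidate, is_ipv4(candidate))

# Without anchoring, the pattern can match inside an invalid address.
partial = IPV4.search("999.1.1.1")
print(partial.group())  # 99.1.1.1
```

The unanchored search at the end is the reason for fullmatch: the pattern alone happily finds 99.1.1.1 inside 999.1.1.1, so validation needs the whole string to match (or explicit ^ and $ anchors).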