lab posts
Apr 03, 2009
Regex cheat sheet
From my Python class notes...
re.compile()
useful when the same pattern gets reused many times, e.g. searching every line inside a loop
here's the pattern:
-assign re.compile(r'REGEX') to an OBJECT
patternobj = re.compile(r'REGEX')
-use OBJECT.search(VAR)
for ITERATOR in SOMEFILE.readlines():
    if patternobj.search(ITERATOR):
        print(ITERATOR)
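A minimal runnable sketch of the loop above. The sample lines and the 'error' regex are made up for illustration; a list of strings stands in for SOMEFILE.readlines().

```python
import re

# Hypothetical sample data standing in for SOMEFILE.readlines()
lines = ["error: disk full\n", "all good\n", "ERROR: no route\n"]

# Compile once, then reuse the pattern object on every line
patternobj = re.compile(r'error', re.I)

matching = [line for line in lines if patternobj.search(line)]
for line in matching:
    print(line.rstrip())
```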
re.search()
useful for one-off matches
here's the pattern:
if re.search(r'REGEX', THING TO LOOK IN): do something
re.match()
always matches from the start of the string
here's the pattern:
if re.match(r'REGEX', THING TO LOOK IN): do something
re.compile() produces a PATTERNOBJ
re.search() and re.match() produce a MATCHOBJ
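To see the difference between the two, here's a small sketch (the sample string is made up). Both return a MATCHOBJ on success and None on failure, which is why they work directly in an if.

```python
import re

text = "spam and eggs"

found_anywhere = re.search(r'eggs', text)   # scans the whole string
found_at_start = re.match(r'eggs', text)    # only tries position 0
starts_with_spam = re.match(r'spam', text)  # pattern is at position 0

print(found_anywhere is not None)    # True
print(found_at_start is not None)    # False
print(starts_with_spam is not None)  # True
```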
MATCHOBJ.group()
gets the text matched by the group
here's the pattern:
string1 = "Find REGEXTHING in this sentence"
matchobj = re.search(r'(REGEX)', string1) #Note parens
id = matchobj.group(1) #Num refers to first paren group
or:
OBJECT = re.compile(r'(REGEX)') #parens needed for group(1)
matchobj = OBJECT.search(THING TO LOOK IN)
if matchobj:
    print("the matched text was '" + matchobj.group(1) + "'")
MATCHOBJ.groups()
returns the text of all paren groups as a tuple
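A quick sketch of group() next to groups(), using a made-up temperature string. The variable names are just for illustration.

```python
import re

text = "High: 33, low: 17"
matchobj = re.search(r'(\w+):\s+(\d+)', text)

if matchobj:
    whole = matchobj.group(0)   # the entire match
    label = matchobj.group(1)   # first paren group
    value = matchobj.group(2)   # second paren group
    both = matchobj.groups()    # every paren group, as a tuple
    print(whole, label, value, both)
```

Note that search() stops at the first match, so only the "High" pair shows up here.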
re.findall()
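finds all non-overlapping matches and returns them as a list (a list of strings, or a list of tuples when the regex has more than one paren group)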
here's the pattern:
OBJECT = re.findall(r'REGEX', THING TO LOOK IN)
as in
text = "High: 33, low: 17"
temp_tuples = re.findall(r'(\w+):\s+(\d+)', text)
print(temp_tuples) #[('High', '33'), ('low', '17')]
PATTERNOBJ.sub()
does search and replace on a string, using the pattern compiled by re.compile
here's the pattern:
OBJECT = ("some", "tuple", "of", "strings")
OBJ2 = re.compile(r'REGEX')
for ITERATOR in OBJECT: #loop over the strings, not the pattern
    ITERATOR = OBJ2.sub('THING TO SWAP IN', ITERATOR)
    print(ITERATOR)
options:
ITERATOR = OBJ2.sub('THING TO SWAP IN', ITERATOR, count=n)
#limits the number of substitutions made in each string to n
PATTERNOBJ.subn()
like sub(), but returns a 2-element tuple: the subbed text and the # of substitutions made
(THING TO LOOK IN, NUM) = PATTERNOBJ.subn('THING TO SWAP IN', THING TO LOOK IN)
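Here's a small sketch of sub(), the count option, and subn() side by side. The word list and the 'our' to 'or' swap are made up for illustration.

```python
import re

words = ("colour", "flavour", "colourful colours")
patternobj = re.compile(r'our')

# sub() replaces every match in each string
replaced = [patternobj.sub('or', w) for w in words]

# count=1 stops after the first substitution in the string
first_only = patternobj.sub('or', "colourful colours", count=1)

# subn() also reports how many substitutions were made
new_text, num = patternobj.subn('or', "colourful colours")

print(replaced)
print(first_only)
print(new_text, num)
```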
**
Flags, classes, qualifiers, and metachars
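^ = beginning of string (or of each line with re.MULTILINE)
$ = end of string (or of each line with re.MULTILINE)
\A = beginning of string, even with re.MULTILINE
\Z = end of string, even with re.MULTILINE
(?i) = case insensitive; put at the start of the regex
re.I = case insensitive alt; pass as the flags argument, after THING TO LOOK IN
re.MULTILINE = lets ^ and $ match at the start and end of every line, not just of the whole string
re.DOTALL = lets . match newlines too (splitting a string on a regex is re.split())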
\d = digit 0-9
\w = word char: letters, nums or underscores; contains \d
\s = any whitespace char
\D = not in \d
\W = not in \w
\S = not in \s
[abcdef] = custom character class; matches any one of the listed chars
[^abcdef] = negated custom char class; matches any one char not listed
string.hexdigits = string of hex digits (these string.* names are constants from the string module, not regex syntax)
string.ascii_letters
string.ascii_lowercase
string.ascii_uppercase
string.digits
string.punctuation
string.uppercase (Python 2 only; use string.ascii_uppercase in Python 3)
string.whitespace
string.printable
. = wildcard; any char except a newline (unless re.DOTALL is set)
\ = toggle/escape the char's function
* = 0 or more
+ = 1 or more
? = 0 or 1
{3,10} = between 3 and 10 (inclusive)
{3,} = 3 or more
() = grouping
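() \1 = backreference; \1 matches the same text the first paren group matched
\b = word boundary; a zero-width match between a \w char and a \W char (or the start/end of the string), so it consumes no whitespace or punctuation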
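A few of the entries above are easier to see in action. This is only a sketch with made-up sample strings; the variable names are mine.

```python
import re

text = "first line\nsecond line"

# ^ anchors at the start of the string unless re.MULTILINE is set
plain_caret = re.findall(r'^\w+', text)
multi_caret = re.findall(r'^\w+', text, re.MULTILINE)
# \A only ever matches at the very start of the string
multi_A = re.findall(r'\A\w+', text, re.MULTILINE)

# . stops at a newline unless re.DOTALL is set
no_dotall = re.search(r'first.*second', text)
with_dotall = re.search(r'first.*second', text, re.DOTALL)

# (?i) at the start of the regex acts like passing re.I
inline_flag = re.search(r'(?i)SECOND', text)

# \b is zero-width: it finds whole words without consuming the space or dot
whole_words = re.findall(r'\bline\b', "line lines inline line.")

# \1 must repeat whatever the first paren group matched
doubled = re.search(r'\b(\w+) \1\b', "it was the the best")

print(plain_caret, multi_caret, multi_A)
print(no_dotall is None, with_dotall is not None, inline_flag is not None)
print(whole_words, doubled.group(1))
```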
**
Working with files
FILEOBJ.read()
the idea is to read the whole file into a string and then apply the regex to it
here's the pattern:
FILE = open('FILENAME')
TEXT = FILE.read()
FILE.close()
if re.search(r'REGEX', TEXT, re.I):
    print("I found this: " + TEXT)
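A self-contained sketch of the same idea. It writes a throwaway temp file first so there's something to read; the file contents and the regex are made up, and the with blocks are just one way to make sure the file gets closed.

```python
import os
import re
import tempfile

# Create a throwaway file so the sketch runs on its own (hypothetical contents)
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("Line one\nThe ANSWER is 42\nLine three\n")
    filename = tmp.name

# Read the whole file into a single string
with open(filename) as fileobj:
    text = fileobj.read()

# Apply the regex to the full text, ignoring case
matchobj = re.search(r'answer is (\d+)', text, re.I)
if matchobj:
    print("I found this:", matchobj.group(0))

os.remove(filename)  # clean up the temp file
```

Printing matchobj.group(0) instead of the whole text shows just the part that matched.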
Mar 16, 2009
Loving Backdrop
Combined with Megazoomer, it saves me from myself.
Backdrop is a freeware app for Mac that hides everything but the app you're working in. It's different from Finder's 'Hide Others' command in that it also hides the desktop, and replaces it with a single color that you can tweak. For scatterbrained compulsives like me, it's great.
Combine that with Megazoomer, specially tweaked for TextMate. By default, Megazoomer expands the current window to fill the screen. I've got a tweaked version that only expands the window vertically, which makes more sense when you're writing something and want to keep the page normal.
