Regular Expressions - grep sim and password validation
Question 1 (From Textbook)
Write a simple program to simulate the operation of the grep command on Unix. Ask the user to enter a regular expression and count the number of lines that matched the regular expression:
$ python grep.py
Enter a regular expression: ^Author
mbox.txt had 1798 lines that matched ^Author
To code this program, first we need to understand how grep works.
Given one or more patterns, grep searches input files for matches to the patterns.
When it finds a match in a line, it copies the line to standard output (by default), or produces whatever other sort of output you have requested with options.
grep [option...] [patterns] [file...]
We're not going to pass any options. We'll simulate grep with a Python function that takes a pattern and a file name:
- the regular expression can be any valid regex in Python.
- the file name is going to be mbox.txt.
We can use the search, match, or findall functions from re that we have seen in class.
re.search(pattern,string)
scans entire string looking for a match
returns a match object if found or returns None object
re.match(pattern,string)
returns a match object only if the pattern matches at the beginning of the string.
** To look anywhere in the string, use search() instead of match().
We are only interested in counting the number of lines, not necessarily the number of occurrences in a line, or the substring that matches the expression, so we won't use findall()
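To make the difference between the three functions concrete, here is a small sketch. The sample line is made up for illustration and does not come from mbox.txt.

```python
import re

line = "From: alice@example.com, bob@example.org"  # made-up sample line

# match() only succeeds when the pattern matches at the start of the string.
at_start = re.match(r"From:", line)
not_at_start = re.match(r"\w+@\w+", line)  # None: the line does not start with an address

# search() scans the whole string and stops at the first match.
first_hit = re.search(r"\w+@\w+", line)

# findall() returns every non-overlapping match as a list of strings.
all_hits = re.findall(r"\w+@\w+", line)

print(at_start is not None)  # True
print(not_at_start)          # None
print(first_hit.group())     # alice@example
print(all_hits)              # ['alice@example', 'bob@example']
```

Since search() already tells us whether a line matches at all, it is enough for counting lines.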
Program:
import re

def grep(pattern, fname):
    # return the number of lines in fname that match pattern
    count = 0
    with open(fname) as fhand:
        for line in fhand:
            if re.search(pattern, line):
                count += 1
    return count

fname = 'mbox.txt'
pattern1 = r'\w+@\w+'  # resembles an email address
pattern2 = '^From:'
count1 = grep(pattern1, fname)
count2 = grep(pattern2, fname)
print(fname, 'had', count1, 'lines that matched', pattern1)
print(fname, 'had', count2, 'lines that matched', pattern2)
Output:
grep_demo.py
mbox.txt had 22009 lines that matched \w+@\w+
mbox.txt had 1797 lines that matched ^From:
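As a possible variation, the pattern can be compiled once and the counting done over any iterable of lines, which makes the function easy to try without the real file. The name count_matching_lines and the sample text below are assumptions made for this sketch; they are not part of the textbook exercise.

```python
import io
import re

def count_matching_lines(pattern, lines):
    """Count the lines in which the regular expression matches at least once."""
    regex = re.compile(pattern)  # compile once, reuse for every line
    return sum(1 for line in lines if regex.search(line))

# Hypothetical stand-in for mbox.txt, so the sketch runs without the real file.
sample = io.StringIO(
    "From: alice@example.com\n"
    "Subject: hello\n"
    "Author: alice@example.com\n"
    "From: bob@example.org\n"
)
lines = sample.readlines()

print(count_matching_lines(r"^From:", lines))   # 2
print(count_matching_lines(r"\w+@\w+", lines))  # 3
```

With the real file, an open file handle can be passed directly as the lines argument, because iterating over a file yields its lines.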
Question 2
Write a program to check the validity of a password entered by the user. The following criteria should be used to check for validity.
Password should have at least
i) One lowercase letter
ii) One Digit
iii) One uppercase letter
iv) One special character from [$#@!]
v) A length of six characters
Your program should accept a password and check the validity using the above criteria and print "valid" or "invalid" as the case may be.
If a search pattern is not found, re.search() returns None object... we can use this to check.
Program:
import re

def validate(pswd):
    # the password is invalid if any single criterion fails
    if len(pswd) < 6 or \
       re.search('[a-z]', pswd) is None or \
       re.search('[A-Z]', pswd) is None or \
       re.search('[0-9]', pswd) is None or \
       re.search('[#$@!]', pswd) is None:
        print('Invalid')
    else:
        print('Valid')

password = input("Enter Password:")
validate(password)
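The same None check can also report which criteria failed, which may help when trying out passwords. This is only a sketch: the names RULES and failed_rules are made up for illustration, and the length rule is expressed here as the pattern .{6,} instead of len().

```python
import re

# Each rule pairs a description with a pattern that must be found somewhere.
RULES = [
    ("a lowercase letter", r"[a-z]"),
    ("an uppercase letter", r"[A-Z]"),
    ("a digit", r"[0-9]"),
    ("a special character from $#@!", r"[$#@!]"),
    ("at least six characters", r".{6,}"),
]

def failed_rules(pswd):
    """Return the descriptions of all rules the password does not satisfy."""
    return [name for name, pattern in RULES if re.search(pattern, pswd) is None]

print(failed_rules("$20Rddfd"))  # []
print(failed_rules("aaZ012_"))   # ['a special character from $#@!']
print(failed_rules("$a5Rd"))     # ['at least six characters']
```

An empty list means the password is valid.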
Regular expressions have an 'or' operator (|), but in this case we need 'and' logic. The 'and' logic can be simulated using a lookahead assertion, (?=...), where the ?= goes inside the parentheses.
Lookaheads are position-sensitive, so
>>> s='adverb'
>>> re.search('ad(?=verb)',s)
<re.Match object; span=(0, 2), match='ad'>
>>> s='verb'
>>> re.search('ad(?=verb)',s)
It only matches 'ad' if it is immediately followed by 'verb'.
The lookahead only matches from the current match position. We need to tweak it a little for the password problem.
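A short sketch of what position-sensitive means, and of how a leading .* relaxes it, using the same 'adverb' string:

```python
import re

s = "adverb"

# The lookahead is tested where 'ad' ends, so 'verb' must come next.
m_next = re.search(r"ad(?=verb)", s)   # matches 'ad'
m_gap = re.search(r"ad(?=erb)", s)     # None: 'erb' does not start right after 'ad'

# Prefixing .* lets the lookahead skip ahead and find its target later on.
m_any = re.search(r"ad(?=.*erb)", s)   # matches 'ad'

# A lookahead consumes nothing, so several can be stacked at the same position.
m_both = re.search(r"^(?=.*v)(?=.*d)", s)

print(m_next.group())  # ad
print(m_gap)           # None
print(m_any.group())   # ad
print(m_both.span())   # (0, 0)
```

Stacking lookaheads at the same position is what gives the 'and' behaviour: every one of them has to succeed for the overall match to succeed.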
There is another construct, (?:...), the non-capturing group. It matches and consumes text like an ordinary group, but the matched text is not saved as a group for later retrieval.
This does not suit our case either.
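For comparison, here is a sketch of a capturing group, a non-capturing group, and a lookahead applied to the same string:

```python
import re

s = "adverb"

# Capturing group: the text matched inside (...) is saved as group 1.
m1 = re.search(r"ad(verb)", s)

# Non-capturing group: still matches and consumes 'verb', but saves no group.
m2 = re.search(r"ad(?:verb)", s)

# Lookahead: checks that 'verb' follows without consuming it.
m3 = re.search(r"ad(?=verb)", s)

print(m1.group(), m1.groups())  # adverb ('verb',)
print(m2.group(), m2.groups())  # adverb ()
print(m3.group(), m3.groups())  # ad ()
```

Only the lookahead leaves the match position unchanged, which is why it can be repeated for several independent conditions.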
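We will use lookaheads of the form (?=.*X), where the leading .* lets the lookahead find X anywhere after the current position. Since the search starts at the beginning of the string, that means anywhere in the password.
The last one, (?=.{6,}), checks that there are at least 6 characters.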
>>> pswd='$20Rddfd'
>>> re.search('(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[$#!@])(?=.{6,})',pswd)
<re.Match object; span=(0, 0), match=''>
>>> pswd='$a5Rd'
>>> re.search('(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[$#!@])(?=.{6,})',pswd)
Program :
import re

def validate(pswd):
    if re.search('(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[$#!@])(?=.{6,})', pswd):
        print('Valid')
    else:
        print('Invalid')

password = input("Enter Password:")
validate(password)
Output:
Enter Password:$gt415A
Valid
Enter Password:rEd_b%$45
Valid
Enter Password:aaZ012_
Invalid
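To check several passwords without typing each one, the lookahead pattern can be wrapped in a function that returns True or False. This is a sketch only: the names PASSWORD_RE and is_valid are made up here, and the leading ^ is an addition that stops the search from retrying at later positions once the check at the start has failed.

```python
import re

# Same lookahead pattern as in the program, compiled once; ^ pins it to the start.
PASSWORD_RE = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[$#!@])(?=.{6,})"
)

def is_valid(pswd):
    """Return True if the password meets all five criteria."""
    return PASSWORD_RE.search(pswd) is not None

for candidate in ["$gt415A", "rEd_b%$45", "aaZ012_", "$a5Rd"]:
    print(candidate, "Valid" if is_valid(candidate) else "Invalid")
```

The first two candidates print Valid. The third has no character from [$#!@] and the fourth is only five characters long, so both print Invalid.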