Regex quiz
From JmPm
Revision as of 12:25, 14 March 2007
q1.pl
Which of these patterns behaves differently from the others? /"(.*)"/ /"([^"]*)"/ /"(.*?)"/
q1-b.pl
$line1='xx"yyy"x';
$line2='xx"yy"zz"x';
($a) = $line1 =~ /"(.*)"/;
($b)= $line1 =~ /"([^"]*)"/;
($c)= $line1 =~ /"(.*?)"/;
print " line1 = $a, $b, $c\n";
($a) = $line2 =~ /"(.*)"/;
($b)= $line2 =~ /"([^"]*)"/;
($c)= $line2 =~ /"(.*?)"/;
print " line2 = $a, $b, $c\n";
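The following is a minimal sketch, not part of the original quiz, of where the three patterns agree and where they differ. The variable names are illustrative (the sketch avoids $a and $b because they are the special variables used by sort). It also adds a hypothetical trailing x to the pattern to show that the lazy and the negated-class versions are not always interchangeable.

```perl
use strict;
use warnings;

my $line = 'xx"yy"zz"x';

# No context after the closing quote.
my ($greedy)  = $line =~ /"(.*)"/;       # .* runs to the LAST quote
my ($negated) = $line =~ /"([^"]*)"/;    # cannot cross a quote
my ($lazy)    = $line =~ /"(.*?)"/;      # stops at the first closing quote

# With something required after the closing quote, the lazy version
# is allowed to cross a quote, while the negated class never is.
my ($lazy_x)    = $line =~ /"(.*?)"x/;
my ($negated_x) = $line =~ /"([^"]*)"x/;

print "greedy=$greedy negated=$negated lazy=$lazy\n";
print "lazy_x=$lazy_x negated_x=$negated_x\n";
# prints:
# greedy=yy"zz negated=yy lazy=yy
# lazy_x=yy"zz negated_x=zz
```

On the quiz data only the greedy pattern gives a different result, but once the pattern continues after the closing quote, the lazy quantifier can still expand past a quote to make the whole match succeed.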
q2.txt
Capture data into two variables: the data after n up to the next tab or the end of the line, and, if there is an e, the data after e up to the end of the line.
Example:
'nxxxxxx\teyyyyyyyy' gives $n=xxxxxx $e=yyyyyyyy
'nxxxxxx' gives $n=xxxxxx $e=
q2.data
nxxxx eyyyyyy
nxxxxx
q2-a.pl
open IN,"q2.data";
while ($line=<IN>){
chomp $line;
my ($n,$e)= $line=~ //;
print "n=$n,e=$e\n";
}
q2-b.pl
open IN,"q2.data";
while ($line=<IN>){
chomp $line;
# Alternative answer: my ($n,$e)= $line=~ /[ne]([^\t]+)\t?/g;
my ($n,$e)= $line=~ /n([^\t]+)(?:\te([^\t]+))?/;
print "n=$n,e=$e\n";
}
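As a self-contained check of the q2 requirement, here is a small sketch that runs on in-memory strings instead of q2.data. The anchored pattern and the @lines and @results names are assumptions made for illustration; it is one possible variant, not the quiz's official answer.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two lines of q2.data (tab-separated).
my @lines = ("nxxxx\teyyyyyy", "nxxxxx");
my @results;

for my $line (@lines) {
    # n, then everything up to a tab; optionally a tab, e, and the rest.
    my ($n, $e) = $line =~ /^n([^\t]*)(?:\te(.*))?$/;
    $e = '' unless defined $e;    # the optional group did not take part
    push @results, "n=$n,e=$e";
}

print "$_\n" for @results;
# prints:
# n=xxxx,e=yyyyyy
# n=xxxxx,e=
```

When the optional group does not match, its capture is undef, so the sketch replaces it with an empty string before printing to avoid a warning under use warnings.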
q3.txt
Capture the data between pairs of $$ that are followed by a letter. A single $ is data, not a separator.
Example: $a='AAAA' $b='BBB$BB$BBB' $c='CCCC'
or $array[1]='AAAA' $array[2]='BBB$BB$BBB' $array[3]='CCCC'
There can be more than three fields.
q3-a.pl
$line='$$aAAAA$$bBBB$BB$BBB$$cCCCC';
@fields=$line=~//;
($a,$b,$c)=$line=~//;
print "@fields\n";
print"a=$a,b=$b,c=$c";
q3-b.pl
$line='$$aAAAA$$bBBB$BB$BBB$$cCCCC';
@fields=$line=~/\$\$\w((?:(?!\$\$).)*)/g;
print "@fields\n";
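A possible variation on the q3-b answer, sketched here as an assumption rather than as the intended solution: capturing the letter as well lets the match list be assigned straight to a hash, so each field can be looked up by its letter. The %field name is illustrative.

```perl
use strict;
use warnings;

my $line = '$$aAAAA$$bBBB$BB$BBB$$cCCCC';

# Two captures per match: the letter after $$, then every character
# that does not start another $$ pair. In list context with /g this
# returns (letter, data, letter, data, ...), which fills a hash.
my %field = $line =~ /\$\$(\w)((?:(?!\$\$).)*)/g;

print "$_=$field{$_}\n" for sort keys %field;
# prints:
# a=AAAA
# b=BBB$BB$BBB
# c=CCCC
```

Note that \w also matches digits and underscore; [A-Za-z] would restrict the tag to letters if that matters.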
q4.txt
Capture a number of unknown length that follows ab or a in the text but is not followed by c.
Example:
ab123c or a123c gives $num=
ab123x or a123x gives $num='123'
a1z or ab1 z gives $num='1'
q4.data
xxxxab123cxxxx
xxxxab123xxx
xxxa12345xxxx
xxxxb123xxx
xxxa1234cxxxx
q4-a.pl
open IN,"q4.data";
while ($line=<IN>){
chomp $line;
print "line = $line\n";
($n)=$line=~/ab?(\d+)[^c\d]/;
print "n=$n\n";
}
q4-b.pl
open IN,"q4.data";
while ($line=<IN>){
chomp $line;
print "line = $line\n";
$line=~s/ab?(\d+)c//;
($n)=$line=~/ab?(\d+)/;
print "n=$n\n";
}
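Another way to express "not followed by c" is a negative lookahead. The sketch below is an assumption about what a third answer might look like, not part of the original quiz. It uses in-memory copies of the q4.data lines plus one hypothetical extra line, xxxab77, where the number sits at the end of the line. The pattern /ab?(\d+)[^c\d]/ from q4-a needs a character after the number, so it would not match that extra line.

```perl
use strict;
use warnings;

# The q4.data lines, plus a hypothetical line ending in the number.
my @lines = qw(xxxxab123cxxxx xxxxab123xxx xxxa12345xxxx
               xxxxb123xxx xxxa1234cxxxx xxxab77);
my %found;

for my $line (@lines) {
    # (?![c\d]) asserts that the next character is neither c nor a
    # digit, without consuming it, so end of line is also accepted.
    # The \d in the lookahead stops \d+ from backing off to a shorter
    # number just to dodge the c.
    my ($n) = $line =~ /ab?(\d+)(?![c\d])/;
    $found{$line} = $n;
    printf "%-16s n=%s\n", $line, defined $n ? $n : '';
}
```

Unlike q4-b, this leaves $line unmodified.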
q5.txt
Separate a line by commas. If there are quotation marks, take what is inside the quotes. A comma inside quotes is a literal, not a separator.
Example: "aaaa","bbb,bbbb",ccc,dddd
$a=aaaa $b=bbb,bbbb $c=ccc $d=dddd
q5-b.pl
$a='"aaaa","bbb,bbbb",ccc,dddd';
@arr=$a=~//;
foreach (@arr){
print"***$_******\n";
}
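The regex in q5-b.pl is left blank above. One possible way to fill it in, offered as a sketch and not as the author's answer, uses a branch reset group (?|...), which needs Perl 5.10 or later. The variable is named $text here instead of $a. The sketch assumes there are no empty fields and no escaped quotes inside quoted fields.

```perl
use strict;
use warnings;

my $text = '"aaaa","bbb,bbbb",ccc,dddd';

# Either a quoted field (capture what is inside the quotes) or a run
# of non-comma characters. The branch reset makes both alternatives
# fill the same capture group, so /g returns one value per field.
my @arr = $text =~ /(?|"([^"]*)"|([^,]+))/g;

print "***$_******\n" for @arr;
# prints:
# ***aaaa******
# ***bbb,bbbb******
# ***ccc******
# ***dddd******
```

For real CSV input with empty fields or embedded quotes, a dedicated parser such as the CPAN module Text::CSV is likely a safer choice than a single regex.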