Programming Assignment #5 (110 Points)

Due: Fri, Mar 07, 5:00pm

The overall objective of this assignment is to introduce you to Scala, including classes, objects, closures, iteration, as well as its functional features.

To start, download pa5.zip and run unzip pa5.zip. This will unleash two files

containing several skeleton Scala functions, with missing bodies denoted by the placeholder text sys.error("TO BE WRITTEN"). Your task is to replace the placeholder text in those files with the appropriate Scala code for each case.

The zipfile also contains the following helper files that you will not modify:

Assignment Testing and Evaluation

Your functions/programs must compile and/or run on a ACS Linux machine (e.g. ieng6.ucsd.edu) as this is where your solutions will be checked. While you may develop your code on any system, ensure that your code runs as expected on an ACS machine prior to submission. You should test your code in the directories from which the zip files (see below) will be created, as this will approximate the environment used for grading the assignment.

Most of the points, will be awarded automatically, by evaluating your functions against a given test suite. Test.scala contains a very small suite of tests which gives you a flavor of of these tests. At any stage, by typing at the UNIX shell:

> make test | grep "130>>" > log
you will get a report on how your code stacks up against the simple tests.

The last line of the file log must contain the word "Compiled" otherwise you get a zero for the whole assignment.

If for some problem, you cannot finish a function, leave it as is with the sys.error("TO BE WRITTEN") with your partial solution enclosed below as a comment.

The second last line of the log file will contain your overall score, and the other lines will give you a readout for each test. You are encouraged to try to understand the code in Test.scala, and subsequently devise your own tests and add them to Test.scala, but you will not be graded on this.

Alternately, inside the Scala shell, type:

scala> import Test._ 
scala> Test() 
  .
  .
130>> Results: ...
130>> Compiled
and it should return a pair of integers, reflecting your score and the max possible score on the sample tests. If instead an error message appears, your code will receive a zero.

Submission Instructions

To turn in your solution, simply type

$ make turnin
in the directory in which you are working. turnin will provide you with a confirmation of the submission process; make sure that the size of the file indicated by turnin matches the size of your zip file.

Scala Tips

Installing Scala at Home

You may download and install Scala on your home machines. You will need to also install Java 1.6 (sudo apt-get install open-6-jdk on Ubuntu).

Remember that this is only to enable you to play with the assignment at home. The final version turned in must work on the ACS Linux machines. While you can use Windows, Mac, etc. to begin working with Scala, the code you turn in must be that required for the Linux environment.

Using .scala Files

To compile a file Misc.scala, type this at the shell:

$ scalac Misc.scala
However, we have provided a Makefile to make things easier. So you should just enter
$ make 
and that will compile all the files pertaining to the assignment.

To play with objects, classes, or functions in a source file, you can load them in the REPL like so:

scala> :load Misc.scala

Loading Misc.scala...
import java.io.File
defined module Lines

scala> val zs = Lines.list("Misc.scala")
zs: List[String] = List(import java.io.File, "", ...)
If you already compiled the file, however, you can skip the the :load step. This is recommended.

Useful Links

  • Scala "Hello, World"
  • Scala REPL
  • Google!
  • Problem #1: Manipulating Words (Words.scala)

    (A) 5 points

    Fill in the definition for the function

    def apply(file: String) : Iterator[String] = 
      sys.error("TO BE WRITTEN")
    
    inside object Words. The function should take as input a filename file and return an Iterator over the words in the file converted to lower-case. You can assume that the file located at file will have one word per line. The words returned by apply should be in the same order they occur in the input file.

    When you are done, you should get the following behavior in the REPL (of course, the particular resXX variables that are automatically generated may differ for you):

    scala> val zs = Words("words")
    zs: Iterator[String] = non-empty iterator
    
    scala> zs.next
    res0: String = aditya
    
    scala> zs.next
    res1: String = adivasi
    
    scala> zs.next
    res2: String = adiz
    
    scala> zs.next
    res3: String = 1080
    

    (B) 10 points

    Next, complete the definition of the function

    def groupFreq[A,B](xs: Iterator[A], f: A => B): HashMap[B, Int] =
      sys.error("TO BE WRITTEN")
    
    As indicated by the type, the function takes an xs: Iterator[A] and a grouping function f that converts A values into their B groups. The function should return an immutable HashMap that, for each B group, counts the number of A values that fell into the group.

    When you are done, you should get the following behavior:

    scala> Words.groupFreq(Words.inti.iterator, Words.isEven)
    res9: scala.collection.immutable.HashMap[String,Int] =
            Map(Odd -> 5, Even -> 2)
    

    (C) 5 points

    Write a function

    def sizeFreq(file: String): HashMap[Int, Int] 
    
    that uses groupFreq to compute a HashMap that maps integer word-lengths to the number of words of that length.

    Once you have implemented the function, you should get the following behavior:

    scala> val wf = Words.sizeFreq("words")
    res0: scala.collection.immutable.HashMap[Int,Int] = 
      Map( 5 -> 25099, 10 -> 54640, 24 -> 19, 25 -> 9, 14 -> 19300
         , 20 -> 508, 29 -> 2, 1 -> 53, 6 -> 41696, 28 -> 2, 21 -> 239
         , 9 -> 62608, 13 -> 27947, 2 -> 1271, 45 -> 1, 17 -> 4012
         , 22 -> 103, 27 -> 3, 12 -> 37559, 7 -> 53939, 3 -> 6221
         , 18 -> 2008, 16 -> 7128, 31 -> 1, 11 -> 46474, 26 -> 2, 23 -> 50
         , 8 -> 62332, 30 -> 1, 19 -> 1052, 4 -> 13209, 15 -> 12140)
    
    scala> wf(4)
    res5: Int = 13209
    
    scala> wf(30)
    res3: Int = 1
    
    scala> wf(12)
    res4: Int = 37559
    

    (D) 10 points

    Write a function

    def charFreq(file: String): HashMap[Char, Int] 
    
    that uses groupFreq to compute a HashMap that maps each Char c to the number of times c appears across throughout the entire file file. (Update: Before computing the frequencies, convert all of the words in file to lowercase, so the output map will not contain any bindings for capital letters.) If a character c does not appear anywhere in file, then there should be no mapping for c.

    Once you have implemented the function, you should get the following behavior:

    scala> val cf = Words.charFreq("words")
    res0: scala.collection.immutable.HashMap[Char, Int] = ...
    
    scala> cf('d')
    res3: Int = 155630
    
    scala> cf('z')
    res4: Int = 18107
    
    scala> cf('e')
    res5: Int = 484449
    

    (E) 5 points

    Write a function

    def wordsOfSize(file: String, size: Int) : Iterator[String] 
    
    that returns an iterator over all the words in file that have length equal to size.

    When you are done, you should get the following behavior:

    scala> Words.wordsOfSize("words", 29).toList
    res10: List[String] = List(cyclotrimethylenetrinitramine,
                               trinitrophenylmethylnitramine)
    

    (F) 5 points

    Write a function

    def wordsWithAllVowels(file: String): Iterator[String]
    
    which returns an iterator over all the words in file that contain all five vowels.

    When you are done, you should get the following behavior:

    scala> val vows = Words.wordsWithAllVowels("words")
    vows: Iterator[String] = non-empty iterator
    
    scala> vows.next
    res2: String = abdomino-uterotomy
    
    scala> vows.next
    res3: String = abevacuation
    
    scala> vows.next
    res4: String = abietineous
    
    scala> vows.next
    res5: String = abiogenous
    
    scala> vows.next
    res6: String = aboideau
    
    scala> vows.next
    res7: String = aboideaus
    

    (G) 5 points

    Write a function

    def wordsWithNoVowels(file: String): Iterator[String]
    
    which returns an iterator over all the words in file that do not contain any vowels.

    When you are done, you should get the following behavior:

    scala> Words.wordsWithNoVowels("words").toList
    res0: List[String] = List(1080, 10th, 1st, 2, 2,4,5-t, 2,4-d, 2d, 2nd,
    30-30, 3-d, 3-d, 3d, 3m, 3rd, 4-d, 4gl, 4h, 4th, 5-t, 5th, 6th, 7th,
    8th, 9th, b, b-, b, b911, b/b, bb, bb, bbb, b.b.c., bbc, bbl, bbl, bbl.,
    bbls, bbn, bbs, bbxrt, b.c., b/c, bc, bcbs, bcc, bcd, bcd, bcf, b.ch.,
    bch, bch, bchs, b.c.l., bcl, bcm, bcp, bcpl, bcr, bcs, bcwp, bcws, b.d.,
    b/d, bd, bd, bd., bdc, bdd, bdf, bdft, bdl, bdl., bdls, bdrm, b.d.s.,
    bds, bds, bdt, b/f, bf, b.f., bf, bfd, bfdc, bfhd, bfr, bfs, bft, bg, bg,
    bglr, bgp, bh, bhc, bhd, bhl, bhp, bhp, bht, bk, bk, bk., bkbndr, bkcy,
    bkcy., bkg, bkg., bkgd, bklr, bkpr, bkpt, bks, bks., bkt, b.l., b/l, bl,
    b/l, bl, bl., bld, bldg, bldg., bldr, blds, blf, blk, blk., bll, blm,
    bls, bls, blt, blv, blvd, blvd, bly, blynn, blyth, b.m., bm, bm, bmg,
    bmj, bmp, b,...
    

    (H) 10 points

    Finally, write a function

    def wordsMatchingRegexp(file: String, re: Regex): Iterator[String] 
    
    which loads the words from the file file which match the regular expression re. Recall that we have seen regular expressions when implementing the NanoML lexer in HW4; see this tutorial for information on how to match strings against regular expressions in Scala.

    Once you have implemented the function, you should get the following behavior in the REPL:

    scala> import scala.util.matching.Regex
    
    scala> val re1 = new Regex("""^[a-z].{2}$""")
    re1: scala.util.matching.Regex = ^[a-z].{2}$
    scala> val ws1 = Words.wordsMatchingRegexp("words", re1).take(5).toList
    ws1: List[String] =
           List(a-1, aaa, aaa, aae, aaf)
    
    scala> val re2 = new Regex("""^ca.*{2}e$""")
    re2: scala.util.matching.Regex = ^ca.*{2}e$
    scala> val ws2 = Words.wordsMatchingRegexp("words", re2).take(5).toList
    ws2: List[String] =
           List(caballine, cabane, cabbage, cabbagelike, cabbage-tree)
    
    scala> val re3 = new Regex("""^xYx.*$""")
    scala> val ws3 = Words.wordsMatchingRegexp("words", re3).take(5).toList
    ws3: List[String] =
           List()
    

    Problem #2: Password Cracking (Crack.scala)

    For this problem, you will write a simple dictionary-based password cracker. Primarily, this will be an exercise in string manipulation and data structures. You will be attempting to crack as many passwords as possible in a UNIX password file. The passwords are encrypted using the UNIX crypt() function. For more information, see man passwd and man crypt on UNIX or Linux.

    (A) 5 points

    Write a Scala function

    def transformReverse(w: String) : Iterator[String]
    
    which takes a string and returns an iterator over the original string and the reversal of the original string.

    When you are done, you should get the following behavior (the order of the results is not important):

    scala> Crack.transformReverse("caterpillar").toList
    res1: List[String] = List(caterpillar, rallipretac)
    

    (B) 10 points

    Write a Scala function

    def transformCapitalize(w: String) : Iterator[String] 
    
    which takes a string and returns an iterator over all the possible ways to capitalize the input string.

    When you are done, you should get the following behavior (the order of the results is not important):

    scala> Crack.transformCapitalize("taco").toList
    res4: List[String] =
            List(taco, tacO, taCo, taCO, tAco, tAcO, tACo, tACO,
                 Taco, TacO, TaCo, TaCO, TAco, TAcO, TACo, TACO)
    

    (C) 15 points

    Write a Scala function

    def transformDigits(w:String) : Iterator[String]
    
    which should return an iterator over all possible ways to replace letters with similar looking digits according to the mappings in the table below. This should be done without regard to the capitalization of the input string. However, when a character is not replaced with a digit, the character’s capitalization should be preserved.

    o -> 0i,l -> 1
    z -> 2e -> 3
    a -> 4s -> 5
    b -> 6, 8t -> 7
    g,q -> 9

    When you are done, you should get the following behavior (the order of the results is not important):

    scala> Crack.transformDigits("Bow").toList
    res0: List[String] = List(Bow, B0w, 6ow, 60w, 8ow, 80w)
    

    (D) 10 points

    The passwd file contains lines with the following format (as described in man passwd):

    • account:password:UID:GID:GECOS:directory:shell

    Write a function Entry.apply
    def apply(line: String) : Entry
    
    which takes a string (a line from the file passwd) and returns an instance of the case class Entry (which has fields account, password, UID, GID, GECOS, directory, and shell). The UID and GID fields should be returned as integers.

    Once you have implemented the function, you should get the following behavior:

    scala> val e =
    Entry("checani:IqAFDoIjL2cDs:1:1:Pengpu Checani:/home/checani:/bin/bash")
    e: Entry =
    Entry(checani,IqAFDoIjL2cDs,1,1,Pengpu Checani,/home/checani,/bin/bash)
    
    scala> e.password
    res1: String = IqAFDoIjL2cDs
    
    scala> e.account
    res2: String = checani
    
    Hint: The .split method may be of use.

    (E) 15 points (+ 25 points EXTRA CREDIT)

    The Scala function

    def checkPassword(plain: String, enc: String) : Boolean
    
    takes a plain-text password plain and an encrypted password enc, and returns true if plain encrypts to enc using the function Crypt.crypt (provided in the Java file Crypt.java).
    scala> Crack.checkPassword("asarta","IqAFDoIjL2cDs")
    res3: Boolean = true
    
    scala> Crack.checkPassword("asartaz","IqAFDoIjL2cDs")
    res4: Boolean = false
    

    Use the above to write a function Crack.apply

    def apply(pwdFile: String, wordsFile: String, outFile: String) : Unit 
    
    which takes a password file name pwdFile, a file with a list of words named wordsFile, and a string outFile that denotes the location of an output file. The function apply should generate candidate passwords from the words in wordsFile by applying the various transformations (transformReverse, transformDigits, and transformCapitalize).

    Attempt to crack as many of the passwords in the password file as possible. As passwords are discovered, they should be written to the output file in the format:

        username=pass

    For example:

        checani=asarta

    After each password is cracked, a new line should be written to the output file. The output file should be flushed immediately, because your function will only be allowed to run for a limited amount of time (several minutes).

    Therefore, you should attempt to crack the easiest passwords first. All passwords in the input file are between 6 and 8 characters and can be cracked using words in the provided dictionary along with a combination of the transformations above.

    You will receive 15 points if using our test inputs (which will be larger passwd and words files than the sample ones provided), your solution cracks all passwords which are simply untransformed strings. For the 25 extra credit points, fractional credit will be proportional to the number of additional passwords cracked.

    Once you have implemented the function, you should get the following behavior:

    scala> Crack("passwd", "words", "out")
    ... time passes ...
    ... Hit Ctrl-C ...
    ...
    
    At this point, the file out (because that’s the value passed in for the outFile parameter) should contain something like the following:
    checani=asarta
    root=...
    ...=...
    ...=...
    
    Alternatively, at the UNIX prompt:
    $ make crack
    $ scala Crack passwd words out
    
    As the program runs, it should produce the above output in the file out.