Tuesday, April 1, 2014

Week 11: Sorting and Efficiency

Hi everyone, this slog (which is the last slog ever!) will be about sorting and efficiency. 

In the past few weeks in class, we learned about a way to express the efficiency of algorithms based on their runtime, using something called Big-O notation.  In simple terms, if we say a function is "Big-O of n2" that means that the function n2 is an upper bound for the runtime of our function on input of size n. This notation simplifies things considerably, as it enables us to only consider the runtime of a function when the size of the input (for example, a list) is large. As well, it allows us to group functions with similar runtimes together. For example, say that the runtime of a certain function is n2 + 14n + 63, and the runtime of another is 3n2. When n (the size of the input) becomes very large, the lower terms (14n + 63) will be irrelevant compared to the highest order term. Thus we can ignore them for when the list is really large. We also know that since these runtimes will only differ by a factor of 3 when n is very large, their runtimes will be similar enough that we want to be able to group them together. Thus using Big-O notation, we say both functions are O(n2). This simplification will allow us to classify functions by their general behaviour, and give us a better understanding of how efficient they are. 

In CSC108, the three sorting algorithms we looked at (bubble sort, selection sort, insertion sort) were all O(n2). In 148, since we are able to divide the list in half and sort it recursively, the runtime of quick sort (in the average case) and merge sort is O(nlogn). This is because we are dividing the list in half everytime (which is logn steps) and at each step, we make n comparisons. O(nlogn) is much quicker than O(n2) for large values of n. For example, if we have a list of 10,000 elements, the runtime of quick sort will be about 13,000 compared to selection sort, which will be 100 million (relatively speaking). This shows the efficiency of quick sort, and demonstrates the usefulness of Big-O in knowing how long a certain algorithm will take to execute.

Good luck on the final exam everyone!

Tuesday, March 25, 2014

Week 10: Assignment 2 (Part 2) and More Sorting

This week, the last assignment for the semester was completed, and we learned more about different kinds of sorting algorithms. 

Assignment 2 part 2: I liked this part much more than the first part of the assignment, because this time we actually got to write code for functions. Although I still enjoyed the design portion of this assignment (part 1), I generally enjoy the problem solving part of computer science more. My partner and I split up the 4 functions between us, so I was doing is_regex and regex_match. The weekend before it was due, I had spent A LOT of time on is_regex, going in the wrong direction. The approach I had taken was kind of an 'obstacle course' coding method: there were all these conditions that if the input didn't meet, the function would return False. At the end, if the input had survived that long then it must be a regex and the function would return True. Unfortunately, this proved nearly impossible, at the point where we had to do recursion on the inner regexes. On Monday in class we were given a hint, which was to take the opposite approach: check to make sure the string satisfies basic regex conditions, and then call is_regex on the inner (possible) regexes. This way, if the brackets/dots/bars were messed up, the recursive call would simply return False at some point. This took my code from over 40 lines and turned it into 10 clean concise ones. I feel that this technique is really important for coding when you're checking if input is of the right format: check the basics, and then assume it is of the right format and perform operations on it. If it isn't, eventually it won't be able to fake it anymore.

Sorting algorithms: This week we learned about two sorts: quick_sort and merge_sort. The interesting thing about quick_sort is that its worst case runtime is actually when the list is already sorted (or sorted in reverse). We can change this based on our choice of pivot however, in which case quick_sort almost runs as quickly as merge_sort. Merge sort is the fastest sort we've encountered so far (except for the built-in sort), with O(n lg(n)) run time. It is interesting to see how much recursion improves the run time of sorting algorithms: when these new sorts were tested against the ones we used in 108 (before we knew about recursion), they were significantly faster. The reason recursion improves it so much is that it allows you to repeatedly split the list in half and sort each half, using recursion. This splitting the list in half is what accounts for the log (base 2) n iterations of the algorithm. It will be interesting to see what other sorting algorithms there are that we haven't learned yet, and to see if there is a fastest one out there.

Good luck on the midterm everyone!


Wednesday, March 19, 2014

Week 9 - Sorting, Big-Oh, and e3

A bunch of different things happened this week: e3 was due, we learned about sorting and Big-Oh in class, and we played around with BSTs in the lab. 

In the lab, one of the main concepts to understand this week was defining a function inside another function which does most of the work you want your outer function to do. We used this for BST and BSTNodes. We were asked to write a method count_less for BST, inside which we had to define a helper function. The purpose of the helper function was to do the recursion that BST couldn't do. The helper function had the same function as BST, the only difference was that it took in a BSTNode as a parameter. That way this function could go through the nodes in BST recursively and count the nodes less than a given value. BST couldn't do this recursion because it is a method for a tree not a node, so the helper function did all the work and count_less just called the helper function to return its value. This is an important concept that I think we will see in future labs and programming.

In class we learned about Big-Oh, however just a few weeks before I had learned about it in CSC165. It is a good way of measuring a function's run time, but an even better way is Big-Theta (which I just recently learned about), so I hope we cover that in 148 before the end of the course. The advantage of Big-Theta is that it gives a function that provides both a lower and upper bound (based on your choice of constant) for the runtime.

E3 was really fun to do. I find these exercises (and assignments) are just the right amount of difficulty: they are challenging but not impossible, which makes them interesting. I found the hint in the handout for the second function to be misleading and it made me go in a direction that was way more complicated than needed (all I'll say is that too many tuples were involved). However when I tried not to follow the handout's hint and do it my own way, I found there was a much more straightforward and intuitive way to write the functions. I wrote a helper function to determine the depth of a child, and then my main function would recursively check which child was deeper and then go find the sum there. Although it was different from their suggestion it still passed all the tests!
Good luck with A2 part 2 everyone!

Tuesday, March 11, 2014

Week 8 - Assignment 2 (Part 1) and Linked Lists

The raw results of Assignment 1 have finally been posted, and I'm excited because that means our marks are probably coming soon! I seem to have done well on the tests, so I'm optimistic about my mark. 
The first midterm marks were also posted, and I didn't do as well as I thought on one of the questions. When I went to pick it up, I realized I had made an incredibly stupid error. I had tried to index into a Tree and also tried to check if the Tree's children were None, when the code given above had clearly stated they would be [ ] if they were empty! I realized this right away after looking over my test and was annoyed, however now I know to read the questions carefully next time!
This week we had to submit Assignment 2. Assignment 2 is significantly different from our first assignment - the first one focused more on the math/algorithm part of programming, whereas this one is getting us to practice writing our own code without any guidance, and is focused on design. In class we learned about wrapper classes, and these were used quite a bit in Assignment 2. I found the handout to not provide much guidance for what we were supposed to write, but I looked at last semester's handout for tips to get started, which helped. The first part of Assignment 2 wasn't that challenging - I think the second part will be though. 
Also, this week's lab went much better than last week's. I feel I'm starting to get a handle on Linked Lists. My partner and I finished all the required steps early (a first!) and got to do the extra functions. Those were more difficult, but were still interesting and fun to complete. I find we are starting to move more quickly in 148, and it seems like you need a very good grasp of recursion to be able to understand everything from here on out. I have a decent grasp - I understand the concept and how it is applied, but I can't always trace or visualize the execution of a recursive function in my head. This is something I think is really important to understand recursion (and the rest of the course) so I'm going to practice this a lot in preparation for the coming midterm. 
Good luck on exercise 3 everyone and have a great week!

Monday, March 3, 2014

Week 7 - Recursion

We have been working with recursion for quite a few weeks, and I'd like to think that by now I understand it enough to be able to explain it. Recursion is a technique used in computer science where the solution to a certain problem depends on the solution to smaller problems of the same kind. A function that uses this technique will call itself, thus simplifying the problem into smaller problems with each call until it reaches some kind of base case. At this point, some value is obtained, and the function calls are executed in reverse order. 

Recursion can be used in several different situations, but there are a few main ones. One is when you have a big problem and you need to know how to solve smaller problems of the same kind in order to solve the big problem. This is what we did in Assignment 1. When we had 8 cheeses and 4 stools, in order to move those 8 cheeses we found an i value (3) that told us how many cheeses to move (5). So we had to move 5 cheeses from one stool to another stool - but in order to do that we had to first move (5 - i) cheeses from one stool to another stool, and in order to do that...until finally, all we had to do was move one cheese from one stool to another. This problem of moving 8 cheeses was solved simply by knowing how to move one cheese, and repeating that 33 times. 

Another situation is when you don't know how many items are in a list or how many times you'll have to repeat a process. An example of this is nested lists. A lot of the functions we wrote in class and also most of the functions in the Tree Class that we developed involved nested lists. Because we don't know to what depth the list is nested, we cannot use a for loop (where we would have to specify the number of items to go through.) A while loop cannot be used for similar reasons. The only way to solve this problem is to continue to call the function indefinitely, and to tell it when you want it to stop - this is called the base case. A recursive function always needs a base case and therefore usually needs an if statement; otherwise it will never stop calling the function.

Recursion is used a lot in computer science. Two simple examples of recursion are factorials and the Fibonacci sequence. A factorial (n!) is defined as (n x n-1!). This is a recursive definition, and the execution of this function would continue until n = 1, at which point it would execute the enclosing functions in reverse order. The Fibonacci sequence is a sequence where the next number is obtained by adding the two previous numbers (and the sequence starts with 1, 1). We could define a function for the nth term: fib(n) = fib(n-1) + fib(n-2). To find this, we would need to know the two previous terms, and hence their two previous terms and so forth, so we'll end up having to know the first two terms of the list. Both these functions are examples of the first use of recursion mentioned above.

Recursion is extremely useful. I have really enjoyed learning recursion so far and I hope that I will continue to enjoy it!

Have a good week everyone!

Sunday, February 23, 2014

Week 6 - Assignment 1 & Trees

Assignment 1 is done! 

The week before reading week was a busy one: finishing A1, doing two other midterms, and in the meantime trying to process the new information we were learning about trees in class. Out of all of these tasks, I found Assignment 1 the most enjoyable. I had already completed the bulk of the assignment and in the few days before it was due, I mostly just needed to write the code for the tour of four stools, and (my favourite...not): docstrings!

The hard part about writing the tour of four stools function for me was keeping track of all the different stools throughout the recursion. I traced out the code on several pieces of paper before I finally figured out how to write it. The hardest part was figuring out which intermediate to use as which parameter in the function - although in the end it didn't really make a difference. My function worked for up to 20 cheeses (and is probably still calculating the result for 100) so I hope I'll do well on this assignment. Speaking of which, I have been anxiously checking MarkUs every day for the past week, but expectedly, no marks up yet. 

The part of A1 I found the most difficult was the function where you asked the user for input, to play the game manually from the console. This is another example of the big leap from 108 to 148: in 108, we were supposed to assume all user input would be of the correct format. It took a while and several if statements, but eventually I got a functioning way to play the game. 

In class we learned about trees. I found trees to be the most difficult use of recursion to understand that we've seen so far, because we were creating a class that was defined recursively. I'm interested to see what we will use trees for, as I've heard that they are used a lot in more advanced programming. 

Now that that's all done, time to study for the midterm!
Good luck everyone!

Monday, February 10, 2014

Week 5 - Namespaces & Assignment 1

Hello again everyone,

Assignment 1 is due in a few days and Piazza has been overflowing with questions! I had originally intended to keep up with all the activity on Piazza from the start of the course, but seeing as there are 165 posts for A1, I don't think I'll be able to anymore. 

This week in class was a mash-up of Python namespaces and tips for A1. The Python namespace stuff was interesting: it seemed very confusing at first but once I understood it, I realized it was easy to follow if you trace the code carefully. One thing I find interesting about this and other topics in 148 is how much there is to know about Python that we didn't learn in 108. It seems like every week, there is some new topic of which I wasn't even aware based on what I learned in 108, and it is amazing to see how much I overlooked last semester. 

I finally started Assignment 1 last week, and was excited to get started because I was really eager to solve the game. I've always liked math puzzles and this one was definitely a tough one. There were two parts to this assignment: 1) writing the code, and 2) solving the problem. Each one presented its own difficulty: it took me a while to really understand the given code enough to know exactly how to implement the methods. Once I started though and realized I was going to use a dictionary for the TOAHModel, everything pretty much fell into place. The second part took me longer: I spent a long time trying to find a pattern or a formula for the "i" value that would give the least number of moves, until eventually I realized you could just obtain it with the code. I think the most important thing that I learned was not to divide this assignment into Computer Science and Math, but to use them hand in hand. To complete this assignment, I'm going to try putting that into practice.

Good luck with Assignment 1 everyone!