Thursday, December 29, 2011

Common escape characters


\a

Beep

\b

Backspace

\c

"Control" character. \cD = CTRL-D

\e

Escape

\f

Form feed

\l

Make the next letter lowercase

\n

New line, return.

\r

Carriage return.

\t

Tab.

\u

Make the next letter uppercase

\x

Enables hex numbers

\v

Vertical tab

A simple problem in R

I have recently started looking at the R environment for statistical analysis.
After one day, I am fascinated and upset.
The fascination comes from the insanely large number of abilities you are given.
The frustration is from the same source.
Namely, there is no single way to do something.
Also, the naming sucks and it is hard to get around with no empirical basis.
Anyway, here is a sample problem that I happened to have to solve along the way.

We have a small sample survey of stress levels at work.
The sample size is 30 and the categories used for classification are "none", "somewhat" and "very".
Stressful, that is.
We need to figure out the frequency distribution of this data.
The input given is a comma separated file with the 30 values in one line.
Say it is called "stress.csv", located in the pwd.
Here is my example R session:

stress = read.table("stress.csv",header=FALSE)
stress = t(stress)
signif((table(stress)/length(stress)*100),3)

Notice that since we have the initial data in one line, we have to transpose the numbers before we start working with it.
We then use the table function to count occurrences of unique elements.
The last statement looks awkward, and may be cleaned up.
I do, however, like nesting things.
Here is a simplified version:

#Count unique elements
occurrences = table(stress)
#Count fractional frequency of each element                      
frequency = occurrences/length(stress)
#Convert to percent value           
frequency_percent = frequency * 100
#Format with three significant digits              
formatted_output = signif(frequency_percent, 3)  
Hope this clarifies things a bit.
Printing formatted_output gives:
stress
    none somewhat     very 
    20.0     46.7     33.3 
Quick and dirty, but seems to do the job.

Wednesday, November 16, 2011

A morning of building

I am a university student.
And my computer department's software policy is terrible.
For one, their VIM was not compiled with GNU support.
So you are stuck with a fairly awkward terminal VIM.
I mean, of course, VIM is supposed to be all keyboard-rocking and everything but...
...isn't is nice to sometimes just grab those code lines with a mouse select?
Anyways, I set out on a quest for getting gVIM working on my account.
Certainly, I do not have sudo permissions, so I had to get the source.
Ok, now what? I have never built anything outside of an IDE before.
make?, was it?
Alright,
cd vim73/src
make
#big, scary, make log here
./vim -g
Awesome!
But as soon as I try to turn on syntax highlighting, I notice that everything is not this great.
I do not have write access to the /usr/bin.
After a few minutes of searching I ended up doing this:
./configure --prefix=$HOME/vim/
make install
#bigger, scarier, make install log here
cd ~/vim/bin/vim -g
Essentially, you tell make install to install everything in the prefix directory.
$HOME is the home directory environmental variable of course.
To be cool, I have also edited by shell startup file (.bash_profile for bash).
alias vim='$HOME/vim/bin/vim'
alias gvim='$HOME/vim/bin/vim -g'
Let the magic begin!

Tuesday, November 15, 2011

A few words on syntax

I have wanted to get some really easy out-of-the-box setup to color the syntax in the code.
After all, most of the posts on this blog are about code.
Google provides a nice easy solution: google-code-prettify.
Here is a step by step:
1) Include the following files from the Google SVN repo:
<link href=
    "http://google-code-prettify.googlecode.com
    /svn/trunk/src/prettify.css" 
    type="text/css" rel="stylesheet" />
<script type="text/javascript" 
    src="http://google-code-prettify.googlecode.com
    /svn/trunk/src/prettify.js"></script>
2) Add an on-load to the body of the html:
onload="prettyPrint()"
3) Any time you write code, use the following tags:
<pre class="prettyprint lang-html">
    <!-- Hey, look, this is a nice little comment. 
    I should have more of these in my code! -->
</pre>
Note that the new dynamic layouts do not allow you to modify HTML directly, so I ended up going with the old layouts.
Also, this tool by Dan Cederholm has helped me enormously.

Java iterator for a binary search tree

Java has a commonly used Iterator interface.
It is no-where as cool as implicit python iterators, but it may be useful every now and then.
It is usually used like this:
Iterator e = container.iterator();  
while (e.hasNext())  
{  
    System.out.println(e.next());
}  
How do we implement a binary search tree iterator?
Look at the code here for binary search tree iterative traversal.
A simplest solution would be to "print" a tree into a storage array and then allow the iterator to operate on this array.
Alternatively (as shown here), we can use the iterative method developed earlier to maintain iterator state.
A basic idea for this iterator is saving a pointer (I call it a cursor) to a most recently found element.
Since the iterator has to access a lot of the tree's internals, I like to declare it as a private class inside my binary search tree implementation.
Such an approach ensures that variables such as tree.root are not seen from the outside.
Here is a simple implementation of an iterator that allows in-order traversal:
private class BSTIterator implements Iterator
{
    Node root, cursor;
    Stack  iteratorStack;

    public BSTIterator (BSTNode root)
    {
        this.root = root;
        this.cursor = root;
        this.iteratorStack = new Stack ();
    }

    public boolean hasNext()
    {
        return (!iteratorStack.empty() || cursor != null);
    }

    public Comparable next()
    {
        Comparable nextNodeValue;
        while (cursor != null)
        {
            iteratorStack.push(cursor);
            cursor = cursor.leftChild;
        }
        cursor = iteratorStack.pop();
        nextNodeValue = cursor.key;
        cursor = cursor.rightChild;
        return nextNodeValue;
    }
}
The things to notice here are that we just broke up the iterative method into separate chunks inside this class.
All of our local variables moved to class globals.
The while condition became the hasNext() boolean.
The body of the iterative method was modified for next().
As a result, we get nice, clean, multipurpose code.

Iterative traversal of a binary search tree

A binary tree is a simple yet beautiful structure.
One of its most compelling features is sorting, which directly relies on the binary search tree property.
Here is an example of printing a sorted tree:
void printBinaryTreeInOrder (Node cursor)
{
    printBinaryTreeInOrder (cursor.leftChild);
    print (cursor.value + ", ");
    printBinaryTreeInOrder (cursor.rightChild);
}
Nice and easy.
However, it is a bit harder to use an iterative approach to the same problem.
The solution is essentially keeping track of your own call stack while the method executes.
A few words aside.
In a classical von Neumann machine, data and program instructions can be stored in same memory.
A working recursive function allocates stack memory (which is most likely in RAM) every time a new call is made.
For an iterative solution, we will also keep track of the visited nodes in a stack.
So to really understand how the code below works, write out a simple tree on a piece of paper and trace the order in which each node is visited by the recursive function.
Here is the little piece of iterative magic:
void printBinaryTreeInOrder
{
    Node cursor = root;
    Stack  iteratorStack = new Stack  ();
    while (iteratorStack.notEmpty() || cursor)
    {
        if (cursor)
        {
            iteratorStack.push(cursor);
            cursor = cursor.leftChild;
        }
        else
        {
            cursor = iteratorStack.pop();
            print(cursor.key);
            cursor = cursor.rightChild;
        }
    }
}
In essence, we walk the tree from the root.
First, we go all the way to the left-est node, pushing all the nodes we passed on the way there onto the stack.
The node at the far left is the smallest, that's the one we want to print first.
When we call for the left child again, we get null in return.
We go on to the else clause, grabbing the last valid node from the stack and printing it.
Afterwards, we shift to the right node of the cursor and happily chug along.