
Tags for Erlang 17 March 2008

Posted by Oliver Mason in erlang.

The ‘ctags’ program creates a ‘tags’ file, which indexes routines/methods/functions in C source files (and in a variety of other languages). The editor vile (my ‘IDE’) can make use of it: you position the cursor on the name of a function, press ^], and the editor jumps to its definition (loading the appropriate source file). Each line of the tags file consists of the function name, the file name, and a search pattern.

Running ctags on an Erlang source file does not produce any usable results, unsurprisingly. However, the file format is so simple that a straightforward shell-script will do:

rm -f tags
for i in *.erl
do
# function heads: a name starting in the first column, followed by "("
grep '^[a-z0-9_]*(' "$i" |
sed 's/^\([^(]*\)(.*$/\1/' | uniq |
awk -v file="$i" '{printf("%s\t%s\t/^%s(/\n", $0, file, $0);}' >> tags
done

(The fields of each line in the tags file are separated by tab characters, written as \t above.)

This script processes all *.erl files in a directory, extracts the function names (a sequence of letters, numbers and underscores beginning in the first column and followed by an open round bracket), and creates the correct lines for the tags file.

Works fine so far.


Progress – More Erlang 8 March 2008

Posted by Oliver Mason in erlang.

Erlang practice is slowly moving along. At a rate of about one tool per day I am currently porting my Java NLP tools over to Erlang. It still takes quite some time to get used to it, especially the flexibility with types.
Everything is a list or a tuple or an atom, so it is sometimes tricky to keep track of what you’re dealing with. Ad-hoc solution: annotate the type.

My implementation of a trie is simply a list, so when looking something up in a trie I pass a list as the first parameter. But what if it’s not a trie-list? Bad things happen within the trie module, far from the place where the actual error is caused. So the exported functions, insert/3 and retrieve/2, now take a tuple rather than a list. The tuple is {trie, List}, where List is what I used to pass previously. Now it is a lot easier to see when the real error is that the wrong kind of element is being passed.
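As a rough illustration of the idea, here is a minimal sketch of such a tagged trie. The module name trie_sketch, the node layout {Char, Value, Children}, the helper functions and the return values {ok, V} / not_found are all assumptions made for the example, not the actual implementation described above. Only the {trie, List} wrapper and the names insert/3 and retrieve/2 come from the text.

```erlang
-module(trie_sketch).
-export([new/0, insert/3, retrieve/2]).

%% Assumed layout: the inner list holds {Char, Value, Children} nodes,
%% where Value is 'none' or {value, V} and Children is again such a list.

new() -> {trie, []}.

%% The exported functions only accept the tagged tuple. Passing a bare
%% list (or anything else) fails right here with a function_clause error.
insert({trie, List}, Key, Value) when is_list(List), is_list(Key), Key =/= [] ->
    {trie, ins(List, Key, Value)}.

retrieve({trie, List}, Key) when is_list(List), is_list(Key) ->
    find(List, Key).

%% Internal helpers work on the bare list.
ins(Nodes, [C | Rest], Value) ->
    {OldValue, Children, Others} = take(C, Nodes),
    case Rest of
        [] -> [{C, {value, Value}, Children} | Others];
        _  -> [{C, OldValue, ins(Children, Rest, Value)} | Others]
    end.

%% Split off the node for character C, if there is one.
take(C, Nodes) ->
    case lists:keytake(C, 1, Nodes) of
        {value, {C, V, Children}, Others} -> {V, Children, Others};
        false -> {none, [], Nodes}
    end.

find(_Nodes, []) ->
    not_found;
find(Nodes, [C | Rest]) ->
    case lists:keyfind(C, 1, Nodes) of
        false -> not_found;
        {C, {value, V}, _} when Rest =:= [] -> {ok, V};
        {C, none, _} when Rest =:= [] -> not_found;
        {C, _, Children} -> find(Children, Rest)
    end.
```

Calling trie_sketch:retrieve([], "cat") with a bare list now fails immediately in the exported function, instead of somewhere deep inside the helpers.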

Learning Erlang 2 March 2008

Posted by Oliver Mason in erlang.

After messing about with Scala for a while I found the syntax rather confusing. It’s just too much. Somehow I then had a look at Erlang, can’t quite remember why. But Erlang promises to be robust and easily concurrent, and it is an industrial strength language, being used for telephony programming at Ericsson. And it looks remarkably like Prolog. Now I’m glad I did do Prolog at uni all those years ago, even though it seemed like a waste of time at the time.

Erlang is not exactly famous for its string processing facilities: a string is just a list of numbers. But it also means you can use list processing stuff for dealing with strings. By way of learning I am now porting/rewriting my corpus access tool in Erlang. That’s the fourth language already, after C, C++, and Java. The Scala version didn’t get started properly, and I think I’ll abandon that now.
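A small sketch of what "a string is just a list of numbers" means in practice. The module and function names here are made up for the illustration; only the standard lists module is used.

```erlang
-module(string_sketch).
-export([is_plain_list/0, vowels/1, backwards/1]).

%% A string literal is the same term as the list of its character codes.
is_plain_list() ->
    "abc" =:= [97, 98, 99].

%% A list comprehension filters characters like any other list elements.
vowels(String) ->
    [C || C <- String, lists:member(C, "aeiou")].

%% Ordinary list functions work on strings unchanged.
backwards(String) ->
    lists:reverse(String).
```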

There are some pitfalls to bear in mind when working with Erlang, mainly to do with getting used to the way it works. A very useful feature is that functions can return tuples, e.g. {ok, Result} or {error, Reason}. One only needs a consistent convention for this, so that you don’t end up with some functions returning a tuple while others don’t. Or, if you do, keep in mind which function returns what.
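The convention might look something like the following sketch. The module result_sketch and its functions parse_count/1 and double/1 are hypothetical examples, as is the shape of the error reason. string:to_integer/1 is the standard library call.

```erlang
-module(result_sketch).
-export([parse_count/1, double/1]).

%% Every exported function follows the same convention:
%% {ok, Result} on success, {error, Reason} on failure.
parse_count(String) ->
    case string:to_integer(String) of
        {N, []} when is_integer(N), N >= 0 ->
            {ok, N};
        _Other ->
            %% covers trailing characters, negative numbers and no number at all
            {error, {not_a_count, String}}
    end.

%% A caller can match on the tag and pass errors through untouched.
double(String) ->
    case parse_count(String) of
        {ok, N} -> {ok, 2 * N};
        {error, _Reason} = Error -> Error
    end.
```

Because both functions return the same two shapes, a caller never has to remember whether it gets a bare value or a tuple back.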

Sending messages between processes: Creating processes is dead easy, and I hope it will really mean speed when I run the system on the university’s multi-core machine. Just be careful to remember what messages each process accepts, and don’t include a catch-all one. Especially intermediate processes which get a request from some process, and send one out to another process can sometimes get muddled up with their messages… And don’t open files in ‘raw’ mode without knowing what it means!
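A minimal sketch of a process with an explicit, small set of accepted messages and no catch-all clause. The module msg_sketch, the message shapes {add, N}, {get, From, Ref} and stop, and the one-second timeout are all assumptions chosen for the example.

```erlang
-module(msg_sketch).
-export([start/0, add/2, total/1, stop/1]).

start() ->
    spawn(fun() -> loop(0) end).

%% Accepted messages: {add, N}, {get, From, Ref} and stop.
%% There is deliberately no catch-all clause: an unexpected message
%% stays in the mailbox instead of being silently swallowed.
loop(Total) ->
    receive
        {add, N} when is_integer(N) ->
            loop(Total + N);
        {get, From, Ref} ->
            From ! {Ref, Total},
            loop(Total);
        stop ->
            ok
    end.

add(Pid, N) ->
    Pid ! {add, N},
    ok.

%% The unique reference ties the reply to this particular request,
%% so replies from different requests cannot get muddled up.
total(Pid) ->
    Ref = make_ref(),
    Pid ! {get, self(), Ref},
    receive
        {Ref, Total} -> {ok, Total}
    after 1000 ->
        {error, timeout}
    end.

stop(Pid) ->
    Pid ! stop,
    ok.
```

Wrapping the sends in functions such as add/2 and total/1 keeps the list of accepted messages in one module, which makes it easier to remember what each process understands.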

So far it is going well. After about a week I have the basic concordancing functionality ported. Of course most of that time was spent debugging and wondering why it didn’t work the way I wanted it. But the files are a lot shorter, and perhaps Erlang does indeed allow a tenfold increase in productivity. One thing I’m unlikely to port is the indexing stuff: that works alright in Java and is a one-off procedure anyway.

I cannot make any judgments on speed: no profiling yet. But once I can do a few more things with the system, I will test-drive it on the Blue Bear computer. Each corpus will be in its own process, so processing them in parallel looks promising.