jump to navigation

Replacing a stack with concurrency 23 April 2008

Posted by Oliver Mason in erlang, NLP, programming.

For some language processing task I needed a reasonably powerful parser (a program to identify the syntactic structure of a sentence). So I dug out my copy of Winograd (1983) (“Language as a Cognitive Process”) and set about implementing an Augmented Transition Network parser in Erlang.

Now, the first thing you learn about natural language is that it is full of ambiguities, and so there will always be several alternatives available, several possible paths through the network which defines the grammar. The traditional solution is to dump all the alternatives on a stack, and look at them when the current path has been finished with. You can either go depth-first, where you complete the current path before you get the next one off the stack, or breadth-first, where you advance all paths by one step at a time, kind of pseudo-parallel.

Having to deal with a stack is tedious, as you need to keep track of the current configuration: which network are you at, what node, what position in the sentence, etc. But then, it occurred to me, there’s an easier way to do it (at least it’s easier in Erlang!): every time you come to a point where you have multiple alternatives, you spawn a new process and pursue all of them in parallel.

The only overhead you need is a loop which keeps track of all the processes currently running. This loop receives the results of successful paths, and gets notified of unsuccessful ones (where the process terminates without having found a valid structure). No need for a stack, and hopefully very efficient processing on multi-core machines as a free side-effect.

I’m still amazed how easy it was to implement. I wouldn’t have fancied doing that in Java or even C. For my test sentences I had about 8 to 10 processes running in parallel most of the time, but it depends on the size of the grammar and the length of the sentence really. What I liked about this was that it seemed the natural way to do in Erlang, where working with processes is just so easy.

And also, another nail in the coffin for the claim that you can’t use Erlang for handling texts easily!


Safe receive in Erlang 20 April 2008

Posted by Oliver Mason in erlang.
1 comment so far

I’m currently working on a system which involves a cascade of processes, and ran into some problems with communication between them. Each of the ‘service’ processes spends most of the time in a receive loop, and then performs some action. Communication is done via a simple rpc function that sends off a message and then receives the reply and returns.

This is fairly standard, it seems, and is explained in Armstrong’s Programming Erlang (p.145). However, a problem arises if the process (B) sending a message to (C) also receives a request from (A):

A => B <= C

The message from (A) interferes with the response from (C). Chaos ensues.

A possible solution makes rpc more robust, but has a little overhead: any replies sent include the process id of the replying process. The rpc function can then just react to messages coming from the process it called.

This could look like:

rpc(Pid, Msg, Arg) ->
Pid ! {self(), Msg, Arg},
{Pid, Reply} ->

Problem solved!

Dealing in meteorites 9 April 2008

Posted by Oliver Mason in misc.
add a comment

While clearing out a decade’s worth of paper from my office I came across an article in the Feb 2003 issue of the university’s in-house paper, Buzz. In one article, the following extract struck me: … artists had acquired a meteorite from a meteorite dealer in Scotland.

Several questions come to mind: Can you make a living as a meteorite dealer? I guess there are more meteorites hitting Earth as one would think, not all of course big enough to make (no pun intended) an impact. And why Scotland? Is Scotland especially prone to being hit by meteorites?

Thanks to the web, the first question can be supplemented by a useful snippet of information: Over the whole surface area of Earth, that translates to 18,000 to 84,000 meteorites bigger than 10 grams per year. But of course 70% of that end up in the sea, and most of them never make it through the atmosphere in the first place. I also don’t want to speculate how much demand there is for meteorites, other than from conceptual artists and perhaps astrophysicists.

Erlang on the Tipping Point? 2 April 2008

Posted by Oliver Mason in erlang.
add a comment

Yariv wants to rename the Erlang web framework ErlyWeb to ‘Erlang on Rails’, as he believes Erlang is on the tipping point, and such a name change would swing more programmers towards adopting Erlang. Mmmh, I should have read the comments rather than just the article via RSS: it appears to be an April Fool. But that doesn’t change the original point of this post.

While that might be a good thing to do (April Fool or not), for me the more important move would be to unify/restructure the various libraries and their documentation. I find it still tricky to find details about some functionality because I don’t remember whether it’s in stdlib or kernel, and the web-based documentation browser makes it very hard to navigate. It also seems to be rather random to the average programmer without too much insight into the underlying implementation details.

But otherwise I agree with him, in that I too think Erlang might be going somewhere. I will definitely stick with it, and hope for the best that the libraries will sort themselves out. Though I’m not too hopeful because of backwards compatibility issues…