Power supply woes 9 May 2008
Posted by ojmason in Apple.
Yesterday the power supply of my MacBook started playing up: intermittently it would either stop charging the battery or show up as disconnected (i.e. fail completely). The cable from the brick to the laptop looks rather ‘molten’ near the brick end, and by bending it and winding it around the two fold-out hooks I could get it to remain stable. But today it suddenly looks brown as well; it obviously can’t cope with the heat.
On the web I’ve read that the power supply (which provides 60W) was originally designed for 50W only, which would explain why it gets so hot, and why lots of other people have had problems. With my Sony Vaio I never had any issues with the power supply, apart from the plug becoming wobbly after the kids ran into the cable a couple of times (which is an advantage of the MagSafe plug!).
I also read a recommendation to use a MacBook Pro supply, which has an 85W rating and is supposed to stay cooler. So when I noticed the alarmingly brown/black cable I went straight to the local Apple Store in town and asked for one of those. But the ‘geniuses’ said it would mess up the battery by overcharging it, so I went for a 60W model again to be on the safe side. Of course my MacBook is 22 months old and out of warranty, so I had to shell out the £60 for the new power brick. The genius also said “this should not happen”, and looked at me as if I had subjected the cable to a barbecue or something. However, I never even wound it up on the hooks, as the cable seems so feeble. Stylish looks, but rather impractical. I’ll have to think of some cooling solution, I guess.
I’ve kept the dodgy supply to repair the cable at some point, and when I do so I’ll post some photos here. One thing that I find amazing is how grubby the cable looks after almost two years of constant use.
Replacing a stack with concurrency 23 April 2008
Posted by ojmason in NLP, erlang, programming.
For some language processing task I needed a reasonably powerful parser (a program to identify the syntactic structure of a sentence). So I dug out my copy of Winograd (1983) (“Language as a Cognitive Process”) and set about implementing an Augmented Transition Network parser in Erlang.
Now, the first thing you learn about natural language is that it is full of ambiguities, and so there will always be several alternatives available, several possible paths through the network which defines the grammar. The traditional solution is to dump all the alternatives on a stack, and look at them when the current path has been finished with. You can either go depth-first, where you complete the current path before you get the next one off the stack, or breadth-first, where you advance all paths by one step at a time, kind of pseudo-parallel.
Having to deal with a stack is tedious, as you need to keep track of the current configuration: which network are you at, what node, what position in the sentence, etc. But then, it occurred to me, there’s an easier way to do it (at least it’s easier in Erlang!): every time you come to a point where you have multiple alternatives, you spawn a new process and pursue all of them in parallel.
The only overhead you need is a loop which keeps track of all the processes currently running. This loop receives the results of successful paths, and gets notified of unsuccessful ones (where the process terminates without having found a valid structure). No need for a stack, and hopefully very efficient processing on multi-core machines as a free side-effect.
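To make the idea concrete, here is a minimal sketch. It is not the ATN parser itself: as a stand-in for an ambiguous grammar it splits a string into dictionary words in every possible way. The module name `par_paths`, the function `segment/2` and the toy problem are all my own assumptions for illustration. One design choice to note: in this sketch the collecting loop does the spawning on behalf of a worker that has hit a branch point, which keeps the count of outstanding processes simple.

```erlang
-module(par_paths).
-export([segment/2]).

%% Find every way to split Text into words from Dict (a list of
%% non-empty strings), pursuing each alternative in its own process.
segment(Text, Dict) ->
    Ref = make_ref(),                       % tags messages of this run
    start_worker(self(), Ref, Dict, {Text, []}),
    loop(Ref, Dict, 1, []).

%% Pending counts the workers that have not reported back yet.
loop(_Ref, _Dict, 0, Results) ->
    lists:sort(Results);
loop(Ref, Dict, Pending, Results) ->
    receive
        {Ref, {success, Words}} ->          % a path reached the end
            loop(Ref, Dict, Pending - 1, [Words | Results]);
        {Ref, failure} ->                   % a dead end
            loop(Ref, Dict, Pending - 1, Results);
        {Ref, {alternatives, States}} ->    % a branch point
            Self = self(),
            lists:foreach(
              fun(S) -> start_worker(Self, Ref, Dict, S) end, States),
            loop(Ref, Dict, Pending - 1 + length(States), Results)
    end.

%% Each worker advances one path by one step and reports the outcome.
start_worker(Collector, Ref, Dict, State) ->
    spawn(fun() -> Collector ! {Ref, step(Dict, State)} end).

step(_Dict, {[], Acc}) ->
    {success, lists:reverse(Acc)};
step(Dict, {Rest, Acc}) ->
    case [{lists:nthtail(length(W), Rest), [W | Acc]}
          || W <- Dict, lists:prefix(W, Rest)] of
        []     -> failure;
        States -> {alternatives, States}
    end.
```

Calling `par_paths:segment("ab", ["a", "b", "ab"])` explores both readings in parallel and returns `[["a","b"],["ab"]]`. There is no stack anywhere: the "configuration" of each path simply lives in the arguments of its process.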
I’m still amazed how easy it was to implement. I wouldn’t have fancied doing that in Java or even C. For my test sentences I had about 8 to 10 processes running in parallel most of the time, but it really depends on the size of the grammar and the length of the sentence. What I liked about this was that it seemed the natural way to do it in Erlang, where working with processes is just so easy.
And also, another nail in the coffin for the claim that you can’t use Erlang for handling texts easily!
Safe receive in Erlang 20 April 2008
Posted by ojmason in erlang.
I’m currently working on a system which involves a cascade of processes, and ran into some problems with communication between them. Each of the ‘service’ processes spends most of the time in a receive loop, and then performs some action. Communication is done via a simple rpc function that sends off a message and then receives the reply and returns.
This is fairly standard, it seems, and is explained in Armstrong’s Programming Erlang (p.145). However, a problem arises if the process (B) sending a message to (C) also receives a request from (A):
A => B <= C
The message from (A) interferes with the response from (C). Chaos ensues.
A possible solution makes rpc more robust, but has a little overhead: any replies sent include the process id of the replying process. The rpc function can then just react to messages coming from the process it called.
This could look like:
rpc(Pid, Msg, Arg) ->
    Pid ! {self(), Msg, Arg},
    receive
        {Pid, Reply} ->
            Reply
    end.
Problem solved!
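For this to work the service side has to play along and put its own pid into every reply. A minimal sketch of such a pair follows; the module name `safe_rpc` and the toy `double` service are made up for illustration and are not part of my actual system.

```erlang
-module(safe_rpc).
-export([rpc/3, start_doubler/0]).

%% Send a request and wait for the reply from that very process.
%% Pid is already bound, so {Pid, Reply} only matches its replies.
rpc(Pid, Msg, Arg) ->
    Pid ! {self(), Msg, Arg},
    receive
        {Pid, Reply} ->
            Reply
    end.

%% A toy service: doubles numbers and tags each reply with self().
start_doubler() ->
    spawn(fun doubler_loop/0).

doubler_loop() ->
    receive
        {From, double, N} ->
            From ! {self(), 2 * N},
            doubler_loop()
    end.
```

With `Pid = safe_rpc:start_doubler()`, a stray message such as `{some_atom, stray}` already sitting in the caller's mailbox is simply skipped by the selective receive, and `safe_rpc:rpc(Pid, double, 21)` still returns 42. Note that this sketch waits forever if the service dies; an `after` timeout in the receive would be one way round that.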
Dealing in meteorites 9 April 2008
Posted by ojmason in misc.
While clearing out a decade’s worth of paper from my office I came across an article in the Feb 2003 issue of the university’s in-house paper, Buzz. In one article, the following extract struck me: … artists had acquired a meteorite from a meteorite dealer in Scotland.
Several questions come to mind: Can you make a living as a meteorite dealer? I guess there are more meteorites hitting Earth than one would think, though of course not all are big enough to make (no pun intended) an impact. And why Scotland? Is Scotland especially prone to being hit by meteorites?
Thanks to the web, the first question can be supplemented by a useful snippet of information: Over the whole surface area of Earth, that translates to 18,000 to 84,000 meteorites bigger than 10 grams per year. But of course 70% of those end up in the sea, and most of them never make it through the atmosphere in the first place. I also don’t want to speculate how much demand there is for meteorites, other than from conceptual artists and perhaps astrophysicists.
Erlang on the Tipping Point? 2 April 2008
Posted by ojmason in erlang.
Yariv wants to rename the Erlang web framework ErlyWeb to ‘Erlang on Rails’, as he believes Erlang is on the tipping point, and such a name change would swing more programmers towards adopting Erlang. Mmmh, I should have read the comments rather than just the article via RSS: it appears to be an April Fool. But that doesn’t change the original point of this post.
While that might be a good thing to do (April Fool or not), for me the more important move would be to unify/restructure the various libraries and their documentation. I find it still tricky to find details about some functionality because I don’t remember whether it’s in stdlib or kernel, and the web-based documentation browser makes it very hard to navigate. It also seems to be rather random to the average programmer without too much insight into the underlying implementation details.
But otherwise I agree with him, in that I too think Erlang might be going somewhere. I will definitely stick with it, and hope for the best that the libraries will sort themselves out. Though I’m not too hopeful because of backwards compatibility issues…
Tags for Erlang 17 March 2008
Posted by ojmason in erlang. Tags: erlang
The ‘ctags’ program creates a ‘tags’ file, which indexes routines/methods/functions in C source files (and a variety of other languages). The editor vile (my ‘IDE’) can make use of those: you position the cursor on the name of a function, press ^], and the editor jumps to its definition (loading the appropriate source file). Each line of the tags file consists of the function name, the file name, and a search pattern.
Running ctags on an Erlang source file does not produce any usable results, unsurprisingly. However, the file format is so simple that a straightforward shell-script will do:
#!/bin/sh
# Build a vi-style tags file for all Erlang sources in this directory.
rm -f tags
for i in *.erl
do
    # Function heads start in the first column: a name followed by "(".
    grep '^[a-z0-9_]*(' "$i" |
        sed 's/^\([^(]*\)(.*$/\1/' | uniq |
        awk -v file="$i" '{printf("%s\t%s\t/^%s(/\n", $0, file, $0);}' >> tags
done
(The three fields on each line are separated by tabs, as the tags format expects.)
This script processes all *.erl files in a directory, extracts the function names (a sequence of lower-case letters, digits and underscores beginning in the first column and followed by an open round bracket), and creates the corresponding lines for the tags file.
Works fine so far.
Progress - More Erlang 8 March 2008
Posted by ojmason in erlang.
Erlang practice is slowly moving along. At a speed of about one per day I am currently porting my Java NLP tools over to Erlang. It still takes quite some time getting used to it, especially the flexibility with types.
Everything is a list or a tuple or an atom, so it is sometimes tricky to keep track of what you’re dealing with. Ad-hoc solution: annotate the type.
My implementation of a trie is simply a list, so when looking something up in a trie I pass a list as the first parameter. But what if it’s not a trie-list? Bad things happen within the trie module, far from the place where the actual error is caused. So the exported functions, insert/3 and retrieve/2, now take a tuple rather than a list. The tuple is {trie, List}, where the List is what I used to pass previously. Now it is a lot easier to spot that the actual error is the wrong kind of element being passed.
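A sketch of what such a tagged trie could look like. The node layout `{Char, Value, Children}`, the helper `new/0`, the argument order and the return values are assumptions made up for this example, not my actual module; the point is only the `{trie, List}` wrapper in the heads of the exported functions.

```erlang
-module(trie).
-export([new/0, insert/3, retrieve/2]).

new() -> {trie, []}.

%% Only a {trie, List} tuple matches here; anything else fails
%% right at the module boundary with a function_clause error.
insert({trie, Nodes}, [_ | _] = Key, Value) when is_list(Nodes) ->
    {trie, ins(Nodes, Key, Value)}.

retrieve({trie, Nodes}, Key) when is_list(Nodes), is_list(Key) ->
    lookup(Nodes, Key).

%% Each node is {Char, Value, Children}; Value is 'none' unless
%% a key ends at this node, in which case it is {value, V}.
ins(Nodes, [C | Rest], Value) ->
    {Old, Children} =
        case lists:keyfind(C, 1, Nodes) of
            {C, V, Ch} -> {V, Ch};
            false      -> {none, []}
        end,
    Node = case Rest of
               [] -> {C, {value, Value}, Children};
               _  -> {C, Old, ins(Children, Rest, Value)}
           end,
    lists:keystore(C, 1, Nodes, Node).

lookup(_Nodes, []) ->
    {error, not_found};
lookup(Nodes, [C | Rest]) ->
    case lists:keyfind(C, 1, Nodes) of
        {C, {value, V}, _} when Rest =:= [] -> {ok, V};
        {C, _, _} when Rest =:= []          -> {error, not_found};
        {C, _, Children}                    -> lookup(Children, Rest);
        false                               -> {error, not_found}
    end.
```

Passing a bare list, as in `trie:retrieve([], "cat")`, now fails immediately in `retrieve/2` itself instead of somewhere deep inside the lookup.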
Learning Erlang 2 March 2008
Posted by ojmason in erlang.
After messing about with Scala for a while I found the syntax rather confusing. It’s just too much. Somehow I then had a look at Erlang, can’t quite remember why. But Erlang promises to be robust and easily concurrent, and it is an industrial strength language, being used for telephony programming at Ericsson. And it looks remarkably like Prolog. Now I’m glad I did do Prolog at uni all those years ago, even though it seemed like a waste of time at the time.
Erlang is not exactly famous for its string processing facilities: a string is just a list of numbers. But it also means you can use list processing stuff for dealing with strings. By way of learning I now port/rewrite my corpus access tool in Erlang. That’s the fourth language already, after C, C++, and Java. The Scala version didn’t get started properly, and I think I’ll abandon that now.
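A few lines illustrate the point that a string is just a list of character codes, so the ordinary list functions apply. The module name `string_lists` is made up for this example.

```erlang
-module(string_lists).
-export([demo/0]).

%% Every line is a match that would crash if it did not hold.
demo() ->
    true = ("abc" =:= [97, 98, 99]),          % same term, two notations
    "cba" = lists:reverse("abc"),             % list functions just work
    "ABC" = [C - 32 || C <- "abc"],           % ASCII lower to upper case
    3 = length("abc"),
    ["the", "cat", "sat"] = string:tokens("the cat sat", " "),
    ok.
```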
There are some pitfalls to bear in mind when working with Erlang. Mainly it is getting used to the way Erlang works. A very useful feature is that functions can return tuples, e.g. {ok, Result} or {error, Reason}. One only needs a common way of doing that, so that you don’t end up with some functions returning a tuple while others don’t. Or if you do, keep in mind which function returns what.
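As a small illustration of the convention, here is a lookup that always returns either {ok, Result} or {error, Reason}, together with a caller that matches on both. The module `result_tuples` and its functions are hypothetical, not taken from my corpus tool.

```erlang
-module(result_tuples).
-export([frequency/2, frequency_or_zero/2]).

%% Counts is a list of {Word, N} pairs.
%% Always returns {ok, N} or {error, Reason}, never a bare value.
frequency(Word, Counts) ->
    case lists:keyfind(Word, 1, Counts) of
        {Word, N} -> {ok, N};
        false     -> {error, {unknown_word, Word}}
    end.

%% The caller has to deal with both shapes explicitly.
frequency_or_zero(Word, Counts) ->
    case frequency(Word, Counts) of
        {ok, N}          -> N;
        {error, _Reason} -> 0
    end.
```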
Sending messages between processes: Creating processes is dead easy, and I hope it will really mean speed when I run the system on the university’s multi-core machine. Just be careful to remember what messages each process accepts, and don’t include a catch-all one. Especially intermediate processes which get a request from some process, and send one out to another process can sometimes get muddled up with their messages… And don’t open files in ‘raw’ mode without knowing what it means!
So far it is going well. After about a week I have the basic concordancing functionality ported. Of course most of that time was spent debugging and wondering why it didn’t work the way I wanted it. But the files are a lot shorter, and perhaps Erlang does indeed allow a tenfold increase in productivity. One thing I’m unlikely to port is the indexing stuff: that works alright in Java and is a one-off procedure anyway.
I cannot make any judgments on speed: no profiling yet. But once I can do a few more things with the system, I will test-drive it on the Blue Bear computer. Each corpus will be in its own process, so processing them in parallel looks promising.
Parallel Processing 26 February 2008
Posted by ojmason in programming.
My grammar pattern processing was running at a speed of about 7 words per minute, which amounted to an estimated completion time of 1 week for all the 56,000 words in the BNC I am currently looking at. However, as processes on the Blue Bear system can only run a certain length of time it would have required occasional restarts (about 2 a day or so). And due to the way it was set up a restart would go through all words, beginning at ‘a’, finding that for the first x words processing had already been done. Clearly not very efficient.
Now, this is sequential processing. With a highly parallel system it would be a lot faster to run things in parallel, and so I changed the setup and split the whole task into separate runs, one per initial letter. Now there are 27 processes churning away at the letters a-z, and the overall throughput is a lot higher, at approximately 180 words per minute. That means the total running time would be a little over 5 hours (56,000 words at 180 per minute), roughly 26 times faster than the sequential run at 7 words per minute!
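The back-of-the-envelope arithmetic can be checked in a few lines. The module name `throughput` is made up; the figures are the ones quoted above.

```erlang
-module(throughput).
-export([hours/2, speedup/2]).

%% Hours needed to process Words at a rate of WordsPerMinute.
hours(Words, WordsPerMinute) ->
    Words / WordsPerMinute / 60.

%% How many times faster the new rate is than the old one.
speedup(OldRate, NewRate) ->
    NewRate / OldRate.
```

`throughput:hours(56000, 7)` comes to about 133 hours (five and a half days, so nearly a week once the restarts are counted), `throughput:hours(56000, 180)` to about 5.2 hours, and `throughput:speedup(7, 180)` to about 25.7.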
And once this has finished, I can set it running for the remaining features while working with the data that has then been gathered already.
WordPress after all 26 February 2008
Posted by ojmason in meta.
After having played around with my own hand-written blog engine in PHP for a while I realise that it’s not worth it. By going ‘mainstream’ I have perhaps a bit less control over things, but at least it’s a complete system, with categories, navigation, and even comments. I simply don’t have the time to mess about with these things myself anymore.
So the first step is to migrate all worthwhile postings from my old system to this one. That shouldn’t be too hard, as I didn’t post that much recently.