
App Store Idiots 20 June 2009

Posted by ojmason in Apple, iphone.

You have to be very careful when looking at reviews and ratings in the iPhone App Store (does ‘App Store’ need the ‘iPhone’ qualifier?). Some reviews are good and point out genuinely positive or negative issues with apps, but the majority can safely be ignored, as shown by the uselessness of their comments (‘this app is rubbish, I want my money back’) and their inability to match what they say with what they do. The number of times I have seen reviews where the reviewer wrote ‘This app gets 5 stars from me!’ when they actually gave ONE, or the reverse (‘this cr*p app gets zero stars from me’, but giving 5) is astonishing. Either those people are innumerate or incapable of keeping more than one thought in their head.

I haven’t got any hard facts, but have read somewhere that people are paid for leaving positive reviews of some apps and negative reviews of competing ones. That is about the only explanation for the number of stupid reviews that isn’t too depressing.

Table Frustration 31 May 2009

Posted by ojmason in Apple, iphone, objective-c, programming.

DAY 8 and after a long pause I have been able to spend a bit more time on my project. And it was certainly frustrating, racking up the hours while trying to work out why it didn’t work. One thing I need to get used to is working with IDEs; more than a decade of simply editing code in vi and compiling it on the command line leaves you ill-prepared for working with Xcode, nice though it may be. Interface Builder is also a great tool, but it hides things from direct view or easy inspection, at least if you don’t know where to look. And then something doesn’t work because you omitted a connection from a view to its File’s Owner, or you set the base class in MainWindow.xib to the wrong class type.

After trying out some table code from the iPhone development book I finally got a small demo going. I’m now trying to get an indexed, grouped table display backed by a SQLite database working. But now it’s approaching midnight and I’m too tired to work out why it stopped working again…

However, despite all of today’s frustration, I’m getting more and more used to coding in Objective-C, which has to be a positive side-effect!

Further Education 11 May 2009

Posted by ojmason in iphone, objective-c, programming.

I’m too busy with marking and a paper I have to write to get stuck in again. On the way home today I had a thought about the database-backed table view my app will include: something regarding sorting which I came across when working at the Cobuild project. More on that later.

But I’m still using the time more or less productively by watching the lectures on iPhone programming from Stanford (available on iTunesU). Not too sure about the performance aspect, but it is great to be able to listen to those guys explaining the intricacies of Objective-C memory management. And I won’t complain, as it is freely available. Thanks for that, Stanford!

Building the database without glare 5 May 2009

Posted by ojmason in Apple, erlang, iphone, objective-c, programming.

DAY 7 sounds like a lot, but again I’m only working a few hours in the evening after the kids have gone to bed. I have the feeling that there is a lot of ‘boilerplate’ code to write in Objective-C, or Cocoa at least. But then, that might be the problem with GUI-related programming. Erlang doesn’t need nearly as many lines to accomplish something, anything really! But then, the kind of Erlang programs I’ve been working on are basic R&D text-only affairs, not MVC-style user interface programming.

Today I learned how to turn off the iPhone program icon glare (add UIPrerenderedIcon=true to Info.plist in the bundle), which I think looks better on my icon, as it has a horizontal line just about where the glare line was. I also created a SQLite database from a text file. Next I will need to integrate the DB with the table view, for the first part of actual functionality. I keep switching between chapters in the textbook, the one which builds tables and the one which deals with persistent storage. Why did nobody think of doing a database-backed table view? Need to check Apple’s sample code, they’ve got a book-storage one which might do that.
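For reference, turning a text file into a SQLite database takes only a few lines. The following is a minimal sketch in Python, not the code used for the app: the tab-separated `word<TAB>entry` layout, the table name `lexicon`, and the helper names `build_database` and `lookup` are all assumptions made for illustration.

```python
import sqlite3


def build_database(lines, db_path=":memory:"):
    """Load tab-separated 'word<TAB>entry' lines into a SQLite table.

    The file layout and the table name 'lexicon' are hypothetical.
    """
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS lexicon (word TEXT, entry TEXT)")
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split at the first tab; everything after it is the entry.
        word, _, entry = line.partition("\t")
        rows.append((word, entry))
    conn.executemany("INSERT INTO lexicon (word, entry) VALUES (?, ?)", rows)
    # An index on the lookup column keeps table-view queries fast.
    conn.execute("CREATE INDEX IF NOT EXISTS lexicon_word ON lexicon (word)")
    conn.commit()
    return conn


def lookup(conn, word):
    """Return all entries stored for a word, in insertion order."""
    cur = conn.execute(
        "SELECT entry FROM lexicon WHERE word = ? ORDER BY rowid", (word,)
    )
    return [row[0] for row in cur]
```

With a real file one would pass an open file object, e.g. `build_database(open("lexicon.txt"), "lexicon.db")`, where both file names are again placeholders.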

The depressing bit is that I write a little Noddy-program, and it takes me ages. Mainly getting used to Cocoa, Xcode, and Objective-C, but also going back to no garbage collection, and shifting data between classes I don’t know well at all (NSString, NSArray, etc). I think my niche will be little utilities, rather than glorious games!

Certified 5 May 2009

Posted by ojmason in Apple, iphone, programming.

DAY 6 was a little frustrating, but ended on a positive note. First I was struggling with Xcode and multiple views, and strange errors. I had my iPhone plugged in to charge, and that made Xcode stop, complaining about the mismatch of a certificate or something. I then went to Apple’s site to activate my developer’s programme membership, and after some messing about with certificates and provisioning profiles it finally worked. And, instead of running the simulator, Xcode installed the app on my phone. It’s only a skeleton at the moment, but it is really exciting to see the first bit of software that I wrote myself running on the iPhone itself!

I then re-did the icon, using a darker colour, as the phone put in a highlight by itself, and it looks better that way. I need to do more icons now, as my tab-bar is text only at present. And then I need to code the actual program!

Program Development By Step-wise Refinement 26 April 2009

Posted by ojmason in Apple, iphone, objective-c, programming.

DAY 5, and I’m making my first steps in Objective-C, other than keying in sample programs and trying little things out. It’s been such a long time since I had to worry about memory management, though it looks relatively easy compared to plain C. Still, it would be better if one didn’t have to worry about it.

After I got into the Erlang mind-set with my most recent programming, I need some time to get back into C. I’m following a methodology I picked up more than 20 years ago, from Wirth’s 1971 paper on program development by stepwise refinement, at least as much as I remember it. I haven’t read it since about 1990, I think. Basically you formulate your program in pseudo-code, and step by step get closer to actual code.

My first refinement step was to create a Stack class, which is little more than a wrapper around an NSMutableArray with just three methods: push, pop, and isEmpty. Trivial, but useful to practice getting used to the way things work in Objective-C. There is a lot which seems to be based on conventions, such as the names of methods which get called under the hood at various stages. Luckily there is plenty of material on the web to help you pick those things up!
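The shape of such a wrapper is easy to show. This is a sketch in Python rather than the Objective-C class described above: a plain list stands in for NSMutableArray, the method name `is_empty` is a Python-style rendering of isEmpty, and raising an error on an empty pop is an assumption, since the post does not say how that case is handled.

```python
class Stack:
    """A thin wrapper around a list with push, pop and is_empty."""

    def __init__(self):
        self._items = []  # the wrapped array; the top of the stack is the end

    def push(self, item):
        """Put an item on top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the top item; raise IndexError if empty."""
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        """Return True if the stack holds no items."""
        return not self._items
```

Keeping the array private and exposing only three methods is the whole point of the wrapper: callers cannot accidentally insert or remove elements in the middle.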

As for my first app-project, a morphological analyser, I’m slowly getting there. The main part is probably about 50% coded, and the lexicon-lookup routine is still missing. I guess I’ll opt for a sqlite database here, seems like the most effective way to access that kind of data. But I’ll cover that in a later post.

Oh, and here is the reference for the Wirth paper:

  • Wirth, Niklaus (1971) Program Development by Stepwise Refinement, in Communications of the ACM, Vol. 14, No. 4, April 1971, pp. 221-227.

On-line versus Paper 22 April 2009

Posted by ojmason in Apple, iphone, programming.

DAY 4, and the textbook I ordered arrives from Amazon. (I’ll skip days on which I don’t do anything iPhone-dev related). The book looks good, and I now finally understand more about the connections between Interface Builder and Xcode. And why my attempts at displaying stuff in a UIWebView didn’t work: it’s all in the connections between interface components and variables in the code. Using graphical tools when you’re not used to them can be tricky.

This I think is the most important aspect in all this: previously my development environment (for pretty much the past 15 odd years) was vi (actually, vile) and a command-line version of either gcc or javac. I am so not used to IDEs, and all the things you won’t find in files. But I have to admit that using IB can greatly speed up the process of laying out applications.

On the telly and web I hear more stories from people making a killing with iPhone apps. Nice, though I have no illusions, as the kinds of things I’d be working on are fairly niche-type products. And I don’t even know how far I’ll get, as I’m only doing it in my spare time. If it becomes too time-consuming I will probably not get very far.

Still, it would be nice to have a finished product in the App-store!

The book, btw, is an easy read. And apart from learning about the iPhone itself there is also a lot of getting used to the Mac style of doing things. Somehow a completely different world. And Objective-C has its own idioms. And on the iPhone there’s no garbage collection: that was always one of the biggest arguments for abandoning C++ in favour of Java, no more segmentation faults. Far fewer bugs, a whole category of them wiped out without any extra effort on my part. So back to manual transmission we go.

The main question when buying Beginning iPhone Development for me was whether it was going to be worth it. And given my lack of experience of developing for an Apple platform I’d say it is. And having it all in front of you in a book is still better than trying to find it yourself on the web, even if all the information is out there somewhere. Books are still more convenient than electronic documentation, at least for textbooks; reference material might be different.

A New Platform… 18 April 2009

Posted by ojmason in Apple, iphone, programming.

DAY 1

After playing with my latest gadget (an iPhone) and looking at what’s on offer at the App store I was beginning to wonder whether I could start a side-line as an iPhone-developer myself. There are some things I’d like to do with it for which there don’t seem to be apps readily available, and I also have some ideas for useful tools which other people might even be prepared to shell out a few pennies for. Not that I think they’ll be bestsellers, but with large numbers of people looking at low-cost software there must be some chance for a few sales world-wide.

So off starts The iPhone Project. Having read some people’s stories about developing software on it, it doesn’t seem too bad. And I don’t have to sell lots of stuff, as I don’t depend on the income. Some smallish contribution to the cost of the monthly contract would be useful…

However, you need to invest first: the iPhone SDK only runs on Leopard, while I’m still on Tiger. And then there’s $99 per annum to pay to Apple. But I think I’ll pass on any reference books, as there seems to be quite a lot of on-line material available. And Objective-C doesn’t look too scary either.

First step: ordering the OS upgrade.

DAY 2
I spent the best part of a day trying to make a full backup image of my drive in case something goes wrong; external USB drives are so slow if you’re shifting dozens of gigabytes around. The ordered Leopard DVD arrives at 15:30, but I still need until about 23:00 before my computer is ready.

If all goes well, the upgrade will have finished just before I actually fall asleep, so getting my iPhone to say “hello world” has to wait for another day.

DAY 3
It did, though it took until 1:00 in the morning. And the next day is spent playing with the iPhone SDK. I do get the obligatory Hello World running, but my attempt at app #1’s interface fails miserably. I can’t tell the UISearchBar not to capitalise and auto-correct the input, and the output (richly formatted text) was meant to be displayed by a UIWebView which remains conspicuously blank.

More frustration, as I cannot for some reason subscribe to the iPhone Developer Programme. The Website does seem to have some problems with my FF browser, and Safari no longer works due to the upgrade to Leopard. I try to download Safari from the Apple Website, but that doesn’t work either. Finally I use my iPhone(!) to enroll, for a cost of about £60 p/a. That should be the main investment sorted, apart from my time.

However, I do begin to wonder whether a textbook on iPhone development would not be a good idea after all… And, just after having written the previous sentence, I decide that £20 for a book with consistent 5-star reviews would probably be worth it for the time I save by not having to track down all the info on the web. Amazon, here I come!

So, after three days there is a certain amount of frustration, but also some optimism. And I keep getting more ideas about what kind of apps I might want to write!

Current balance: -£139.40

to be continued…

Collocations – Do we need them? 3 March 2009

Posted by ojmason in linguistics.

The concept of collocation was introduced in the middle of the last century by J.R. Firth with his famous quote “You shall know a word by the company it keeps”. Words are not distributed randomly in a text, but instead they stick with each other, their ‘company’. Starting in the late 1980s, the increased interest in collocation by computational linguists and others working in NLP has led to a proliferation of methods and algorithms to extract collocations from text corpora.

Typically one starts with the environment of the target (or node) word, and collects all the words that are within a certain distance (or span) of the node. Then their frequency in a reference corpus is compared with their frequency in the environment of the node, and from the ratio of frequencies we determine whether they’re near the node by chance or because they’re part of the node’s company. A bewildering variety of so-called significance functions exists, the oldest probably being the z-score, used by Berry-Rogghe in 1973; later, Church and Hanks (1991) popularised mutual information and t-score, which now seem to have been displaced by log-likelihood as the predominant measure of word association.

The problem is: all these metrics yield different results, and nobody knows (or can tell) which are ‘right’. Mutual information, for example, favours rare words, while the t-score promotes words which are relatively frequent already. But apart from rules-of-thumb, there exists no linguistic justification why one metric is preferable to another. It is all rather ad-hoc.
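The diverging behaviour of two of these measures can be made concrete with a small sketch. It uses one common textbook formulation, which is an assumption here (several variants exist, and this one ignores the span size): the expected frequency E is f(node) × f(collocate) / N, mutual information is log2(O / E), and the t-score is (O − E) / √O, with O the observed co-occurrence count. The function and parameter names are made up for illustration.

```python
import math


def expected(f_node, f_coll, corpus_size):
    """Co-occurrence count expected if the two words were independent."""
    return f_node * f_coll / corpus_size


def mutual_information(observed, f_node, f_coll, corpus_size):
    """log2 of observed over expected co-occurrence frequency."""
    return math.log2(observed / expected(f_node, f_coll, corpus_size))


def t_score(observed, f_node, f_coll, corpus_size):
    """Difference between observed and expected, scaled by sqrt(observed)."""
    e = expected(f_node, f_coll, corpus_size)
    return (observed - e) / math.sqrt(observed)
```

Take invented counts for a node occurring 1,000 times in a corpus of 1,000,000 tokens: a rare collocate (10 occurrences overall, 5 of them near the node) and a frequent one (50,000 overall, 200 near the node). Mutual information ranks the rare word far above the frequent one, while the t-score ranks them the other way round, which is exactly the disagreement described above.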

Part of this is that collocation as a concept is rather underspecified. What does it mean for a word to be ‘significantly more common’ near the node word as opposed to being there just by chance? In a sense, collocations are just diagnostics: we know there are words that are to be expected next to bacon, and we look for collocates and find rasher. Fantastic! Just what we expected. But then we look at fire, and find leafcutter as a very significant collocate. How can that happen? What is the connection between fire and leafcutter? The answer is: ants. There are fire ants, and there are leafcutter ants, and they are sometimes mentioned in the same sentence.

This leads us to an issue which I believe gets us on the right track in the end: the fallacy of using the word as the primary unit of analysis. In the latter example, we’re not dealing with fire and leafcutter, we’re instead concerned with fire ants. Once we realise that, then it is perfectly natural to see leafcutter ants as a collocate, whereas we would be surprised to find engine, which instead is a collocate of the lexical item fire.

So, phraseology is the clue. If we get away from single words, and instead consider multi-word units (MWUs), then we also have an explanation for collocations. Single words form part of larger MWUs, together with other single words. So leafcutter often forms a unit with ants, as does fire. More generally, MWUs such as parameters of the model are formed of several single words. Such an MWU is a single unit of analysis, and it is only when we break it up by considering single words that we observe that parameters and model commonly occur together.

From this we can define a very simple procedure to compute collocations: from a corpus, gather all the MWUs that are associated with a particular word. Get a frequency list of all the single word items in those MWUs, sort by frequency, and there we are.
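The procedure just described can be sketched in a few lines. The sketch assumes that MWU recognition has already happened and that its output is a list of MWU occurrences, each a tuple of tokens; how the MWUs are found is outside its scope, and the function name `collocates_from_mwus` is made up for illustration.

```python
from collections import Counter


def collocates_from_mwus(node, mwus):
    """Return (word, frequency) pairs for words sharing an MWU with node.

    mwus is an iterable of MWU occurrences, each a sequence of tokens.
    The result is sorted by descending frequency.
    """
    counts = Counter()
    for unit in mwus:
        if node in unit:
            # Count every other word in a unit that contains the node.
            counts.update(word for word in unit if word != node)
    return counts.most_common()


# Toy input, invented for illustration only.
toy_mwus = [
    ("fire", "ants"),
    ("leafcutter", "ants"),
    ("fire", "engine"),
    ("fire", "ants"),
    ("parameters", "of", "the", "model"),
]
print(collocates_from_mwus("fire", toy_mwus))
```

No significance function is involved: the ranking comes directly from how often a word shares a unit with the node.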

To conclude, collocation is an epiphenomenon of phraseology, a side-effect of words forming larger units. Phraseological units contain multiple single words, and those are picked up by collocation software, because those are the ones that commonly occur in a text together. And the reason for occurring together is that they form a single unit. Once we look at text in terms of MWUs, the need for collocation disappears. Collocation just picks out the constituent elements of multi-word units.

One could of course argue that this is a circular argument, that we are simply replacing a procedure to calculate collocations by one that calculates MWUs. But the difference between those two procedures is that MWU-recognition does not require complicated statistics (which I find hard to see justification for), but instead simply looks at recurrent patternings in language. MWUs are re-usable chunks of texts, which can be justified on the grounds of usage. Collocation is a much harder concept to explain and integrate into views of language. And, as it turns out, we don’t really need it at all.

References

  • Berry-Rogghe, G.L.M. (1973) “The Computation of Collocations and Their Relevance in Lexical Studies,” in The Computer and Literary Studies. Eds. A.J. Aitken, R.W. Bailey and N. Hamilton-Smith. Edinburgh: Edinburgh University Press, pp. 103-112.
  • Church, K., and Hanks, P. (1991) “Word Association Norms, Mutual Information and Lexicography,” Computational Linguistics, Vol. 16, No. 1, pp. 22-29.
  • Firth, J.R. (1957) “A Synopsis of Linguistic Theory 1930-1955,” in Studies in Linguistic Analysis. Oxford: Philological Society.

Thinking Erlang, or Creating a Random Matrix without Loops 26 February 2009

Posted by ojmason in erlang, misc, programming.

For a project, my Erlang implementation of a fast PFNET algorithm, I needed to find a way to create a random matrix of integers (for path weights), with the diagonal filled with zeroes. I was wondering how best to do that, and started off with two loops, an inner one for each row, and an outer one for the full set of rows. Then the problem was how to tell the inner loop at what position the ‘0’ should be inserted. I was thinking about passing a row-ID, when it suddenly clicked: lists:seq/2 was what I needed! This function, which I previously thought was pretty useless, creates a list with a sequence of integers (the range is specified by the two parameters). For example,

1> lists:seq(1,4).
[1,2,3,4]
2> lists:seq(634,637).
[634,635,636,637]
3> lists:seq(1000,1003).
[1000,1001,1002,1003]

Now I would simply generate a list with a number for each row, and then send the inner loop off to do its thing, filling the slot given by the sequence number with a zero, and others with a random value.

But now it gets even better.  Using a separate (tail-)recursive function for the inner loop didn’t quite seem right, so I thought a bit more about it and came to the conclusion that this is simply a mapping; mapping an integer to a list (a vector of numbers, one of which (given by the integer) is a zero).  So instead of using a function for filling the row, I call lists:seq again and then map the whole thing.  This is the final version I arrived at, and I’m sure it can still be improved upon using list comprehensions:

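%% Build a Size x Size matrix (a list of rows) of random integers
%% in 1..MaxVal, with zeroes on the diagonal.
random_matrix(Size, MaxVal) ->
  random:seed(),  % default seed, see the note below
  lists:map(
    fun(X) ->
      lists:map(
        fun(Y) ->
          case Y of
            X -> 0;                      % diagonal element
            _ -> random:uniform(MaxVal)  % off-diagonal weight
          end
        end,
        lists:seq(1, Size))
    end,
    lists:seq(1, Size)).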

This solution seems to be far more idiomatic, and I am beginning to believe that I no longer think in the imperative way of loops, but rather in the Erlang way of list operations. Initially this is hard to achieve, but with any luck it will become a lot easier once one is used to it. Elegance, here I come!

Example run:

4> random_matrix(6,7).
[[0,1,4,6,7,4],
 [3,0,5,7,5,4],
 [5,1,0,2,5,2],
 [4,2,4,0,3,1],
 [4,4,3,3,0,1],
 [5,7,3,2,2,0]]

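Note: I have used random:seed/0 above, as I am happy for the function to return identical matrices on subsequent runs with the same parameters. To get different results on each run, the generator would instead have to be seeded with varying values (for example ones derived from the current time) via random:seed/3; simply leaving the call out is not enough, because an unseeded process falls back to the same default seed. However, for my benchmarking purposes it saved me having to save the matrix to a file and read it in, as I can easily generate a new copy of the same matrix I used before.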
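The list-comprehension variant hinted at above is easiest to show as a nested comprehension. The following is a Python analogue rather than Erlang, offered as a sketch only: the `seed` parameter and its default of 0 are assumptions that stand in for the fixed default seed, so that repeated calls with the same arguments return the same matrix.

```python
import random


def random_matrix(size, max_val, seed=0):
    """Return a size x size list of rows with random integers in 1..max_val.

    The diagonal is filled with zeroes. A fixed seed makes the result
    reproducible; pass seed=None to seed from a varying system source.
    """
    rng = random.Random(seed)  # private generator, leaves global state alone
    return [
        [0 if x == y else rng.randint(1, max_val) for y in range(size)]
        for x in range(size)
    ]
```

The outer comprehension plays the role of the outer map over row numbers, and the inner one that of the map over column positions; the row number is simply visible inside the inner expression, so nothing has to be passed along.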