Spot any errors? Let me know, but unleash your pedant politely, please.

Monday 23 February 2009

The Oscars.

John Gruber has an interesting take on the Oscars. I agree with almost everything he says. The Kermodian certainty is to be admired, but there's one big problem: WALL-E just wasn't that great a picture. It was a great picture for the first half, but not for the second. My gut tells me that half a picture shouldn't earn an Oscar nomination.

Sunday 22 February 2009

What I think about comments.

Some people will tell you that you don't need comments. That may be true for them, but it's not true for everyone. So unless you keep your code to yourself, you need to use comments. You shouldn't state the obvious, though. This is a bad comment:

   // increment the loop variable
   i++;


Comments are most useful for reminding yourself why you did something in a certain way. Comments shouldn't really explain what the code does; they should justify or clarify something about the code. It's better to comment blocks rather than individual lines. It's also fine to add comments to help while learning a language or technique, and strip them out later.

Example of a good/useful comment:

       // create and populate the first few
       // rows. Shouldn't really have to,
       // but if we don't, autosize will
       // throw a wobbler later on
       //
    HSSFRow row = null;
    HSSFCell cell = null;
    for (int rowNum = 0; rowNum < dataRow; rowNum++)
    {
       row = worksheet.createRow(rowNum);

       for (int col=0; col<headersText.length ; col++)
       {
          cell = row.createCell((col + 1), cellType);
          cell.setCellValue("");
       }
    }

It's a good comment because it explains why some odd code is in there, and it'll stop someone taking it out in the future and then going through the pain of re-finding a workaround for a problem they'd inadvertently reintroduced.

It's also formatted in an unusual way. It spans multiple lines, and it's over-indented. This is very deliberate. It stems from the days before syntax-colouring. It achieves two things. The comments get out of the way when you're reading the code, and the comments stand out when you want to read the comments. I've been doing this for years. An old boss used to do it and it infuriated me. He explained why he did it though, and persuaded me to try it for a while, and it works. It's a great way to format comments.

Why readability and scope matter

I was reasonably lucky to have a decent lecturer at Polytechnic, who taught me some good programming practices. This was structured programming (Modula-2) rather than OOP, but the principles apply to OOP too. Essentially, it boiled down to two complementary goals: low coupling and high cohesion. Those goals are fairly simple to achieve by following a couple of simple rules:

1. Limit scope as much as possible.

2. A subroutine should do one thing only, and all the data it needs should be defined in its interface (its formal parameter list).

There are occasions to break the rules, such as when optimising. Breaking the rules for mere convenience, especially early in development, is a definite no-no. I've done it, and it usually comes back to haunt me, and it usually needs refactoring to limit the scope again.

It's arguably more difficult with OOP, because instance variables are effectively global within an object, yet not all methods should be allowed to alter them. If you get into this situation, your object is probably too complex and should be split into smaller, simpler objects.
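The two rules sketch naturally in Python (the function names and data here are mine, purely for illustration): each routine does one thing, and everything it needs arrives through its formal parameter list rather than from surrounding scope.

```python
import math

def mean(values):
    # All the data this routine needs arrives via its parameter
    # list; nothing is read from, or written to, outer scope.
    return sum(values) / len(values)

def std_dev(values):
    # Sample standard deviation, built on mean() rather than
    # duplicating its logic -- each routine does one thing only.
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

print(mean([1, 2, 3]))     # 2.0
print(std_dev([1, 2, 3]))  # 1.0
```

Because neither function touches anything outside its own locals, they're trivially testable in isolation: low coupling and high cohesion fall out of the two rules almost for free.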

So that's scope.

Readability matters because code gets read far more often than it gets modified. Badly formatted code is difficult to read. Badly formatted code confuses the reader.

I once spent several days trying to find a problem in a very long module. The module was too long, but the tools we were using made it difficult to split much of the code out. There was a pre-processor to flesh out system / messaging calls, but only against certain modules. The module at fault had grown to something like 10,000 lines of code. I knew the problem was in there, but I didn't have a reliable way to make it fail in unit testing, where I could debug. We didn't have great unit tests, and it wasn't possible to debug during system testing (which is when it failed).

The module was a bit of an unreadable mess. The indentation was all over the place and there were dozens of variables scoped to the highest level of the module.

I told my boss about this and asked for permission to take a few days to tidy up the indentation and fix the scoping issues. He told me we didn't have time. We were close to releasing, and we couldn't make those sorts of changes.

I went with my gut. I lied about what I was doing in progress meetings and went ahead and fixed the indentation and refactored to limit the scope of everything as much as possible. I think I actually got rid of all the variables global to the module (there were equivalents to instance variables, which remained, but those were defined elsewhere). It took about three days.

When I came to test again, with something more readable, as loosely coupled and highly cohesive as I could make it, I tried to find the bug again. It had gone.

I've actually no idea where the bug was, or what the change was that fixed it. The likely problem is that a global variable was being set when a local one should have been set, and this was now impossible with the new scoping restrictions. But I never found out exactly where.

I told my boss I'd found and fixed the bug, and gave some bullshit explanation of what it was.

{ … }

I don't really like curly braces, but I appreciate this is a personal thing. I don't dislike them nearly as much as I used to. I do prefer a good solid begin…end, but at least the braces are language-independent … that is, if someone is programming in Italian, they work just as well.

What I really, really hate is this …

   if (x==y) {
      …
   }


simply because the block delimiters don't line up. A far better style is this…

   if (x==y)
   {
      …
   }


I've seen coding standards that do it the right way, but I don't recall ever seeing it written the right way in a coding book. People may argue that the right way wastes a line. Bollocks! For the sake of readability, a bit of white space delineates blocks quite nicely. Where you don't need it is when the block is very short. For example, both

   if (fileIsOpen) {closeFile();}

and

   if (fileIsOpen)
      {closeFile();}


are fine.

Python doesn't have this problem. It's one of the main reasons I like it.
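For illustration, here's the fileIsOpen/closeFile idea from above in Python (the helper name is mine, not part of the original): the block is delimited by indentation alone, so there's nothing to mis-align in the first place.

```python
import io

def close_if_open(f):
    # The suite below is delimited purely by indentation --
    # there are no braces to line up (or to argue about).
    if not f.closed:
        f.close()

f = io.StringIO("some data")
close_if_open(f)
print(f.closed)  # True
```

The whole brace-placement debate simply doesn't exist here; the layout that makes the code readable is the same layout that defines the block.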

++

This is something I used to dislike about C-like languages. I used to think that such statements should always be explicit assignments…
x := x + 1 ;
…but I don't mind it these days. I'd prefer a function…
inc(x);
…but I don't have a problem with
x++;
any more.

Saturday 21 February 2009

'==' = '='; '=' = ':='.

Or: One of the things I hate about C

If you started out with a language that uses this syntax, you probably think I'm nuts. Well, you're wrong. Just because you're used to the convention doesn't mean the convention makes any fucking sense.

Q. What is "="?

It's an equals sign, right? Remember it from mathematics? It's a sort of point of balance. Everything on the left amounts to the same as everything on the right. So in a condition, we're testing whether two expressions are equal, whether they balance. Yet for reasons I simply cannot fathom, C-like languages use "==".

I think it's like this simply because whoever first defined the syntax didn't want to refactor the work he'd already done. Or it didn't occur to him, because he had Asperger's Syndrome or something. One of the first things you want to do when defining a language, after you've declared a variable, is assign something to it. This probably comes before testing for equality; there's not much point testing for equality until you've got things to test. So the language developer thinks to himself: "I've allocated a variable to be used. How shall I assign something to my variable?", and he comes up with this…
x = 10
…and there's nothing wrong with this. Yet.

A bit later, he thinks to himself : "I need to check the content of my variables", and he comes up with this…
if x = 10
…and there's a problem, because '=' has already been used. The next question he asks himself is "What can I use instead?", but that's a mistake. The question should be "Should I use '=' for assignment or for equality?", and the right answer is "Equality".

BASIC uses '=' for both. Assignment should be preceded by the 'Let' keyword, but because assignment and equality can be contextually detected by the interpreter, 'Let' is optional these days.

Algol-like languages use '=' correctly and have ":=" for assignment. This is what I prefer, but anything except '=' would be fine. Just a ':' would probably do (but that might be better used elsewhere!).
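Python, for what it's worth, at least keeps the two jobs from colliding: '=' assigns and '==' compares, but assignment is a statement rather than an expression, so the classic C slip of writing `=` inside a condition won't even compile. A quick sketch:

```python
x = 10          # '=' assigns
print(x == 10)  # '==' tests equality: True

# The C-style slip is a syntax error, not a silent bug.
# compile() lets us demonstrate that without crashing the script:
try:
    compile("if x = 10: pass", "<example>", "exec")
except SyntaxError:
    print("assignment in a condition is rejected")
```

It still uses '=' for assignment rather than ':=', so it dodges the bug without fixing the notation, but at least the mistake announces itself at compile time.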

Java (how and why I got started)

I'm a tester now, but I'm the only tester on the team with any programming knowledge. I've written some nice VBA to make our testing easier, but we were looking to migrate from using Excel to write and record manual test scripts to a proper dedicated testing tool.

The proper tool didn't report stuff in the way we wanted. It also didn't have an easy way to migrate our Excel scripts. Fortunately it did have an API that would allow us to access the metrics in the way that we needed and be able to take a lot of the drudgery out of migrating the scripts.

So I got stuck in. My first choice was Python. There were some examples on the vendor's website. I gave up fairly quickly, though. The API was provided as a WSDL XML file, but the Python-based WSDL gubbins just didn't work. I briefly looked at hand-coding each message, but a brief look was enough to know it would be too difficult and take too long.

My second choice was Java. I didn't have a budget for Visual Studio, and all the developers where I work use Eclipse anyway, so I figured I could get some help. In the end I didn't need their help, though I did get some pointers via email from a friend. By far the best help came from some online video tutorials. Those got me up to speed pretty quickly.

Java (what I used to think)

There are things I hate about Java. Most of the things I hate aren't really Java problems at all. They're C. Some of them are in Python too. The languages I'd used - Pascal, Modula-2, Jovial, Ada and Occam - predisposed me to hating the differences.

I'd periodically look at C and periodically walk away. C sucks. It sucks for two reasons. One is syntax that I dislike. The other is that it's way too easy to compile a piece of type-unsafe crap. All those security issues with buffer overflows exist because of this sort of shit in C.

The thing I hated about Java whenever I looked at it was its enumerated types: there weren't any.

Before Eclipse

OK, the tools I used when I was a programmer were these:

The OS (pretty much always VMS)
A scripting language (on VMS this is DCL)
A text editor (I used EVE almost exclusively)
A compiler
A linker
A debugger

Note that these things do not an IDE make. This was all command-line based. The sequence I'd use to fix an error in a Pascal program might be this…

$ eve program.pas
make a change
save and exit

$ pascal program.pas
get a compilation error

$ eve program.pas
make a change
save and exit

$ pascal program.pas
get a compilation error

$ eve program.pas
make a change
save and exit

$ pascal program.pas
get a compilation error

$ eve program.pas
make a change
save and exit

$ link program

$ run program.exe
check results

$ run program.exe /debug
issue commands to display code viewer
find a line number
set breakpoint at line number
continue program execution

etc. etc. etc. Everything was manual, and each step took time to complete. The editor and the code viewer in the debugger, for instance, were completely different tools. There was no pointing, so no tooltips, no clicking to set breakpoints, no selecting to evaluate expressions, no code completion.

So when I finally got around to using an IDE, Eclipse, last year, it was something of a revelation. I think Eclipse is superb. I'd briefly used Visual Studio on a C++ course somewhere around 2004, but never got to use it again. And I hated C++, so I had no incentive to. The benefit of the course was an intro to OOP, but at the time I still interpreted OOP in terms of Ada 83, which I'd used on OOD-based projects.

I used Eclipse with Java.

First impressions of Python

I've used Python a little over the past year. I think it's superb. The loose typing and the everything-is-an-object idea really work for a scripting language. The simplicity of using indentation to delineate blocks was a major appeal. It reminded me of Occam, which I used way back in 1992.

Coming from a real-time critical-systems background, loose typing is anathema, but I'm largely converted. Not for critical systems, where languages such as Ada should be used, but for general use, and scripting in particular, it rocks.

Introduction

I'll use this for stuff that will not fit into 140 characters. It's going to be general, probably a bit Apple-centric, but with things like simple programming tips here and there as I attempt to find my coding mojo again.

I'm also 'daycoder' on Twitter.

Why 'dayCoder' ?…

Well, as I've suggested, I lost my coding mojo. The short version is that eight years of VMS Pascal does that to a guy.