Spot any errors? let me know, but Unleash your pedant politely please.

Saturday 7 May 2016

Reassuring

A few years ago, when was getting into Python, I wrote an XML parser.  It's slow and fairly terrible at namespaces and doesn't have any proper unit tests, but it's pretty easy to use.  Of course, I shouldn't have written it at all, but I learned to code before Stack Exchange was a thing. I didn't know that I could find all the answers from smarter people there. Or even that there were some built-in Python things for XML.

I'm still using that parser, which has its own sort-of-XPATH-ish searching, which was written before I  knew that XPATH was a thing.

When I wrote it, I'd never really written a parser for anything particularly complex or vague.

To parse, it uses a class called StringReader.  I've never been happy with the name, but haven't spent any time trying to figure out what it should be called.

Today swift-name-demangling popped up in Reeder. It talks about a 'Scanner' class, and shows the .cpp header:

class NameSource {
  StringRef Text;
public:
  NameSource(StringRef text);

  bool hasAtLeast(size_t len);
  bool isEmpty();
  explicit operator bool();
  char peek();
  char next();
  bool nextIf(char c);
  bool nextIf(StringRef str);
  StringRef slice(size_t len);
  StringRef str();
  void advanceOffset(size_t len);
  StringRef getString();
  bool readUntil(char c, std::string &result);
};

If Python had headers, this is what mine would look like:

class StringReader(object):
    def __init__(self, string):            
    def fwd(self):
    def rev(self):
    def next(self):
    def prev(self):
    def peek(self):
    def eos(self):
    def sos(self):
    def reset(self):
    def find(self, value):
    def checkFor(self, string):
    def skipAny(self, valueToSkip):
    def skipPast(self, valueToSkip):
    def peekAhead(self, length=20):
    def getUntil(self, value):
    def getUntilSequence(self, value):
    def rememberPosition(self):
    def restorePosition(self):
    def remainder(self):

It's really reassuring that even though I wrote this in the dark (metaphorically at least) there's a significant overlap with same/similar names (peek, next, getString/remainder, readUntil/getUntil, isEmpty/eos).  I have more methods. Some look like unnecessary duplicates. I hadn't discovered how to indicate private methods, hadn't discovered properties and hadn't discovered Python naming conventions. I'll have to look into that default length of 20. That looks really odd.

Given that I this grew organically as I needed methods in my XMLElement class, I'm pretty pleased with it today.  I think I'll rename it StringScanner.