Python drive-by
Building simple tree data structures out of lists and dicts in Python always pains me. In my head, other languages (ECMAScript or Lua, maybe) have better syntax for this. Here’s an example of what I might write to set up a particular data structure:
day = {}
day["date"] = "date object here"
day["title"] = "string data here"
sections = day["sections"] = [dict(title="section title")]
jobs = sections[0]["jobs"] = []
jobs.append(dict(title="Google",
                 url="http://www.google.com"))
jobs.append(dict(title="Yahoo!",
                 url="http://www.yahoo.com"))
Perhaps pprint can make it easier to see what I’ve done here:
{'date': 'date object here',
 'sections': [{'jobs': [{'title': 'Google', 'url': 'http://www.google.com'},
                        {'title': 'Yahoo!', 'url': 'http://www.yahoo.com'}],
               'title': 'section title'}],
 'title': 'string data here'}
You might argue that I made bad syntax choices, since the pretty-printed version above probably looks better. Here’s an alternative, I guess:
day = {"date": "calculate date object here",
       "sections": [{"title": "section title",
                     "jobs": [{"title": "Google",
                               "url": "http://www.google.com"},
                              {"title": "Yahoo!",
                               "url": "http://www.yahoo.com"},
                              ],
                     },
                    ],
       }
# Can't assign this in-line, unless I calculate "date" into a local
# variable first.
day["title"] = "this is based on " + day["date"]
It’s not really fair to say “just write it the way pprint does.”
pprint has a few advantages, such as having everything it needs
to put in its rendered data structure up front, and also being a
computer. As a human writing this data structure, I think of the
title before I think of putting in the jobs, for example; pprint
alphabetizes the keys, so it puts jobs first, which means it can avoid
a pile-up like }]}]} at the end of the structure. Also note that
I needed one of the values in the structure to compute another;
pprint already had that value when it went to render the data
structure.
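To make the key-ordering point concrete, here is a small sketch. It assumes Python 3.8 or later, where pprint gained a sort_dicts argument; the day dict is a trimmed-down stand-in for the structure above, not the full thing.

```python
from pprint import pformat

# Trimmed-down version of the structure above, written in "human" order:
# title first, jobs afterwards.
day = {
    "title": "string data here",
    "date": "date object here",
    "sections": [{"title": "section title", "jobs": []}],
}

# Default behaviour: keys are sorted, so "jobs" is rendered before "title".
sorted_text = pformat(day)

# sort_dicts=False (Python 3.8+) keeps the order the keys were inserted in.
unsorted_text = pformat(day, sort_dicts=False)

print(sorted_text)
print(unsorted_text)
```

With sorting switched off, the rendering follows the order I typed the keys in, which is closer to how I think about the structure but brings the closing-bracket pile-up back.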
Also, I may need to think about switching from double quotes to single
quotes. To my overly picky mind, they now look a little “cleaner.”
My use of double quotes can be traced back to when I was frequently
programming in C, but is not helped by the fact that '' and ""
behave differently in languages such as Perl and the Bourne Shell. (C
also makes/made me do slightly weird things in Perl and sh, such as
writing 'x' and "xxx" with different quote types.)
So I went crazy and made a class which is currently called DataObject. It’s actually a somewhat disgusting set of wrappers over dictionaries, but check out the syntax:
day = DataObject()
day.date = "date object here"
day.title = "string data here"
day.sections[0].title = "section title"
day.sections[0].jobs.new_child(title="Google",
                               url="http://www.google.com")
day.sections[0].jobs.new_child(title="Yahoo!",
                               url="http://www.yahoo.com")
To me this is vastly more readable (and easier to write too), and it
works just like the data structure I made above with Python’s built-in
types. You can also ask for a clone of the data using built-in types
by calling day.to_native() (it operates recursively).
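For the curious, here is a minimal sketch of how a class with this syntax could be put together. It is not the actual DataObject code: the internals (_fields, _items, the _native helper) are made up for illustration, and only the names DataObject, new_child and to_native come from the usage above.

```python
class DataObject:
    """Hypothetical sketch: a node usable as a dict (attributes) or a list (indexes)."""

    def __init__(self, **fields):
        # Write through __dict__ so our own __setattr__ is bypassed.
        self.__dict__["_fields"] = dict(fields)  # attribute name -> value
        self.__dict__["_items"] = []             # list-style children

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for data "keys".
        if name.startswith("_"):
            raise AttributeError(name)
        # Auto-create a child node on first access (this is the typo hazard).
        if name not in self._fields:
            self._fields[name] = DataObject()
        return self._fields[name]

    def __setattr__(self, name, value):
        self._fields[name] = value

    def __getitem__(self, index):
        # Grow on demand so day.sections[0] works on a fresh node.
        while len(self._items) <= index:
            self._items.append(DataObject())
        return self._items[index]

    def __iter__(self):
        # Without this, Python would iterate via __getitem__ and never stop.
        return iter(self._items)

    def new_child(self, **fields):
        child = DataObject(**fields)
        self._items.append(child)
        return child

    def to_native(self):
        # A node that has list children becomes a list, otherwise a dict.
        # (An untouched node therefore comes out as an empty dict.)
        if self._items:
            return [item.to_native() for item in self._items]
        return {key: _native(value) for key, value in self._fields.items()}


def _native(value):
    """Convert nested DataObject values recursively; leave plain values alone."""
    return value.to_native() if isinstance(value, DataObject) else value


day = DataObject()
day.date = "date object here"
day.title = "string data here"
day.sections[0].title = "section title"
day.sections[0].jobs.new_child(title="Google", url="http://www.google.com")
day.sections[0].jobs.new_child(title="Yahoo!", url="http://www.yahoo.com")
native = day.to_native()
print(native)
```

The whole trick is that __getattr__ only fires for names that do not already exist, so every unknown attribute quietly becomes a new child node. That is also exactly why a typo creates a key instead of raising an error.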
I think there may be some odd side effects: since attributes are mapped to keys, a typo in a “key” makes a new object spring into existence, and the exceptions that follow can be strange. I’m going to try using it a bit more before I pass judgment on whether or not it’s a useful idea.
This is one of the cases where a class would be better for storing your complex data structure. You would have the option of not pre-computing and freezing fields that depend on other fields, like your title field. It could be:
day.title = "this is based on %(date)s"
And you could interpolate when fetching the value:
return self.title % self.__dict__
A class would also give you a place to document your data structure and allow you to add a method to check that it is well-formed.
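A rough sketch of what such a class might look like, under some assumptions: the names Day, title_template and is_well_formed are invented for illustration, and the well-formedness rule (every section has a title, every job has a title and a url) is only a guess at what you would want to check.

```python
class Day:
    """One day's listing.

    date     -- the day's date (anything that works with %s formatting)
    sections -- list of dicts, each with a "title" and a list of "jobs"
    """

    # Template is stored unformatted; it is filled in only when title is read.
    title_template = "this is based on %(date)s"

    def __init__(self, date, sections=None):
        self.date = date
        self.sections = sections if sections is not None else []

    @property
    def title(self):
        # Interpolated on every access, so it always reflects the current date.
        return self.title_template % self.__dict__

    def is_well_formed(self):
        """Return True if every section has a title and every job has title and url."""
        for section in self.sections:
            if "title" not in section:
                return False
            for job in section.get("jobs", []):
                if "title" not in job or "url" not in job:
                    return False
        return True


day = Day("date object here",
          [{"title": "section title",
            "jobs": [{"title": "Google", "url": "http://www.google.com"}]}])
first_title = day.title

# Changing the date later is picked up, because nothing was frozen up front.
day.date = "another date"
second_title = day.title

broken = Day("date object here", [{"jobs": [{"title": "no url here"}]}])
```

Because title is a property, there is no need to compute the date into a local variable before building the rest of the structure.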
- Paddy.
Comment by Paddy3118 — Saturday, 15 September 2007 @ 01:23:15