December 26, 2007

The component system of my dreams

I’ve been working in Plone, so I’ve seen zope.component. I’m also thinking of making a (potentially) networked game in Python, and for that I was looking at things like Twisted and Kamaelia. Unsurprisingly, I’m not really satisfied with any of these.

Let me try to articulate what I’m looking for when I say “component system” (perhaps something more nebulous than “content management system”).

  • Obviously I want to divide my game into components that implement defined interfaces. For example, there’s a component that handles the network communication with players, another that handles streaming the game events to “observers” (people who watch but do not participate), a component to handle physics, a component to perform (for example) validation on incoming player commands, etc.

  • I want to define dependencies between components. These dependencies are then used to acquire a suitable implementation of the component’s interface.

    For example, when I start the game maybe I “start” the front-end component that handles communications with players; let’s call that the “player server.” Now, the player server generates events such as “player connected,” “player moved,” etc. Somewhere (in code, or maybe even in something like ZCML) I’ve defined that this implementation of the component needs a component implementing IPlayerEventConsumer. The component system then finds (using my configuration) an implementation for that interface and makes it available to the player server.

  • Assuming my components are written correctly, I want to be able to have a component execute in the same thread as other components (e.g. in the “main thread” as a microthread/coroutine), or in a new thread, or in a new process. For example, if I have lots of CPU-intensive physics code, maybe I want to run that on another processor, so I need threads. Of course, maybe I’m running on CPython, where I might need separate processes to bypass the GIL (ignoring for the moment the question of efficient IPC). Or maybe I’m working with something like a GUI where I need to have that run in a separate process (of course, doing a GUI can cause even bigger headaches).

  • I want the ability to implement components in other languages (see also: component executes in separate process, above). This means I want a standard protocol for communication between components. XML-RPC comes to mind, as does anything else that is easy enough to implement. There are things like pickling in Python, but I don’t know how much fun that would really be to implement in a non-Python language; maybe something more language-agnostic?

    For fun, I’ll add here that I might like to be able to communicate with components using a variety of methods: pipes, Unix sockets, TCP/IP, and so on.
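
The first two wishes (interfaces, plus dependencies resolved from configuration) can be sketched in a few lines of plain Python. This is only an illustration under assumptions: IPlayerEventConsumer is the interface named above, but its consume method, the Registry class, the requires attribute, and EventLog are all hypothetical names invented for the sketch, not part of zope.component or any other library.

```python
from abc import ABC, abstractmethod


class IPlayerEventConsumer(ABC):
    """Interface named in the text; the consume() method is an assumption."""

    @abstractmethod
    def consume(self, event):
        ...


class Registry:
    """Hypothetical registry that binds interfaces to implementations."""

    def __init__(self):
        self._bindings = {}

    def bind(self, interface, implementation):
        if not issubclass(implementation, interface):
            raise TypeError("implementation does not provide the interface")
        self._bindings[interface] = implementation

    def create(self, cls):
        # Resolve every declared dependency first, then construct cls.
        requires = getattr(cls, "requires", {})
        deps = {name: self.create(self._bindings[iface])
                for name, iface in requires.items()}
        return cls(**deps)


class EventLog(IPlayerEventConsumer):
    """Made-up consumer that simply records what it is given."""

    def __init__(self):
        self.events = []

    def consume(self, event):
        self.events.append(event)


class PlayerServer:
    # Declares what it needs; it never names a concrete implementation.
    requires = {"consumer": IPlayerEventConsumer}

    def __init__(self, consumer):
        self.consumer = consumer

    def player_connected(self, name):
        self.consumer.consume(("player connected", name))


registry = Registry()
registry.bind(IPlayerEventConsumer, EventLog)  # the "configuration" step
server = registry.create(PlayerServer)
server.player_connected("alice")
print(server.consumer.events)
```

Swapping EventLog for some other consumer only touches the bind call; PlayerServer itself stays unchanged, which is the property being asked for.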

Now, I think I’ve just described a billion other attempts at “component systems”: COM/DCOM, maybe EJB, CORBA, maybe KDE’s DCOP. Let me add a couple more requirements that should narrow the field a bit:

  • The “component system” must be platform-independent, or at least support platforms including Linux, *BSD, OS X, and Win32 (actually, I could give up Win32 if I had to).

  • I want the component system to be mostly transparent to me, the coder. I expect to have to configure the bindings between interfaces and implementations, specify dependencies, configure the manner in which a component will execute (microthread/thread/process; subject to the “execution style” the component is prepared to run in), and configure the location (e.g. host/port) of other components in the system. I don’t want to have to manually write up a proxy class for a remote component, for example. As much as is humanly possible, I don’t want my code to care whether a component is running in shared memory or on a box 1,000 miles away.
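
To make “no hand-written proxy classes” concrete, here is a minimal sketch of a generic proxy that forwards any method call through a transport. Everything in it is an assumption made for illustration: the Physics component, the JSON message shape, and the names Proxy and make_handler are made up, and the “transport” is just an in-process function standing in for a pipe or socket.

```python
import json


class Physics:
    """Stand-in component; the name and method are hypothetical."""

    def step(self, position, velocity, dt):
        return position + velocity * dt


def make_handler(component):
    """Return a bytes -> bytes function that executes one encoded request."""

    def handle(request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(component, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")

    return handle


class Proxy:
    """Forwards every method call through `transport` (bytes -> bytes)."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, name):
        # Only reached for attributes the proxy itself does not have,
        # i.e. the methods of the component on the other side.
        def call(*args):
            request = json.dumps({"method": name, "args": args})
            reply = self._transport(request.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]

        return call


local = Physics()
remote = Proxy(make_handler(Physics()))  # in-process transport for the demo

# The calling code is identical for both objects.
print(local.step(1.0, 2.0, 0.5))
print(remote.step(1.0, 2.0, 0.5))
```

Replacing the transport function with one that writes to a TCP socket or a pipe would move the component elsewhere without changing the caller. JSON is used here only because it is language-agnostic and in the standard library; XML-RPC would fit the same shape.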

COM/DCOM is basically going to be Windows, right? DCOP might not be, what with KDE 4 supposedly running on Windows (right?). EJB has notorious boilerplate (though it has been a while for me), and when I think CORBA I think IDL. Those aren’t “mostly transparent.”

Kamaelia looks interesting. The system of “wiring” components together feels right to me; I was first exposed to this in NesC. However, the implementation needs to be updated to support the new features of generators in Python 2.5, as the current syntax strikes me as rather ugly. In fact, it looks like Kamaelia needs a recent release, period: the last one I saw was from 2006.

(As a side note: everything should be easy_installable. Kamaelia and Twisted are not, though Twisted has ongoing work to this end.)

Kamaelia also fails to offer multi-process operation, as far as I can tell (you could write it yourself without too much pain), and it needs a way to generalize its “wires” to support communication with, e.g., remote components. You could actually combine the marshalling component with the framing component and TCP client/server components and make this work; but that might have shot straight past “configuration” into “programming.”
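
For readers who haven’t seen the “wiring” style, here is a rough sketch of the idea using nothing but plain generators. It is not Kamaelia’s API: the Component base class, the single inbox/outbox pair, and the run scheduler are all simplifications invented for this example.

```python
from collections import deque


class Component:
    """Hypothetical base class: one inbox, one outbox, a generator body."""

    def __init__(self):
        self.inbox = deque()
        self.outbox = deque()

    def main(self):
        raise NotImplementedError


class Producer(Component):
    def __init__(self, items):
        super().__init__()
        self.items = items

    def main(self):
        for item in self.items:
            self.outbox.append(item)
            yield  # give the other components a turn


class Upper(Component):
    def main(self):
        while True:
            while self.inbox:
                self.outbox.append(self.inbox.popleft().upper())
            yield


def run(components, wires, steps):
    """Round-robin scheduler; `wires` is a list of (source, sink) pairs."""
    tasks = [component.main() for component in components]
    for _ in range(steps):
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)
        # Deliver whatever each source produced to its sink.
        for source, sink in wires:
            while source.outbox:
                sink.inbox.append(source.outbox.popleft())


producer = Producer(["player connected", "player moved"])
upper = Upper()
run([producer, upper], wires=[(producer, upper)], steps=4)
print(list(upper.outbox))
```

Neither component knows about the other; only the wires list does. A wire that crosses a process or machine boundary would replace the deque-to-deque copy in run with marshalling plus a socket, which is exactly the generalization described above.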

Twisted is a much, much, much larger framework than I ever realized, and a lot of it seems pretty good. Nonetheless, the centerpiece of Twisted (if you believe the docs) still seems to be their “reactors” which require you to handle concurrency by defining things like Protocol subclasses that receive event messages (i.e. method calls) like connectionMade and dataReceived. This programming model might feel a little strange. They’ve got this neat looking inlineCallbacks decorator, which looks like it might lend itself to a coroutine kind of style. Then you start to realize that you’re not sure what you can use it for. I actually started writing something like:

from twisted.protocols.basic import LineReceiver

class HelloWorldProtocol(LineReceiver):
    def connectionMade(self):
        self.transport.write("Hi, who are you?\n")
        # Now I'll read their name:
        line = yield self.readLin  # ... hey, uh, there isn't a read method

I’ve seen several IRC logs where people try to figure out similar things. For what I’m doing, inlineCallbacks doesn’t seem like something I’m going to be able to use much.
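
The style being reached for in that snippet can be sketched without Twisted at all. The following is a hypothetical adapter, not anything Twisted provides: it borrows the dataReceived method name only for familiarity, buffers incoming bytes, and resumes a generator each time a complete line arrives. The names LineCoroutineProtocol and hello are made up.

```python
class LineCoroutineProtocol:
    """Hypothetical adapter: feeds complete lines into a generator."""

    def __init__(self, handler, write):
        self._buffer = b""
        self._gen = handler(write)
        next(self._gen)  # run the handler up to its first `yield`

    def dataReceived(self, data):
        self._buffer += data
        while b"\n" in self._buffer:
            line, self._buffer = self._buffer.split(b"\n", 1)
            try:
                self._gen.send(line.rstrip(b"\r").decode("utf-8"))
            except StopIteration:
                return  # handler finished; ignore any further input


def hello(write):
    write("Hi, who are you?\n")
    name = yield  # suspended here until a full line has arrived
    write("Hello, %s!\n" % name)


sent = []
protocol = LineCoroutineProtocol(hello, sent.append)
protocol.dataReceived(b"Al")        # partial line: nothing happens yet
protocol.dataReceived(b"ice\r\n")   # line complete: the handler resumes
print(sent)
```

With inlineCallbacks the equivalent would need some method that returns a Deferred firing with the next line, since what that decorator waits on is Deferreds; the sketch above sidesteps that by pushing lines into the generator directly.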

I’ve considered building something like Kamaelia’s style of wiring up components inside Twisted. Twisted has some kind of support for things like processes and threads. I haven’t determined if these really meet my needs, and I keep reading scary things about them being deprecated.

If you believe Twisted’s finger tutorial, they’ve also drunk the “Zope Component Architecture” Kool-Aid, though thankfully I didn’t see any ZCML (yet…). Look at the final product of that tutorial and notice all the interfaces and “adapters” flying around. I don’t really feel like I gain enough for that extra code.

The question I ignored earlier, the one of performance, is still outstanding in my mind: can this kind of system be done [in Python] efficiently? I’m afraid I’m dreaming of a “component system” that’s going to be so slow as to be impractical.

And, finally: do I really need these features?