Sunday, July 15, 2007

Reflections

Well, not a whole lot of coding happened this week. But I did see Tchaikovsky with fireworks Tues night. And the cops did show up at my door at 2am on Thurs, banged on it for a while, and then wouldn't tell me why when I asked. *sigh*. LAPD. But the fireworks were really nice, and I hope to get pictures from one of my co-workers. That'll be my screenshot for this week :)

Anyways, I finally realized that what I really needed to expose my objects' interfaces was reflection. At first I had the fuzzy idea that maybe I could just map ids -> function pointers, but once I refreshed myself on function pointers in C++, I realized it just can't happen, at least not directly. Because you need something like this:

map<string,funcPointer*>

Where funcPointer can point at whatever function you want. Like Python :) But C++ makes you declare the args along with your function pointer. At first I was like "Oh, there's probably a workaround". Usually C++ has some sort of way of letting you do what you want, if you really want to do it. I mean, just the keyword "friend" lets you violate encapsulation in all kinds of interesting ways. But then I ran across an erroneous statement somewhere along the lines of "function pointers aren't like normal pointers, they hold some extra information". My realization came in three stages:


  • Oh!, ok...

  • Wait, no, bollocks!

  • Hmm.. ok they're wrong, but I know what they mean, or at least what they repeated incorrectly

The "Oh!, ok..." speaks for itself. The "Wait, no, bollocks!" comes from the fact that I look at a lot of disassembly in my line of work, and I realized that no, a plain (non-member) function pointer is just an address, like any other pointer. A function call using a pointer is just going to look something like this:

call [address]

And that can even be broken into:

push eip+5
jump [address]

(the opcode is one byte and the address is four, hence the +5; this is on an x86).
Hmm. So what's this business about "A function pointer has extra info"? Well, what they should have said is a function pointer declaration has extra info, namely, what arguments to push on the stack. This information is used by the compiler, and has nothing to do with the run-time encoding of function pointers. For a compiled language, figuring out what to push and pop on the stack pretty much has to happen at compile time. Unless you want to drop into asm and self-modifying code. Sooo, yah, dynamically creating functions - ain't gonna happen. RTTI is _not_ going to somehow magically create:

push c
push b
push a
call [address]

Just because it likes you.
In a statically compiled language, the only way you could do that is to have your arguments on the heap, not the stack. Oh well.... Or not?
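
To make the compile-time point concrete, here's a minimal sketch. The functions add, mul and half and the typedef names are made up for illustration, and it assumes a C++11 compiler (for static_assert and map::at). The argument list is part of the pointer's type, so a map of ids -> function pointers only works when every entry shares one signature:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>

// Hypothetical example functions, not from the original project.
int add( int a, int b ) { return a + b; }
int mul( int a, int b ) { return a * b; }
double half( double x ) { return x / 2.0; }

// The argument list is part of the pointer's *type*.
typedef int (*BinaryIntFunc)( int, int );
typedef double (*UnaryDoubleFunc)( double );

// Different signatures are different types; this is checked by the compiler
// and leaves no trace in the pointer value at run time.
static_assert( !std::is_same<BinaryIntFunc, UnaryDoubleFunc>::value,
               "the signature lives in the type, not in the pointer value" );

// A map of ids -> function pointers is fine as long as every entry
// shares one signature.
std::map<std::string, BinaryIntFunc> makeTable()
{
    std::map<std::string, BinaryIntFunc> table;
    table["add"] = &add;
    table["mul"] = &mul;
    // table["half"] = &half;  // does not compile: wrong pointer type
    return table;
}

// Look up by name and call; the compiler already knows to pass two ints.
int callByName( const std::string& id, int a, int b )
{
    std::map<std::string, BinaryIntFunc> table = makeTable();
    return table.at( id )( a, b );
}
```

callByName( "add", 2, 3 ) gives 5, but only because the caller already knew to hand over two ints. Nothing at run time figures that out.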

Well, maybe there is a little wiggle room with that whole heap business. We could expose a function like:

invoke( string funcSpecification, void* args, void* reply )
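
Here's a rough sketch of where that signature could lead, not a finished design. Everything beyond the invoke signature itself is an assumption for illustration: the bool return type, the Handler typedef, the GetDataArgs bundle and the getDataHandler wrapper (it also uses std::to_string, so C++11). Because every handler takes the same two pointers, an ids -> function pointers map becomes legal C++:

```cpp
#include <cassert>
#include <map>
#include <string>

// One signature for everything: arguments and result travel through memory
// the caller owns, so the compiler only ever has to pass two pointers.
typedef bool (*Handler)( void* args, void* reply );

// Hypothetical argument bundle for a function taking ( string, int ).
struct GetDataArgs
{
    std::string key;
    int index;
};

// Hypothetical wrapper: unpacks the bundle and does the "real" work.
// Both sides must agree on what the void pointers really point at.
bool getDataHandler( void* args, void* reply )
{
    GetDataArgs* in = static_cast<GetDataArgs*>( args );
    std::string* out = static_cast<std::string*>( reply );
    *out = in->key + "#" + std::to_string( in->index );
    return true;
}

// The id -> function pointer table, built on first use.
std::map<std::string, Handler>& handlers()
{
    static std::map<std::string, Handler> table;
    if( table.empty() )
        table["getData"] = &getDataHandler;
    return table;
}

// Dispatch by name; returns false for an unknown function.
bool invoke( const std::string& funcSpecification, void* args, void* reply )
{
    std::map<std::string, Handler>::iterator it =
        handlers().find( funcSpecification );
    if( it == handlers().end() )
        return false;
    return it->second( args, reply );
}
```

The price is visible in the casts: the type checking the compiler used to do for free is now a convention between caller and handler.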

Of course, now we have the whole messy business of, badabum:


  • A means to specify and expose functions

  • A standard way to parse and create those "void* args, void* reply" arguments

In other words, the same questions as my last post, but further along ;) Or Progress! Yes, the nagging first item: you do something highly structured and formal like XML (WSDL), and it's just a bitch to specify everything, and send it across the network. But you have a nice standard parser, and a clear structure. You do something loose and custom, and you have wonky stuff like the IDL that just invents a new language, and not necessarily a good one. So I think I'm going to start off having the objects expose something wonky like:


vector<string> reflection;
reflection.push_back("getData( string, int )=myDataType");

to a library that gets loaded by the server. This library can load and generically manage any object that has the methods vector<string> getReflection() and invoke( string id, void* args, void* reply ), plus a little parsing logic... Clients will get something nicer and more structured from the server. Possibly even something that can be used to generate code. So not much different than the current setup. With one big difference: _one_ library can manage any object with the right interface, instead of having a different library wrap every object. Yuck.
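
A minimal sketch of what that could look like, under stated assumptions: the Reflective base class, the FunctionSpec struct, the DataSource object and the helpers trim, parseSpec and describe are all hypothetical names I'm using for illustration, and the "little parsing logic" here is about as forgiving as it gets.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface with the two methods described above.
class Reflective
{
public:
    virtual ~Reflective() {}
    virtual std::vector<std::string> getReflection() const = 0;
    virtual bool invoke( const std::string& id, void* args, void* reply ) = 0;
};

// Parsed form of a line such as "getData( string, int )=myDataType".
struct FunctionSpec
{
    std::string name;
    std::vector<std::string> argTypes;
    std::string returnType;
};

// Strip leading and trailing spaces.
std::string trim( const std::string& s )
{
    std::string::size_type first = s.find_first_not_of( ' ' );
    if( first == std::string::npos )
        return "";
    return s.substr( first, s.find_last_not_of( ' ' ) - first + 1 );
}

// Forgiving parser for "name( type, type )=returnType".
// Returns false when the parentheses or the '=' are missing or out of order.
bool parseSpec( const std::string& line, FunctionSpec& spec )
{
    const std::string::size_type npos = std::string::npos;
    std::string::size_type open = line.find( '(' );
    std::string::size_type close = line.find( ')' );
    std::string::size_type equals = line.rfind( '=' );
    if( open == npos || close == npos || equals == npos )
        return false;
    if( !( open < close && close < equals ) )
        return false;

    spec.name = trim( line.substr( 0, open ) );
    spec.returnType = trim( line.substr( equals + 1 ) );
    spec.argTypes.clear();

    // Split the text between the parentheses on commas.
    std::string args = line.substr( open + 1, close - open - 1 );
    std::string::size_type start = 0;
    while( start <= args.size() )
    {
        std::string::size_type comma = args.find( ',', start );
        if( comma == npos )
            comma = args.size();
        std::string type = trim( args.substr( start, comma - start ) );
        if( !type.empty() )
            spec.argTypes.push_back( type );
        start = comma + 1;
    }
    return true;
}

// Hypothetical object exposing one function through the interface.
class DataSource : public Reflective
{
public:
    std::vector<std::string> getReflection() const
    {
        std::vector<std::string> reflection;
        reflection.push_back( "getData( string, int )=myDataType" );
        return reflection;
    }
    bool invoke( const std::string& id, void*, void* )
    {
        return id == "getData"; // argument handling left out of this sketch
    }
};

// What the one generic library could do with any such object.
std::vector<FunctionSpec> describe( const Reflective& object )
{
    std::vector<FunctionSpec> specs;
    std::vector<std::string> lines = object.getReflection();
    for( std::vector<std::string>::size_type i = 0; i < lines.size(); ++i )
    {
        FunctionSpec spec;
        if( parseSpec( lines[i], spec ) )
            specs.push_back( spec );
    }
    return specs;
}
```

The parsed FunctionSpec list is the sort of thing the server could turn into the "nicer and more structured" description for clients.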

So what about that second point?


  • A standard way to parse and create those args,reply arguments

Why, my friends, that's just serialization! :)
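
As a rough illustration of what "just serialization" could mean for a ( string, int ) argument list, here's a sketch that packs both into one byte buffer and reads them back. The function names and the length-prefix layout are my assumptions for this example, not a settled wire format, and the ints are written in the host's native byte order (it also uses vector::data, so C++11).

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Append the raw bytes of an int (native byte order).
void packInt( std::vector<char>& buffer, int value )
{
    const char* bytes = reinterpret_cast<const char*>( &value );
    buffer.insert( buffer.end(), bytes, bytes + sizeof( value ) );
}

// Strings are variable length, so write the length first, then the characters.
void packString( std::vector<char>& buffer, const std::string& value )
{
    packInt( buffer, static_cast<int>( value.size() ) );
    buffer.insert( buffer.end(), value.begin(), value.end() );
}

// Read an int at 'offset' and advance it.
// Returns false if the buffer is too short.
bool unpackInt( const std::vector<char>& buffer, std::size_t& offset, int& value )
{
    if( buffer.size() < offset || buffer.size() - offset < sizeof( value ) )
        return false;
    std::memcpy( &value, buffer.data() + offset, sizeof( value ) );
    offset += sizeof( value );
    return true;
}

// Read a length-prefixed string at 'offset' and advance it.
bool unpackString( const std::vector<char>& buffer, std::size_t& offset,
                   std::string& value )
{
    int length = 0;
    if( !unpackInt( buffer, offset, length ) || length < 0 )
        return false;
    if( buffer.size() - offset < static_cast<std::size_t>( length ) )
        return false;
    value.assign( buffer.data() + offset, static_cast<std::size_t>( length ) );
    offset += static_cast<std::size_t>( length );
    return true;
}
```

The caller packs the arguments in the order the function specification lists them, and the receiving side unpacks in the same order.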


One final reflection: an interesting effect of coding late at night, after you get home from work, is that it leads to things like this:


template<class T>
bool serialize( T data, char*& state, unsigned int& size )
{
    if( typeid( T ).name() == typeid( string ).name() )
    {
        //go ahead and give it to'm
        cout << "String!" << endl;
        return serializeString( data, state, size );
    }
    if( sizeof(T) > INT_MAX )
    {
        cout << "Data type way too big" << endl;
        return false;
    }
    state = new char[sizeof(T)];
    size = sizeof(T);

    //this is kind of silly, but a byte isn't always 8 bits,
    //so we can't just say tmp & 0xFF below
    //I really can't believe I bothered to handle this case.
    //However, this loop _is_ worth it just to think of the C coders screaming bloody murder
    //and pounding their heads in the desk
    char mask = 0x0; //a char is always 1 byte though
    for( int i = 0; i < CHAR_BIT; ++i )
    {
        mask = (mask << 1) | 0x01;
    }
    //index needs to handle negative so we can count down, no unsigned int
    for( int i = sizeof(T) - 1; i >= 0; --i )
    {
        T tmp = data >> i * CHAR_BIT;
        state[i] = tmp & mask;
    }
    return true;
}



Then you come home the next night and realize that what you really meant was:


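#include <cstring> //memcpy

template<class T>
bool serialize( T data, char*& state, unsigned int& size )
{
    //a raw byte copy only makes sense for plain data (ints, simple structs),
    //so string still needs its own serializeString
    size = sizeof(T);
    state = new char[size]; //caller has to delete[] this
    memcpy( state, &data, size );
    return true;
}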

Nothing like a good night's sleep!
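
For what it's worth, here's a sketch of the memcpy version with a round trip, assuming a C++11 compiler. The deserialize function, the Sample struct and roundTrip are hypothetical additions for illustration, and the static_assert is my guard against handing it something like a string. One caveat: memcpy copies bytes in the host's native order, so the result is only portable between machines that agree on byte order and type sizes.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// memcpy-based serialize with a compile-time guard:
// a raw byte copy is only meaningful for trivially copyable types.
template<class T>
bool serialize( const T& data, char*& state, unsigned int& size )
{
    static_assert( std::is_trivially_copyable<T>::value,
                   "memcpy serialization needs a trivially copyable type" );
    size = static_cast<unsigned int>( sizeof( T ) );
    state = new char[size]; // caller must delete[] this
    std::memcpy( state, &data, size );
    return true;
}

// Hypothetical counterpart that rebuilds a value from the bytes.
// Returns false if the buffer is missing or the size does not match T.
template<class T>
bool deserialize( const char* state, unsigned int size, T& data )
{
    static_assert( std::is_trivially_copyable<T>::value,
                   "memcpy deserialization needs a trivially copyable type" );
    if( state == 0 || size != sizeof( T ) )
        return false;
    std::memcpy( &data, state, size );
    return true;
}

// Hypothetical plain struct used to try the round trip.
struct Sample
{
    int id;
    double value;
};

// Serialize a Sample, read it back, and compare the fields.
bool roundTrip()
{
    Sample in = { 7, 2.5 };
    char* state = 0;
    unsigned int size = 0;
    if( !serialize( in, state, size ) )
        return false;
    Sample out = { 0, 0.0 };
    bool ok = deserialize( state, size, out );
    delete[] state;
    return ok && out.id == in.id && out.value == in.value;
}
```

Trying to serialize a std::string with this version fails at compile time instead of quietly copying a pointer that means nothing on the other end.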

Cheers!



PS. Blogger sucks at trying to put code up. Try Ctrl-Shift-+.
