The ultrageeky Category

CTotW : Sporestruck

Tuesday, June 24th, 2008

I’m sitting here totally awestruck by Spore.  And I want to tell you all about it.

First, no, there’s no Sporn here.  Second, I don’t even have the Creature Creator yet, so I don’t have anything non-obscene to show, either.  I’ve got very little to say about tools I haven’t used (although I’ve watched YouTube vids of people making things).  I don’t want to talk about the DRM in Spore, either, except to say that I’m probably going to suck it up and buy it when it comes out.

No, today I saw something technically awesome.  Something that made me look at the glorious obviousness of it, and know that I was in the presence of greatness.  Something so simple that no one had thought of it before.  I haven’t even read about this, but maybe I haven’t been sucking on the Spore Hype teat long enough.  But I still saw it today, and I’m stunned.  In a word, sporestruck.

Crystaltips, aka Alice of the Wonderland blog, did some searching for her own name on Sporepedia. And someone had made a creature with that name.  That was cool, and I was having trouble viewing it, so I went to the Sporepedia myself to see what was up.  I hadn’t been there yet.

I went, and was met with this absolutely wonderful, amazing tip:

How do you see one of these Creatures in your Spore Creature Creator? Right Click on the thumbnail image of the creature. Save the image to your desktop. Drag the saved image from your desktop into the Spore Creature Creator. Voila! The Creature is in your game.

Okay, so I did that. What did it download? A .png file.  Not some bizarre .xml file.  Not a .spore (.spr?) file. No, just .png.  An image, just like it says.  Something I could put on my website.  Something I could print out, or set to my desktop.  Just that, an image.

Yet, I can drag that image into the Creature Creator and the creature is in my game. “Voila!” Indeed.

My first reaction was that they were using steganography to accomplish this.  That would be stupendously cool. Steganography is the study/practice of storing secret information inside of images. If you think of a computer graphic as an array of bytes, where each byte (or more) has color information for each pixel in the graphic, then you have more bits than you need to store graphical information (especially on a lower quality image, stored in a higher quality format).   You can then use the low-order (least significant) bits, the ones the eye won’t miss, to store extra information in some pre-agreed upon code.  That would mean the image would be all that’s required to pass on the full info of the creature.
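To make the idea concrete, here is a minimal sketch of least-significant-bit steganography over a plain array of pixel bytes. It is only an illustration of the technique, not a claim about what Spore actually does; the function names `embed` and `extract` and the sample data are made up for this example.

```javascript
// Hide each bit of `message` in the lowest bit of one pixel byte.
// Changing only that bit shifts a color value by at most 1.
function embed(pixels, message) {
  if (message.length * 8 > pixels.length) {
    throw new Error("image too small for message");
  }
  var out = Uint8Array.from(pixels); // copy, leave the original untouched
  for (var i = 0; i < message.length * 8; i++) {
    // Take bit i of the message, most significant bit of each byte first.
    var bit = (message[i >> 3] >> (7 - (i & 7))) & 1;
    out[i] = (out[i] & 0xFE) | bit;  // clear the low bit, then set it
  }
  return out;
}

// Read the low bit of each pixel byte back into message bytes.
// The reader has to know the message length in advance (the "pre-agreed code").
function extract(pixels, byteCount) {
  var message = new Uint8Array(byteCount);
  for (var i = 0; i < byteCount * 8; i++) {
    message[i >> 3] |= (pixels[i] & 1) << (7 - (i & 7));
  }
  return message;
}

// Hypothetical data: 16 identical gray-ish bytes and a two-byte secret ("Sp").
var pixels = new Uint8Array(16).fill(200);
var secret = Uint8Array.from([0x53, 0x70]);
var stego = embed(pixels, secret);
var recovered = extract(stego, secret.length);
```

Every byte in `stego` is either 200 or 201, so the picture looks the same, yet `recovered` holds the original two bytes.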

However, on reflection, I think they are using the filename as a key index to look things up in the Sporepedia.  That’s still pretty impressive, and allows me to share things, so long as I don’t change the name of the file.  I want to experiment some with it, as there’s a potential for exploitation with a filenaming convention.  The advantage here is that Spore and EA get to control what’s really available.  So, if I make a penis monster, and they don’t want to store it, then it’s gone from the shared universe.  And that’s why the steganography idea is so much cooler.

I’ll probably mess with this Wednesday night, and update how it’s working.

CTotW : First OSS credits

Wednesday, May 14th, 2008

I’ve used Open Source Software for a long time. As part of my position at work, I get to advocate and select solutions, and I’ve been moving us away from the IIS/ColdFusion proprietary licenses to Apache/PHP. It’s more cost saving all around, and for our application, we are actually more stable on Apache. (IIS required weekly reboots of the CF server, but it runs a bit more nicely under PHP). I’d like to get rid of the Windows license altogether, but that may not be an option in the long term. There is just too much .NET software out there I might have to support.

My client side is completely OSS, with Dojo Toolkit forming the basis of my web applications. Most of my stuff is driven by my PHP templating library, but I found that I wanted people to work with the data at their end, and produce the reports without needing to pass data back and forth. That means some sort of Javascript reporting library, which, unfortunately, wasn’t built into Dojo.

I found one called EJS, or Embedded JavaScript. I spent some time with it; its templates are pretty basic, and it doesn’t require a huge framework (after all, I’ve already got one of those). It’s technically part of JavaScriptMVC, which I’m not using so much (since my MVC mostly lives on the server itself). I got it mostly working, but then started having some odd problems with it.

JavaScript is a funny language, with functions as basic data types, and the ability to write and modify (and evaluate) its own code as part of the language. EJS does this for its templates, basically converting them into a function which it later calls, which means you can use JavaScript within the templates themselves. I seized on this to make recursive templates (which work well for recursive multiple group-by reports like the ones I was creating). My recursive templates came out really weird, with part of the output right, and part of it just … missing.

What sort of amazed me was how quickly I was able to find the problem (and that I correctly identified it as a scoping problem). It’s hard to explain to non-programmers, but JavaScript is fundamentally different than most languages, and learning it has been kind of mind-expanding for me. My first brush with it was sort of distasteful; it was a poorly named bastard stepchild of a language.

Today, it runs the web. Google Apps, Flash (via ActionScript), any sort of web interactivity, really, is done with JavaScript, so it’s a good language to learn. Both practical and mind-expanding ;) (Now I just need to learn LISP.)

Anyway, I fixed the problem, and got my reports working, and kept meaning to let the EJS folks know about it, and subscribed to the list to hear more about it as I worked. A day or two later, someone had the same problem, so I posted my patch. It was four characters long, just “var ” at the right place in the code, but it fixed the problem.
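For the curious, here is a minimal sketch of how a missing “var ” can eat part of a recursive report. This is not EJS’s actual code; `compile`, `tree`, and the tiny renderer are hypothetical, written only to reproduce the kind of scoping bug described above. Like a template engine, it builds a function from a string, and the only difference between the two versions is whether the loop counter is declared.

```javascript
// Build a recursive renderer from source text, the way a template
// engine might. `loopDecl` is "var i" (local counter) or "i"
// (undeclared, so every recursion level shares one global counter).
function compile(loopDecl) {
  var body =
    "var out = node.name;" +
    "for (" + loopDecl + " = 0; i < node.children.length; i++) {" +
    "  out += '(' + self(node.children[i], self) + ')';" +
    "}" +
    "return out;";
  // Functions made with `new Function` are non-strict, so assigning
  // to an undeclared `i` silently creates a global variable.
  return new Function("node", "self", body);
}

// Hypothetical report structure: a has children b and c, b has child d.
var tree = {
  name: "a",
  children: [
    { name: "b", children: [{ name: "d", children: [] }] },
    { name: "c", children: [] }
  ]
};

var broken = compile("i");     // missing "var "
var fixed = compile("var i");  // the four-character patch

var brokenOutput = broken(tree, broken); // "a(b(d))"     c is just … missing
var fixedOutput = fixed(tree, fixed);    // "a(b(d))(c)"
```

In the broken version, rendering b leaves the shared counter at 1; back in a’s loop it gets bumped to 2, the loop thinks it is finished, and c never gets rendered. With other tree shapes the same bug can instead repeat children or loop forever, which is why the symptoms look so random.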

They wrote me earlier this week, as they merged my “patch” into the primary JavaScriptMVC code, and they wanted to give me credit. That’s pretty awesome, and not why I did it, but there I am, a padawan and everything.

California bans/limits electronic voting machines

Tuesday, August 7th, 2007

As well they should.

Look, I’m a technologist. I like computers, they do a lot of great things. But as a general principle, software sucks. People don’t understand computer security, and most of the e-voting machines don’t fail elegantly to real-world options. (Like the obvious choice of printing a receipt which a voter looks at, confirms as his vote, and drops in a big bin. The receipt could be machine readable, to allow for easier counting, but would also be countable by regular people if required.) There is a great deal on the study at the California Secretary of State site, including all the reports of the various teams.

One of the good things about this review is that not only were the machines themselves reviewed, but the code itself (under strict NDAs) was looked at. Matt Blaze, one of the researchers, says (emphasis mine):

I was especially struck by the utter banality of most of the flaws we discovered. Exploitable vulnerabilities arose not so much from esoteric weaknesses that taxed our ingenuity, but rather from the garden-variety design and implementation blunders that plague any system not built with security as a central requirement. There was a pervasive lack of good security engineering across all three systems, and I’m at a loss to explain how any of them survived whatever process certified them as secure in the first place. Our hard work notwithstanding, unearthing exploitable deficiencies was surprisingly — and disturbingly — easy.

Um… “not built with security as a central requirement”? WTF Mate?

Look, I may be a nut, but I can’t think of much that should be more secure than our voting process. It is the single most important way we remain free. (And the irregularities in the past elections make me worry about the “remain free” part.) We have to be vigilant to protect our democracy, the Founders knew this, I wish sometimes that we could remember it ourselves.

Bruce Schneier, security guru (he knows Alice and Bob’s shared secret), writes about it as well. Evidently the teams were given only a few weeks, and not enough documentation or support to actually do a realistic security review of the machines. And still they discovered enough to have the machines lose certification. As Schneier says “the voting machines tested were so horribly bad that the reviewers found vulnerabilities despite a ridiculous schedule.” And that, my friends, is bad.

I know my code could never pass a review like this, but I’m not writing voting machine software. I’m not even handling money. I don’t even have anyone’s social security number. (And all the information I have is pretty much available under FOIA, anyway). Security for me is just not corrupting data, and I do pretty well with that (better than my predecessors, anyway). Still, it’s a shame that Diebold is here in Ohio, and is probably going to get a pass with whatever crap they have available.

Update: Speaking of Diebold, here’s a link about their crap. I’m in a meeting with the BoE today, about keys (you know the metal kind, not the encryption kind), I wonder if they understand that to screw up an election all you need is access to one machine (or one person to corrupt).

In-Secure Featuritis

Monday, June 11th, 2007

I wonder sometimes what was going through the brain of my predecessor here at work. I met her, worked for her for about a month, and when applying for her job meant taking instruction from her (while she worked somewhere else) as well as taking a deep pay cut, I said no thanks and walked away. Now I’m doing the job at close to what I was making before, and she was on retainer to me for a month, during which I called her twice.

Because reading code is the way to understand.

You understand, or learn, two things when you read someone’s code. The first, at least hopefully, you understand what the code does — and sometimes why. The second is perhaps the deepest insight into another’s thought processes that I’ve ever seen — in effect you learn how they think.

You can learn if someone can think by giving them a puzzle, something solvable say, or a broad design question, and see where they go with it. But if you want to see how they break a problem down, and piece it together step by step, have them write a program to do it.

I don’t know what to make of my predecessor’s thought processes.

It’s obvious she knew little about best coding practices, and was largely self-taught, with a poor teacher (mainly the example of the ColdFusion code we bought). Today I realized that one of the biggest problems with her code was being used to make another feature work.

Let me describe the problem (this is where it starts getting ultrageeky). The app in question is a web-based application, written in ColdFusion, for the management of “work tickets”. It’s essentially a process application: someone creates a work ticket, it gets assigned to a particular person who performs the job, notes what he or she did on the application, and marks their part done. This bumps it to their boss, who closes the work ticket, and it goes to historical work ticket land.

This is a pretty simple application, there’s a bunch of hard-to-edit supporting tables, tasks, results, assignments, and workgroups, and stuff, but the core is really just a queue-based process application.

There are two ways a ticket can be entered: by someone in an agency, who logs in with just the name of their agency (so we hope they enter real contact info), or by one of our employees. In the former case, it goes in an “incoming” queue for the appropriate workgroup, and is assigned to one of that workgroup’s members. The workgroup member enters some tasks, marks it complete and so on.

In the latter case, the workgroup member enters the ticket (which is appropriate for some cases where they see the problem and fix it, or they follow a set route every day, or something). After entering the ticket it is still technically unassigned, but they are sent to a screen that would allow them to add time. They do, and that action “assigns” the ticket to them. (Their boss could still edit the ticket, and assign someone else to it, but that rarely happens).

Now if I were designing this application, the workgroup members wouldn’t be able to see tickets that weren’t assigned to them (trying to open one up would result in an error), so I would assign tickets they entered to them automatically. As it was actually built, though, someone can assign themselves any ticket they want just by opening it up in a browser. This can be done by editing the URL (which isn’t necessarily a bad thing; it’s view data, so a GET type string is appropriate). No check is made to see if they can edit it or do anything with it, so they can add work to it, effectively assigning it to themselves.

Now, no one in their right mind is going to go around assigning themselves more work, but they could screw with each other, across groups, if they knew about this. It’s just stupid, and if my predecessor had used a decent security model she’d never have been able to do it this way.
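The missing check is not complicated. Here is a rough sketch of the kind of test I mean, run on the server before anything is shown or saved. All the names (`ticketOwner`, `canWorkOn`, `addTime`, the `isSupervisor` flag) are hypothetical, and letting the boss through is my assumption based on the fact that bosses can edit tickets.

```javascript
// Hypothetical data: which worker each ticket is assigned to.
var ticketOwner = { 101: "alice", 102: "bob" };

// May `user` open or add time to `ticketId`?
function canWorkOn(user, ticketId) {
  if (user.isSupervisor) return true;          // assumed: bosses can edit any ticket
  return ticketOwner[ticketId] === user.name;  // everyone else: only their own
}

// The ticket id arrives from the URL, so it has to be checked on
// every request, never trusted.
function addTime(user, ticketId, hours) {
  if (!canWorkOn(user, ticketId)) {
    throw new Error("Ticket " + ticketId + " is not assigned to " + user.name);
  }
  return { ticketId: ticketId, worker: user.name, hours: hours };
}

var alice = { name: "alice", isSupervisor: false };
var ownEntry = addTime(alice, 101, 2);  // fine, it's her ticket

var rejected = false;
try {
  addTime(alice, 102, 1);               // bob's ticket, reached by editing the URL
} catch (e) {
  rejected = true;
}
```

Editing the URL still gets you to the request, but the request goes nowhere unless the ticket is really yours.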

This is particularly bad because she decided to model “assignment” as being equivalent to “has done work on”. That is, it’s assigned to you if you have entered time applied against the work ticket. This is an ok definition for route 2, where the one who does the work is the one who enters the ticket (so the work is done already). But it’s a horrible definition for the other route: where the ticket comes from outside the organization.

Because it means that in order to implement the “assign” action, the application has to insert a dummy record in the table where applied time is — a 0 hours, no-task, completed in 1900 record which just “assigns” the task to the user.

The thing that really gets me is that this is a completely non-obvious way to implement this. There’s a table of workers, there’s a table of work requests, just create a table of assignments, which is just worker+work request. Easy.
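Roughly what I have in mind, sketched in JavaScript rather than as real tables (the names `assignmentTable`, `timeEntries`, `assign`, and `logTime` are made up for this example): assignments are their own relation keyed by worker plus work request, and time entries are separate rows that only exist once somebody has actually done something.

```javascript
// One entry per (worker, work request) pair: the assignment itself.
var assignmentTable = new Set();
// Zero or more rows per assignment: the work actually done.
var timeEntries = [];

function pairKey(worker, requestId) {
  return worker + "|" + requestId;
}

function assign(worker, requestId) {
  assignmentTable.add(pairKey(worker, requestId));
}

function isAssigned(worker, requestId) {
  return assignmentTable.has(pairKey(worker, requestId));
}

// Time can only be logged against an existing assignment.
function logTime(worker, requestId, task, hours) {
  if (!isAssigned(worker, requestId)) {
    throw new Error(worker + " is not assigned to request " + requestId);
  }
  timeEntries.push({ worker: worker, requestId: requestId, task: task, hours: hours });
}

// Assigning a ticket that came in from outside needs no dummy
// "0 hours, completed in 1900" record.
assign("alice", 7);
var rowsAfterAssign = timeEntries.length;  // 0

logTime("alice", 7, "replace filter", 1.5);
logTime("alice", 7, "test unit", 0.5);
```

After this runs there is one assignment and two time rows, and at no point did “assigned” have to be faked with a row that says no work happened.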

Based on the (incredibly poor) naming of the tables, I suspect that the table which stores the time worked/task done information was originally just the assignment table. But they repurposed it to contain all the other info, forcing this bizarre conflation. Tables should have one purpose, really. They should be one thing: sometimes that thing is a noun-like thing, “employees” or “work requests”, and sometimes it’s a relationship, “employees assigned to work requests”.
Once you change “employees assigned to work requests” to “employees assigned to work requests and the tasks they did on the work requests”, you’ve changed your key from (employee, work_request) to (employee, work_request) + some other stuff.

They told me that they used to be able to enter only one task per work request. The key in that table is almost at the end (but not quite) (and my predecessor always added fields to the end of a row). So I bet this is why it’s like it is.

I just don’t fathom why anyone would make the decisions that way. It’s crazy to me.

And, to be honest, this is just one of the things (and perhaps a minor one) that is wrong with this application.

Site Notes

Wednesday, May 30th, 2007

The site should be a bit more responsive for a while. I realized some things about why it was having problems (including the couple-hour outage yesterday afternoon).

I use Squirrelmail, which in turn uses IMAPD to do email based things. While I’ve been at home, I’ve been watching top run, checking the processes, etc, and I see that the biggest processes, using the most resources, are imapd, spamd, and clamscan. The apache process will peg things, too, but mainly when it’s doing email. Spamd and clamscan are important (and new) since they filter out viruses and spam, but they also contend with the mailbox files, so they hold up (and are in turn, held up by) the imapd processes.

Yesterday, when I got home, I discovered I’d left my web-based email open. Now, Squirrelmail will try to refresh the “folders” listing every so often (I had it set at 5 minutes, which was a concession I made about a year ago. I’m only a little obsessive about mail.) I also had it set to tell me the number of unread mails in each of my folders — some of which have thousands of unread mails (and a few read ones). If I read mail at work, then both of them — the work and home one — compete for resources that Sarah just doesn’t have. The past three times she’s completely locked up, I’ve gotten home and voila– there’s squirrelmail open in my Mozilla at home.

Yes, I’m the one killing SarahBellum. Bad me.

I used to care. I used to read a lot of the groups I’m subscribed to, but Gmail has changed that a lot: I read my inbox and filter most stuff out, and a lot of the groups I actually read go to Gmail. And more of my internet-time is spent writing and reading blogs, instead of dealing with mail groups. So I turned off the folders option, and at the same time I also kicked it back to a twenty-minute timer. I can refresh whenever I want to, and the folders are bolded if they have unread messages (for the 2 or 3 folders I actually read).

This should help out with everyone else using the site.

So in normal English: I’m not asking Sarah to do as much work, so now she has more time for everybody else. And since keeping a blog is about “everybody else” seeing and responding to what I write, that’s pretty important to me.