Monthly Archives: January 2007

Why Java Server Faces gets it wrong

Several years ago, Sun came up with a technology called Enterprise Java Beans (EJB). Because of their marketing clout (they own the Java language), for a while you could hardly get a job as a Java programmer without EJB on your resume. I know; I was job hunting at the time. The problem was, rather than making your life easier, EJB made you describe, in triplicate, the data you were trying to pass around. It took someone writing a book about how EJB is a bloated mess for developers to come to their senses and revolt.

A programming framework is supposed to make your life as a programmer easier. If you find it’s making your life harder, you shouldn’t use it. In fact, it should be more than a little helpful, since it must overcome the cost of learning the framework, installing it, and staying on top of its bugs and security holes.

Perhaps the best rule of thumb is the DRY principle: Don’t Repeat Yourself. EJB was a terrible violation of DRY. Not only did you have to write three classes to define one “bean” of data, you also had to run an extra program to generate additional classes and description files.

About the same time that EJB was all the rage, I learned about a new framework called Struts from a job interview. Struts is supposed to make web forms easier to write, and it has a lot going for it. These days, Sun has cloned Struts with something called Java Server Faces, or JSF. Struts got it wrong, although it eventually started to improve. JSF is just as wrong.

Both frameworks handle data entered from web forms, keep track of data validation problems, and, perhaps most important, refill the form so that every little error doesn’t force the user to re-enter everything. This sort of stuff is a real pain to program, but very important. It’s not uncommon to have a form online that asks for your name, address, email address, phone number, and a zillion other details, all of which must be valid before you can proceed. If every time you forget to put in a required field you have to re-enter everything, you’re likely to give up pretty quickly.

My honeymoon with Struts ended when I realized that, to keep track of what was happening on my web pages, I had to keep track of Struts, JSP (the Java technology Struts lives on), Servlets (the technology underlying JSP), Tomcat (the program which runs the servlets), and all of the HTML/HTTP stuff that comprises the web. And a bug on your web page could be caused by any of these, or by a weird interaction between any of them.

Struts violates the DRY principle. You start out with a web form. You need a Java class (a “bean”) which describes the data in the form. If the bean isn’t the same as the Java class you ultimately want to populate (and it usually isn’t), you’ve just repeated yourself three times: the form, the bean, and the target class. And on top of that, you need to write an XML file which maps the name of the form bean’s Java class to the name you use for the bean on your web pages.

The thing you want to do from a DRY perspective is to write one Java class, and have a little JSP tag that says “here’s my data, generate a form.” In reality, that’s usually not what you want, since your Java class may have lots of internal-use-only fields that you don’t want just anyone to modify. And a particular web page often has a particular look that an auto-generated form wouldn’t match. So it is necessary to repeat yourself, if only to mention which fields go where on the web page.

I ultimately gave up on Struts because VocaLabs does a lot of surveys. Every survey has a different set of questions, so you really do need the entire form to be auto-generated. Struts ultimately introduced something called a DynaActionForm, which allows you to define your form data in the XML file, rather than writing custom code. Even so, the fields are fixed, so it wouldn’t work for surveys. As far as I know Java Server Faces still doesn’t have this feature.

So today I decided to give Java Server Faces a look, since I’m working on something similar to the problem JSF is supposed to solve. Earlier this month I finished writing an internal utility which allows us to edit all the data in our database through web forms. It was surprisingly easy, considering we have 72 tables spread across two databases. The utility lets us avoid editing the database directly, thus enforcing the rules that are defined in our Java classes. And every field in the web form is cross-referenced against documentation gleaned from our source code, so when our system administrator (or, worse, our CEO) pokes at things, he can see the trouble he’s getting into.

Today I’m writing a web form so that our clients can change their account information. I can’t just give them access to our internal utility, but I’d like to leverage the work I did on it. JSF, as I mentioned, is completely the wrong tool. And I realized what the right tool is.

The right approach

To fill in a web form, Java already has some conversion routines to, for example, turn the string “123” into an integer. What I’m working on builds on those conversion routines. To put it simply, I work on a field-by-field basis, not a form-by-form basis. Each conversion class generates the HTML for its part of the form, and knows how to convert that back into the appropriate data type. The conversion classes are easy to write, so that special cases are easy to handle. When I write the code to generate the web page, I tell it where the data comes from, and it does all the rest. Rather than having special “beans” to remember what the web form looks like, when the form is submitted it works straight from the HTTP request.
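Something like the following sketch, in plain Java. The class and method names here are made up for illustration; this isn’t the actual VocaLabs code:

```java
import java.util.Map;

// A toy version of the field-by-field approach: each converter renders
// its own chunk of HTML and parses the submitted string back into a
// typed value, so a form is just the sum of its fields.
public class FieldForm {
    interface FieldConverter<T> {
        String toHtml(String name, T value); // this field's part of the form
        T fromRequest(String rawValue);      // parse the HTTP parameter back
    }

    static class IntConverter implements FieldConverter<Integer> {
        public String toHtml(String name, Integer value) {
            return "<input type='text' name='" + name + "' value='" + value + "'>";
        }
        public Integer fromRequest(String rawValue) {
            // Lean on Java's built-in conversion routines, as described above.
            return Integer.parseInt(rawValue.trim());
        }
    }

    // Generate a whole form one field at a time; no form-level "bean" needed.
    static String renderForm(Map<String, Integer> data, FieldConverter<Integer> conv) {
        StringBuilder sb = new StringBuilder("<form>");
        for (Map.Entry<String, Integer> e : data.entrySet())
            sb.append(conv.toHtml(e.getKey(), e.getValue()));
        return sb.append("</form>").toString();
    }
}
```

When the form comes back, each converter reads its own field straight from the request parameters, so there is no separate form bean to keep in sync with anything.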

Remarkably, writing my own custom framework that does the right thing should take less time than reading the documentation for JSF.

Romance vs. Everything You Need to Know About Artificial Intelligence

This is a follow-up to my last post.  In it, I was talking about how fascinating Artificial Life is.  Genetic algorithms (trying to do something useful with Artificial Life) and neural networks may be among the most overused artificial intelligence (AI) algorithms for the simple reason that they’re romantic.  The former conjures up the notion that we are recreating life itself, evolving in the computer from primordial soup to something incredibly advanced.  The latter suggests an artificial brain, something that might think just like we do.

When you’re thinking about AI from a purely recreational standpoint, that’s just fine.  (Indeed, I have occasionally been accused of ruining the fun of a certain board game by pointing out that it is not about building settlements and cities, but simply an exercise in efficient resource allocation.)

But lest you get seduced by one claim or another, here are the three things you need to know about artificial intelligence.

First, knowledge is a simple function of alleged facts and certainty (or uncertainty) about those facts.  Thus, for any circumstance the right decision (from a mathematical standpoint) is a straightforward calculation based on the facts.  This is simply the union of Aristotle’s logic with probability.  Elephants are grey.  Ruth is an elephant.  Therefore Ruth is grey.  Or, with uncertainty: elephants are grey 80% of the time.  Ruth is probably (90%) an elephant.  Therefore there is a (0.80 × 0.90 =) 72% chance that Ruth is grey.  If you have your facts straight and know the confidence in them (something you can learn from a statistics class), you can do as well as any intelligence, natural or artificial.
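As a sanity check, the arithmetic is nothing more than multiplying the two confidences together. (This is deliberately simplified, the way the example above is: it ignores the chance that a non-elephant Ruth happens to be grey anyway.)

```java
// The Ruth-the-elephant calculation: chain the confidence in a rule
// with the confidence in its premise by multiplying them.
public class Chaining {
    static double chance(double pFactGivenPremise, double pPremise) {
        return pFactGivenPremise * pPremise;
    }

    public static void main(String[] args) {
        // Elephants are grey 80% of the time; Ruth is 90% likely an elephant.
        System.out.println(chance(0.80, 0.90)); // approximately 0.72
    }
}
```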

Think about this for a moment.  For just about every situation, one of three cases applies.  In the first case, there isn’t enough data to be sure of anything, so any opinion is a guess.  How popular a movie will be when almost nobody has seen it falls into this category.  Second, there is enough data that one possibility is likely, but not certain.  (The same movie, once it has opened to a small audience and opinions are mixed.)  And third, the evidence is overwhelming in one direction.  But in none of these cases will a super smart computer (or human analyst) be able to do better on average than anyone else doing the same calculations with the same data.  Yet we tend to treat prognosticators with lucky guesses as extra smart.

Which leads us to the second thing you need to know about AI:  computers are almost never smarter than expert, motivated humans.  They may be able to sort through more facts more quickly, but humans are exceptionally good at recognizing patterns.  In my experience, a well-researched human opinion beats a computer every time.  In fact, I’ve never seen a computer do better than a motivated idiot.  What computers excel at is giving a vast quantity of mediocre opinions.  Think Google.  It’s impressive because it has a decent answer for nearly every query, not because it has the best answer for any query.  And it does as well as it does because it piggybacks on informal indexes compiled by humans.

And the third and final thing you need to know about AI is that every AI algorithm is, at one level, identical to every other.  Genetic algorithms and neural networks may seem completely different, but they fit into the same mathematical framework.  No AI algorithm is inherently better than any other; they all approach the same problem with a slightly different bias.  And that bias is what determines how good a fit it is for a particular problem.

Think about a mapping program, such as MapQuest.  You give it two addresses, and it tells you how to get from your house to the local drug store.  Internally, it has a graph of roads (edges, in graph terminology) and intersections (vertices).  Each section of road has a number attached to it; the maps I get from AAA have the same numbers: minutes of typical travel time.  MapQuest finds the route where the sum of those numbers (the total travel time) is minimized.  In AI terminology, the numbered graph is the search space, and every AI problem can be reduced to finding a minimal path in a search space.
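That route search is easy to make concrete.  Here is a bare-bones version (my own illustration, certainly not MapQuest’s actual code) using Dijkstra’s algorithm, the classic way to find a minimum-cost path through such a graph:

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Roads are weighted edges, intersections are numbered vertices, and
// Dijkstra's algorithm finds the path with the least total travel time.
public class RoutePlanner {
    // edges[i] lists pairs {neighbor, minutes} leaving intersection i.
    static int shortestMinutes(int[][][] edges, int from, int to) {
        int[] best = new int[edges.length];
        Arrays.fill(best, Integer.MAX_VALUE);
        best[from] = 0;
        PriorityQueue<int[]> queue = new PriorityQueue<>((a, b) -> a[1] - b[1]);
        queue.add(new int[]{from, 0});
        while (!queue.isEmpty()) {
            int[] cur = queue.poll();
            int node = cur[0], dist = cur[1];
            if (dist > best[node]) continue; // stale queue entry, skip it
            for (int[] edge : edges[node]) {
                int next = edge[0], cost = dist + edge[1];
                if (cost < best[next]) {
                    best[next] = cost;       // found a faster way to next
                    queue.add(new int[]{next, cost});
                }
            }
        }
        return best[to];
    }
}
```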

What makes AI interesting is that the search space is often much too large to be searched completely, so the goal is to find not the shortest path, but a path which is short enough.  Sometimes the path isn’t as important as finding a good destination, for example, finding the closest drug store.

In the case of artificial life, each “creature” is a point in the search space.  Consider Evolve, the program I wrote about the other day.  In it, each creature is defined by a computer program.  The search space is the set of all possible programs in that programming language– an infinite set.  And any transformation from one program to another– another infinite set– is an edge.  By defining mutation and reproduction rules, a limited number of edges are allowed to be traversed.

So, to summarize:  certain AI algorithms sound romantic, but they are all essentially the same.  And humans are smarter than computers.

Artificial life

For some reason, yesterday I was obsessed with artificial life. It started out on Friday night when I was trying to sleep but found myself thinking about things that are entertaining but don’t help me sleep. (This happens to me a lot. In one particularly insomnia-provoking bout a few years ago, I figured out how long division works as a way of proving that 0.9999999… equals one, and discovered that the same is true for the highest digit in any base, not just base ten.)
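For the record, the generalization checks out: in base b, a repeating top digit of b − 1 is a geometric series that sums to exactly one (base ten, with b = 10 and digit 9, is just one instance):

```latex
0.\overline{(b-1)}_{\,b}
  \;=\; \sum_{k=1}^{\infty} \frac{b-1}{b^{k}}
  \;=\; (b-1)\cdot\frac{1/b}{1-1/b}
  \;=\; \frac{b-1}{b-1}
  \;=\; 1
```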

In this case, in the morning I searched online to find something interesting enough that I wouldn’t have to write my own to satisfy my curiosity. Right now I barely have enough time to read about other people’s software, let alone write my own. Unfortunately the field is so diverse that I just might have to write my own someday. Not because the world needs more artificial life, but just because.

Artificial life consists of three things: a genome; a method for mutation and “natural” selection; and a universe, which provides the domain where the life forms reside, as well as the physical laws of that universe. (The body of an artificial creature can be thought of as the union of the genome with the universe.) All three of these things are incredibly broad in their possibilities.

At one extreme is Framsticks, a research project (as well as shareware and other stuff) which models systems that could be physically possible in our world. Think of it as a construction kit, not unlike an Erector Set or Lego Mindstorms, that exists entirely in software. There are sticks, muscles, neurons, and sensors, all of which are modeled on real-world things. For the genome, they have several options. One is a low-level language which simply describes body parts and the connections between them, much like a blueprint or circuit diagram. The rest are all intended to be more realistic or interesting genomes. They wrote a paper which describes in engrossing detail something I regard as completely obvious (and which makes artificial life compelling to me): the genome you choose has a profound impact on the organisms you get. In the low-level Framsticks language, a simple mutation (changing one letter in the genome) is likely to result in a useless change. Languages which have symbols for repetition or recursion allow simple mutations to yield symmetrical bodies or repeating patterns.

On the topic of natural selection, the defining quality of artificial life is that there is mutation (and optionally gene sharing, a.k.a. sexual reproduction) and culling of the less fit. In some cases it’s more like breeding, where the computer chooses the most fit organisms each generation according to an explicit “fitness function”: for example, choosing the fastest or the tallest organisms. The other option is to build reproduction into the organisms and limit the resources so that some of them die off (or worse, none die and your computer gets swamped calculating the actions of random, unevolved, and therefore uninteresting, critters). The former case has practical applications (a genetic algorithm has yielded a patentable circuit design), but the latter strikes me as more entertaining. Framsticks supports both modes.
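The “breeding” mode is simple enough to sketch in a few lines of Java. This toy is my own, not Framsticks code: genomes are arrays of numbers, the fitness function (match a target sum) is an arbitrary made-up example, and each generation the champion’s mutated child either replaces it or dies off.

```java
import java.util.Random;

// Breeding with an explicit fitness function: score each genome, keep
// the fitter one, mutate, repeat.
public class Breeder {
    static double fitness(double[] genome, double target) {
        double sum = 0;
        for (double gene : genome) sum += gene;
        return -Math.abs(sum - target); // closer to zero is fitter
    }

    static double[] evolve(int generations, double target, long seed) {
        Random rng = new Random(seed);
        double[] best = new double[4];  // start with an all-zero genome
        for (int gen = 0; gen < generations; gen++) {
            double[] child = best.clone();
            child[rng.nextInt(child.length)] += rng.nextGaussian(); // one mutation
            if (fitness(child, target) > fitness(best, target))
                best = child; // culling: the less fit of the two dies off
        }
        return best;
    }
}
```

The other mode, where reproduction and death are built into the organisms themselves, has no explicit fitness function at all; fitness is whatever happens to survive.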

Even though Framsticks is a complete toolkit which includes great eye candy and enough complexity to keep you busy for years, I’m somewhat more fascinated by a little program called Evolve. It is in every way the opposite of Framsticks. It’s a one-person project. The genome is computer code, written in a simplified version of Forth. So it’s not unlike what I write for a living. The universe is a grid, like a chess board. (The technical term for this sort of grid universe is a cellular automaton, the most famous of which is Conway’s Game of Life.)

What makes Evolve so fascinating is that it’s so simple and comprehensible to a programmer like me.  Its author has learned a lot about what sorts of languages and mutations yield interesting results.  Are these things modelling plants or animals or what?  Do they have brains, or is it all just DNA?  In this case, it’s the latter, and the DNA contains commands like “MOVE”, “EAT”, “GROW” and “MAKE-SPORE”.  It is far more like a computer program than a life form, and yet it is all the more compelling because it is so unlike earthly life.
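To show the flavor of the thing, here is a toy in the same spirit, though the commands and rules are my own invention, not Evolve’s actual language: the genome is just a list of commands, and running it moves a creature around and spends its energy.

```java
// A genome-as-program toy (not Evolve itself): each "gene" is a command,
// and executing the genome moves the creature and spends its energy.
public class ToyCreature {
    int x = 0, y = 0, energy = 10;

    void run(String[] genome) {
        for (String cmd : genome) {
            if (energy <= 0) break;            // dead creatures stop running
            switch (cmd) {
                case "MOVE-RIGHT": x++; energy--; break;
                case "MOVE-UP":    y++; energy--; break;
                case "EAT":        energy += 3;  break;
                case "GROW":       energy -= 2;  break;
                default: break;                  // unknown genes do nothing
            }
        }
    }
}
```

In a scheme like this, mutation is just swapping, inserting, or deleting commands in the list, which is why a program-shaped genome is so natural for a programmer to tinker with.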

Finally, this isn’t artificial life, but it’s the inspiration for a lot of artificial life sims: Core Wars.  First described in a popular Scientific American article, Core Wars is a battle to the death between two computer programs in a simulated machine.  The assembly language for the Core Wars computer (called Redcode) is simple enough that the article suggests running it as a pen-and-paper game.  (Or you could write to the author to get the computer code; this was before the Internet.)

Until yesterday, I’d heard of Core Wars but never looked into it, since I tend to think of assembly language as difficult.  Which it is, if you’re trying to write something complicated like a web server.  But for this sort of thing, it’s simple and approachable– not to mention the best language for doing the sort of mischief that other languages try to keep you from doing.  Self-modifying code in particular, which is the core of the game:  modify your opponent before it modifies you.

The worst product idea in a very long time

I read a story in the paper yesterday about a product called Nicogel. It is a hand lotion substitute for the nicotine patch. Only it’s available without a prescription, because it’s not a drug. It contains tobacco extract, rather than purified nicotine, so it is presumably categorized as an herbal supplement. Or it would be if it were a food. As it is, it appears to be completely unregulated. And it contains a highly addictive substance. And it’s in a form that kids love.

I can just imagine lots of 12-year-olds trying to get high on hand lotion, only to find themselves addicted.  And perhaps getting skin cancer.

Computerized philosophers

A week ago they had a researcher on the radio talking about computational morality. I don’t remember if that’s what they called it, but that’s the gist of it: using logic to solve moral dilemmas. The example they gave was this: a train is going out of control down a track toward a group of people. You are in the position to flip the switch to make it go down another track, where there is only one person. If you act, one person will die. If you do nothing, many people will die. When posed with this question, most people will say they would flip the switch.

However, if you change the question just slightly, people’s answers change. Instead of a switch, there is a very large person standing in front of you. If you push him onto the track, his body will stop the train before it hits the other people, but he will die.

The computer sees these as morally equivalent. In both cases, it is a choice between the death of one person or the death of several people. Most people, on the other hand, would not actively kill one person to save the lives of several.

The researchers went on to talk about how computers could help to make people more rational by pointing out these situations. Now, my professional life is filled with statistics and probability, and I spend a lot of time arguing for rationality. But when it comes to morality, this is a case of garbage in equals garbage out.

Humans are exceptionally well-adapted to their environment, especially when it comes to surviving in an emergency situation. So are rabbits, squirrels, and even squid. Millions of years of survival of the fittest tends to hone a brain pretty well. And when it comes to social creatures, the calculus extends to saving the lives of others.

The logic of the computer is pretty simple. Saving one life versus saving many. It’s so easy, an ant could do it with a single neuron. So why the different answers?

It all comes down to certainty. The question is posed as a case of two certain outcomes. But in life there is no certainty. The certainty itself puts the question into the realm of the hypothetical. Brains are optimized for real world use, not hypothetical situations.

The history of artificial intelligence is riddled with robots which could plot a perfect path from point A to point B, but couldn’t tell a shadow from an obstacle. Or which became perplexed when a sensor malfunctioned. Even in the most controlled conditions, it’s hard to completely eliminate ambiguity and its buddy uncertainty.

In a real world train situation, there’s no guarantee that the people standing in the train tracks would die. They might manage to get out of the way, or they might only be injured. When you act to cause harm, there is a greater chance that you will be successful than when it is merely a side effect of your action.
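To put numbers on that (purely invented ones, just to show the shape of the calculation): once each outcome gets a probability, the comparison is between expected deaths, not certain ones, and the two versions of the dilemma stop being equivalent.

```java
// Expected deaths for each choice, with made-up probabilities.  The
// point is only that uncertainty re-enters the comparison, and that
// deliberately caused harm is more certain than incidental harm.
public class Trolley {
    static double expectedDeaths(int people, double pEachDies) {
        return people * pEachDies;
    }

    public static void main(String[] args) {
        // Do nothing: several people in the tracks, each of whom might escape.
        double doNothing = expectedDeaths(5, 0.5);
        // Push the large man: when you act to cause harm, death is near-certain.
        double push = expectedDeaths(1, 0.95);
        System.out.println(doNothing + " vs. " + push);
    }
}
```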

Rain on New Year’s Eve

It was raining all day on New Year’s Eve.  That’s not supposed to happen in Minnesota!  When the sun went down (before 5:00), it turned into snow and we had a nice blizzard.  Even so, if I wanted rain on December 31, I would live in Seattle.