Sunday 28 October 2007

The Journey - Java - Yahtzee - Darwin

Well I decided to create my own blog. The title is "The Journey - Java - Yahtzee - Darwin", but I could not think of anything useful to put in in it's place.

The root of this is that I wanted to learn some of the basics of programming in Java. The problem with this is it seems that every who knows Java has their head up their a** and will confuse the hell out of you with excessively long titles (inheritance, polymorphism). If you ask .. what is polymorphism .. the reply is akin to some teacher looking down at the pupil and saying "you can look these things up.. that question is beneath my intellectual powers". I actually NOW translate the above to mean. "This is a fundamental question of which I should know the answer, but I either don't know or am too limited in my powers of expression to give to you".

That said I decided to write a game of Yahtzee in Java (there are loads of sites that explain Yahtzee, and there are some handheld electronic games. These games should come with an addiction warning.. I've worn one down already). Basically Yahtzee is a simple game, relatively easy to understand. The tactics to play are not so easy. You can play defensively and get an average score and throw away some opportunities, or be optimistic, which occasionally pays off (My top score is 620, the average is in the region of 200-250).

Objectives: Write Yahtzee program in Java. Learn Java in the process.

Simple objectives... The big hurdle is the learning java bit. But actually this was not as hard as I thought. I am experienced in writing programs (I started life as a FORTRAN developer, then used c shell, awk, bourne shell, and perl) All of these were process based and Java is object oriented (this may not make lots of sense, but perl, c shell etc is based on writing modules that "do" things or process data. Java is based on storing objects or data. In reality both need to do both, it is just where the focus is.) From my discussions with various people it should have been a totally different experience. My attitude is that learning Java can't be too hard (or not as complicated as Java aficionados make out) as people do use it, even 12 year old children write Java. (That said 12 year old children also use mobile phones features that I can only dream of of.. hey ho). In essence I bought a book, Learn Java in 60 Minutes... I practiced some of the exercises and bingo I can now do a bit of Java (I would not say I'm the best programmer, but I can usually do things in it).

The Yahtzee program was really much easier than I thought. Because of the modular nature of Java programs I simply broke down the process into natural chunks. Getting a module to throw a dice. Getting a module to hold the game state or position (score, bonuses, what was free, how many turns were left etc). Outputting the result. To ensure that I had not introduced any bugs I was able to use JUnit tests. (J stands for Java, Unit is a type of test that focuses on a single feature or function). I really wish I had a unit testing framework when I was developing. Back then (wheels weren't round, hills were young, you get the idea) Unit testing was not considered to be useful. You merely said that the program worked and that was enough. You did unit testing but hid it from your colleagues as "irrelevant" data. As such it was very much a hit and miss affair as to how good your code was, or even if you had broken someone else's function. It is very easy to edit someone else's code, and make it do what you think it should do, and in the process break someone else's functions. This is why you should write lots of unit tests, and they should be run regularly and maintained as part of the code. This way, if you do break someone else's functions, then you at least know what you've done. ...Upshot code written and works and continues to do so

Success: Yahtzee code written, Learn some Java (enough to write a Yahtzee program)

Result: Now I had the program, I was curious if the computer could play it? If so, it would play really fast and get really high scores... loads of Yahtzees...

Objective2: Write a module that will play Yahtzee very quickly.

I think this is very Douglass Adams (or Douglass Adams might find it amusing enough to comment/write about it). Can I write a program such that I am no longer required. It will do everything I can do, far quicker. I will become redundant... Is this a good or bad thing?

So writing the module that will play Yahtzee was interesting. Basically, it does not need to be very intelligent, because the computer I was using was able to play about 150-200 games per second what you lost in intelligence you can make up for in sheer number of games. That said, you do need to write something with some intelligence such that you have some chance of it giving genuine.

Attempt1: "Max score" was my first algorithm. It was simplicity itself. Basically
  1. look at the dice & score it against all the different options
  2. Hold the highest scoring combination
  3. Use all 3 attempts to improve score.
  4. The logic was the same no matter if it was the first throw or the last throw.
As an example if attempt 1 was to throw 4,4,1,3,6, you would score 8 against 4s, 1 3 & 6 against 1s, 3s, and 6s respectively, zero against everything else. So hold 4 & 4. This may yield another 4 (12 against 4s) or it may yield 12344, which is a small straight. If so the system would hold 1234 & un-hold the last 4. The last throw would essentially mean you could get either a small straight or a large straight. This seemed to be a good algorithm, except when you realise that if you roll 1,1,3,4,6, it will go for 6s and not 1s. Time to improve the system...

Now for some stats on the system. Because the logic was so simple it could play many games very quickly (263 games per second, with logging turned off). I managed 9,000,000 games in under 10 hours (run over night). My best scores were 835 with 6 Yahtzees. The average score was 207, with a top 1% of 445 and bottom 1% 113. It also managed an average of 0.39 Yahtzees per game. Not too bad. I recon my average manual score is about 200, and I certainly don't get any where near 0.39 Yahtzees per game (admittedly this is measured over millions of games). The performance for the game logic vs game response was that 51% of the time was spent doing the response, and 49% was game logic.

Attempt2: Count good Dice. Well the above was a good start and I could see that the max score method would not yield a good strategy in some fairly simple scenarios. To do this a different type of approach was needed. I basically wanted to solve the problem of two 1's being ignored in favor of a single higher number. To this end it seemed sensible to simply count the number of good dice. so 11234 would score 2/5 for 1's 1/5 for 2's, 3's 4's 4/4 for small straight and 4/5 for large straight. What was also needed was a way of ordering them, so 11366 it would go for 6's in preference of 1's, similarly 11111 would be logged under yahtzee in preference to 1's, 3 of a king 4 of a kind etc. The way I did this was to assign a weighting to each score (1s, 2s etc). So my basic premise was to fill the 1s, 2s 3s before the 3 of a kind, fill 4 of a kind in preference to 3 of a kind and so on. the weighing looked like this, where count is the number of good dice
1s, 2s, 3s .. 4x(count-2)
3 of a kind 3x(count-2)
4 of a kind 7x(count-3)
full house 4x(count-4)
small straight 3x(count-3)
large straight 7x(count-4)
yahtzee 45x(count-4)-30
chance -1
The above was done the way it was after ranking some of the responses. I did not go into details of all permutations, just enough to get some weighting factors.

There are some anomalies.. The first was that if you have already got a yahtzee and you are luckiliy enough to roll a second yahtzee, you should really make the most of it. log it under the highest scoring number you can (usually large straight) and so on. However, this method would use 4 of a kind next, then what ever the number was (say 1s), then 3 of a kind, full house, etc with small straight and large straight very low down. The next major anomaly was that it was difficult to tune the weighting for a specific logic. The weighting was arrived at using only a few examples. The last problem was similar to the first solution. the system did not differentiate between the first or last throw.

What happened from here, was that I spotted that the system would log the numbers in a order of preference. a ranking if you like. As such I used this as a unit test. From this I could look at which order it would log the dice under for a specific pattern.

On the game statistics side of things. Because the game logic was more complex the system slowed down to 136 games per second with logging off (and 74 with logging on). 1,000,000 games were played in just over two hours. I stopped running large millions of games.. getting lazy. The max score was 927 with 7 yahtzees. Average score was 225, top 1% was 475 and bottom 1% was 106. The system managed 0.49 yahtzees per game. I think this was very good scoring. The performance for the game logic vs game response was 75% of its time calculating responses 25% doing game logic. This is why the speed dropped so much.

Attempt3: Rank Order response .. From the unit testing of count good dice I realised that a way of scoring was to order the responses. I then realised, that the algorithm could simply be a ranked order sequence of what to go for for when certain patterns were seen. All I had to do was generate the patterns, set the order. I already had a good head start of what the orders might be from the count good dice.

The details of this method are even more complex. However, I did not spend too much time on this phase as it was quite plain that I could use genetic Algorithms to breed the best solution, and hence make my logic would become redundant.

As such more work is required in this phase to optimise the responses, and run the tests. The only algorithm I did do was a reverse engineer of the Max Count response. There ARE differences between the two, so the results will be different. I basically categorised all patterns into one of 20 patterns. The easiest of which are things like five of the same number etc. (if someone want details of the 20 categories, post a comment I will gladly oblige). However, the categories are broadly split into 5 based on 5 of a kind, 4 of a Kind, 3 of a kind, 2 of a kind and 1 of a kind (runs). 3 of a kind and 2 of a kind are the most complex sub categories.

On the game statistics side of things. There are two factors here, the fact that every thing is passed as a string (and computers do not like strings), and the complexity of the logic. The upshot is that the system slowed down to 100 games per second with logging off (no results for logging on). 100,000 games were played in 16 minutes. I stopped running large millions of games.. getting lazyer. The max score was 698 with 5 yahtzees. Average score was 215, top 1% was 471 and bottom 1% was 98. The system managed 0.49 yahtzees per game. I think this was pritty good scoring. The performance for the game logic vs game response was 82% of its time calculating responses 18% doing game logic. This is why the speed dropped (again).

.. TODO .. get a better ranking and get higher score., Run the game for 1,000,000 games and stop being lazy...

The age of Darwin beckons, or Genetic Algorithms.. The last attempt was really a proof of concept. It showed the possibility that the game can be broken down into a map. This map will control the game logic. All that was required was for a genetic algorithm to optimise the whole map. Just a couple of minor problems .. I don't know about genetic algorithms .. the learning curve for Java never seems to stop. (what is a shallow copy of an array!!) Hey ho .. I wonder if 12 year old children know about genetic algorithms.. They certainly are a product of them.

On the genetic algorithm front I actually sit next to a chap at work who is an expert in computer algorithms. So after a conversation with him I felt sufficiently well armed that I could at least give the gene thing a go. Also wikipedia has a few articles on it. I think there is some philosophical statement along the lines of .. just do it ..

My approach..
  1. Generate 4 random parents
  2. from these 4 parents breed 12 other children (two parents can breed two children)
  3. Run these 16 and keep the top 4 scores (average score over 1000 games)
  4. repeat steps 2&3 for 50 generations or so
Some details..
Generate 4 random parents .. not too hard. I have 20 categories, each category has 18 possible responses (13 are the 1s 2s 3s, full house etc, and 5 others which are 1st, 2nd, 3rd .. 5th most frequent numbers).

The breeding phase. I took what is called "one point cross over" which basically means I pick a random number from 1-18 (which corresponds to the 18 responses). I split the 20 categories at this point. The father will donate up to this point, the mothers sequence will fill in the rest. This is why you breed father mother and then mother father. I do not use the same random number for both breeding's (perhaps I should).

Top scores not too much to say here, simply pick the top 4.

Problems..
Well the above is pretty rubbish at breeding anything approaching an optimal solution. Even inserting the best solution from attempt3 (rank order response, which was based on counting good dice) does not improve things. Why is this?
  1. Not enough games being played, so actual game statistics are too unreliable. In other words you cant tell the difference between lucky games and good maps.
  2. Not enough generations.. well I suppose we took millions of years to evolve
  3. Not enough mixing of the genes
  4. expectations too high. I was rather hoping that this approach would yield a good set of results. In fact I was half expecting someone to say .. of course this sequence is correct because .. insert some mumbo jumbo here .. If you were to ask the same person before hand, you would get a very non committal reply. This is the difference between doers and thinkers. Sometimes doers think and do, where as thinkers just think, and usually of reasons of why inaction is the best course of action (or is it inaction).