I started into the last weekend being happy, loaded with energy and wrapped all around in Self Piece. The obligatory house cleaning was already done, so many plans regarding fun things waited to be tackled . Too many, as always. That might be already part of the problem.
Now I’m still working hard to recover from a wave of self-doubt that rolled over me. I don’t mean to blame anyone, besides myself, but it all started with my desperate struggle to participate in Olimex’s Weekend Programming Challenge #6 and continued with an unadorned view of our society, only a visit of the central post office can give. More on the latter in a next post.
As I write this, I realize that it might have started even earlier: this Friday’s noon visit of the coffee crew. The coffee crew are some coworkers that visit my office regularly – because that’s the place where the coffee machine ended up – for having a little geek talk about movies, programming languages or the shittiest things about Java and a mug of coffee, of course.
I planned to create a n-gram based fulltext search in my current project and was looking for advice from them, regarding a proper strategy of refreshing the index. They are not really into the project, as it is basically a one-man show, but they are hired as software developers (in contrast to myself) and I always appreciate some feedback from them.
They ripped the whole idea of n-gram based search apart! It took me some time to convince them, that it indeed might be a worthwhile solution. But at the same time I realized, that I always wanted to build something like this, since I heard about it (in my graduate course: Information Retrieval Systems). I’m actually not sure, if it is really such a good idea and if it would produce results that were good enough. I was puzzling about it, but Fridays are short and I was looking forward for the new programming challenge from Olimex.
Well I didn’t have that much time over the weekend: two tired hours on Friday evening, one hour on Sunday noon and two hours at the same days evening. As I’m writing this, I still have the feeling of uttering lame excuses. Olimex gave us a hard nut to crack. Another lame excuse? Well, the problem itself (a TSP with some additional things to tackle) wasn’t that hard. One could take a straight and simple route.
It was the openness of the solution space that allured me to come up with a real intelligent solution. And in the consequence with no solution at all. OK, what was the assignment?
The overall goal was to come up with a program to optimize some drilling holes machine movement. We needed to find the shortest path to drill the given amount of holes at predefined spots. There were different steps outlined in the assignment.
- We had to read a file. -> Easy
- Filter double holes out. -> Easy
- Stop if their are intersecting holes or holes out of bound. -> Challenging
- Find the shortest path to drill all the holes. -> Challenging
- Save the target order in a file on it’s own. -> Easy
In another course (Soft Computing) I learned about genetic algorithms (GA) and I programmed a few, so far. I also knew that you can utilize them to solve TSPs. So I decided to go that way. You know when you find your hammer. And GAs were my hammer, until then. I started with the file reading and then jumped straight to the optimizing part. I had the basic structure quickly working. That was Friday evening.
A GA mimics the natural process of genetics. You create a population of solutions: the order of holes to drill. Then you rate each of them regarding their fitness to solve the problem. In our case how short their path will be. You take the top performers and makes offsprings of them by combining two randomly chosen into a new one. You proceed until you reach the original population size. Then you apply a little bit of random mutation: push some holes around. The offsprings are merged with their parents, and the top performers are chosen for the next run.
But the next encounter (on Sunday) left me with no hope to complete the assignment on time. I learned that you won’t get any viable solution if you simply start out with randomly chosen solutions and cross and mutate them randomly. I made some modifications played with the size of the population, mutate rates and other parameters, but I couldn’t get even close to an optimal path.
I decided to modify the way the initial population is created. Instead of simply doing it randomly, I decided to mix random ones with greedily created ones. Basically, always taking the closest hole as the following one. This resulted in very short paths for the initial population. But it was time consuming. In my naive implementation the program had to check the distance to every other hole everytime, to find the closest one. One optimizing step could (should) have been to find all distances once and reuse them. I basically need this for step number three from above, anyways.
Another and even bigger problem was, that all population members were basically the same. My first attempt to get rid of this problem, was to select the first hole randomly. But that results only in a shift of the whole path. There had to be some more randomness for a GA. I decided for each following hole, to pick from a random number of holes, the one with the shortest distance. That resulted in slightly worse initial populations, but it was more diverse. I also fall back to use the fixed start position for all populations.
Unfortunately, it was already Monday, and there were still many parts missing. And even the optimizer wasn’t working. The initial population was OK, but the algorithm didn’t come up with better solutions. Crossing and mutating yielded only worse results. Well I figured, that the crossing scheme, were I took the holes alternating the parents, should incorporate the distance somehow.
But it was too late and my attempts to complete it, somehow, didn’t show any signs of progress. I stopped. Defeated.
In case you are interested in the source code of the solution here it is. Unfortunately, I only have the last version.
When you want to create test data for the challenge here is the script. I made test data from the TSP files referred in the comments section of the challenge. You might want to remove the scaling part, though.