
How do I define a fitness function?

I'm working on a project that will have a selected set of data, where each data item has different attributes. I need to use a fitness function to choose the data that best matches my selected scenario, based on those attributes.

However, I can't find any sites explaining how to define my own fitness function. All I've learned is that it's part of genetic algorithms, and that's as far as I got. Can anyone give me some pointers?


This is the hard part of GAs (well, that and data representation) and really you can only learn by experience.

Stating the obvious, the function has to be something that measures how good the results are. In particular, it has to be smooth across a wide range of data - whatever the data, your fitness function has to show the right way to improve.

So, for example, a fitness function that is zero unless the answer is right is no good, because it doesn't help you get close to the right answer when you are starting.
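A small sketch of that difference, assuming a toy problem where the GA is searching for a target string (TARGET, all_or_nothing and graded are made-up names for illustration, not part of any library):

```python
TARGET = "HELLO"  # hypothetical goal the GA is searching for


def all_or_nothing(candidate: str) -> int:
    """Zero unless the answer is exactly right: gives the GA no direction."""
    return 1 if candidate == TARGET else 0


def graded(candidate: str) -> int:
    """Counts matching positions, so every step closer scores higher."""
    return sum(a == b for a, b in zip(candidate, TARGET))


for guess in ["XXXXX", "HEXXX", "HELLX", "HELLO"]:
    print(guess, all_or_nothing(guess), graded(guess))
```

The first function scores the three wrong guesses identically, so selection has nothing to work with. The second ranks them in the order you would want the population to move.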

And a fitness function that increases as things get better, but doesn't identify the very best solution is not so good either, because your population will improve to a certain point and then get stuck.

So you need to sit down, write out some examples of your data, and then think about what kind of function you can use. You want something that gives low values for bad data, high values for good data, and changes gradually between the two.
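For data with attributes, one possible starting point is a weighted "closeness to the scenario" score. This is only a sketch under assumptions: the attribute names, ranges and weights below are invented, and it assumes every attribute is numeric.

```python
def fitness(item, scenario, ranges, weights):
    """Return a score in [0, 1]; 1.0 means every attribute matches the scenario."""
    total_weight = sum(weights.values())
    penalty = 0.0
    for name, wanted in scenario.items():
        # Normalise each gap by the attribute's range so that large units
        # (e.g. price) do not drown out small ones (e.g. weight).
        gap = abs(item[name] - wanted) / ranges[name]
        penalty += weights[name] * min(gap, 1.0)
    return 1.0 - penalty / total_weight


# Hypothetical scenario and data, purely for illustration.
scenario = {"price": 100, "weight": 2.0}
ranges = {"price": 200, "weight": 4.0}
weights = {"price": 2.0, "weight": 1.0}

items = [
    {"price": 300, "weight": 6.0},
    {"price": 150, "weight": 2.0},
    {"price": 100, "weight": 2.0},
]

best = max(items, key=lambda item: fitness(item, scenario, ranges, weights))
print(best)
```

Writing out a handful of records like this and checking whether the scores rank them the way you would by hand is a quick way to see if the function needs adjusting.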

Try any crazy idea you can think of at first, and then see how you might put that into a nice mathematical form. Just brainstorm and keep trying and iterating... you will probably find that your first choice isn't so good, and once you run the GA you'll be able to look at what is happening in more detail and improve it.


Are you sure what you need is actually a fitness function?

A fitness function is, as you said, something used in a genetic algorithm. It is used in each iteration of the algorithm to evaluate the quality of every proposed solution in the current population. In other words, it measures how good a single solution in the population is. For example, if you are using a genetic algorithm to find the x-value at which a function reaches its minimum y, the fitness of an individual might simply be the negative of its y-value (the smaller the y-value, the higher the fitness).

What I'm basically trying to say is that fitness functions don't deal with the attributes that much; they just evaluate the results.
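Here is a minimal sketch of that minimisation example, to show where the fitness function sits in the loop. The objective y = (x - 3)^2, the names y, fitness and run_ga, and all parameter values are assumptions for illustration. The loop is deliberately stripped down to selection plus mutation (no crossover), so treat it as a toy rather than a full GA.

```python
import random


def y(x):
    # Hypothetical objective with its minimum at x = 3.
    return (x - 3.0) ** 2


def fitness(x):
    # Smaller y means higher fitness.
    return -y(x)


def run_ga(generations=50, pop_size=30, sigma=0.5, seed=0):
    rng = random.Random(seed)
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate everyone and keep the fitter half as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Each parent produces one child by Gaussian mutation.
        children = [p + rng.gauss(0.0, sigma) for p in parents]
        population = parents + children
    return max(population, key=fitness)


best = run_ga()
print(best, fitness(best))
```

Note that the fitness function is the only place where the problem itself appears; the rest of the loop just sorts by whatever number it returns.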

If you want to choose the most representative sample of data that contains attributes, maybe you should also look into classification or clustering methods. You didn't give much information about how the selected scenario will be represented, but you could cluster your data (for example with the k-means clustering algorithm, increasing the number of clusters until the clustering error stops falling significantly) and then choose a representative cluster once you have the scenario requirement.

If you had given more details about how the queries are represented relative to the data, you might have gotten a different (or better) answer from someone.

Then again, if your only goal is to learn about genetic algorithms or any other part of the AI / machine learning field, you should do exactly what phs suggested: look for a book or an audio lecture, enroll in a class, or something similar.
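A rough sketch of that idea in plain Python follows, assuming the records can be turned into numeric tuples. The data points and the function names are invented for illustration; in practice you would more likely use a library implementation such as scikit-learn's KMeans.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, k, iterations=20, seed=0):
    """Return (centroids, total squared error) after a fixed number of iterations."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    error = sum(min(squared_distance(p, c) for c in centroids) for p in points)
    return centroids, error


# Hypothetical two-attribute records forming three loose groups.
data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
        (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
        (9.0, 1.0), (9.1, 1.2), (8.9, 0.9)]

# Print the error for increasing k and look for where it stops dropping much.
for k in range(1, 6):
    _, error = kmeans(data, k)
    print(k, round(error, 3))
```

The number of clusters at which the printed error levels off is a reasonable first guess; the result can depend on the random starting centroids, so it is worth trying a few seeds.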
