Learning Optimal Parameters to Maximize a Reward

I have a set of examples, which are each annotated with feature data. The examples and features describe the settings of an experiment in an arbitrary domain (e.g. number-of-switches, number-of-days-performed, number-of-participants, etc.). Certain features are fixed (i.e. static), while others I can manually set (i.e. variable) in future experiments. Each example also has a "reward" feature, which is a continuous number bounded between 0 and 1, indicating the success of the experiment as determined by an expert.

Based on this example set, and given a set of static features for a future experiment, how would I determine the optimal value to use for a specific variable so as to maximise the reward?

Also, does this process have a formal name? I've done some research, and this sounds similar to regression analysis, but I'm still not sure if it's the same thing.


The process is called "design of experiments." There are various techniques that can be used depending on the number of parameters, and whether you are able to do computations between trials or if you have to pick all your treatments in advance.

  • full factorial - try each combination, the brute force method
  • fractional factorial - eliminate some of the combinations in a pattern and use regression to fill in the missing data
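  • Plackett-Burman, response surface - more sophisticated designs that need fewer experimental runs in exchange for more statistical work (Plackett-Burman designs screen for the factors that matter most; response surface designs fit a curved model so that an optimum can be located)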
  • ...and many others. This is an active area of statistical research.
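
To make the first two items concrete, here is a minimal sketch of building a full factorial design and then cutting it down to a half fraction. The factor names are borrowed from the question purely for illustration, the two levels are coded as -1/+1 by assumption, and `full_factorial` and `half_fraction` are hypothetical helper names, not functions from any library.

```python
from itertools import product

def full_factorial(levels):
    """Return every combination of the given factor levels (one dict per run)."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[n] for n in names))]

def half_fraction(design, factors):
    """Keep the runs of a two-level (-1/+1) design whose coded levels
    multiply to +1, i.e. the half fraction with defining relation I = ABC."""
    kept = []
    for run in design:
        sign = 1
        for f in factors:
            sign *= run[f]
        if sign == 1:
            kept.append(run)
    return kept

# Hypothetical factors, each coded as low (-1) or high (+1).
levels = {"switches": [-1, 1], "days": [-1, 1], "participants": [-1, 1]}

full = full_factorial(levels)
half = half_fraction(full, ["switches", "days", "participants"])

for run in half:
    print(run)
```

The half fraction needs half as many experiments, at the price that some effects can no longer be told apart; the regression step is what fills in the runs that were skipped.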

Once you've built a regression model from the data in your experiments, you can find an optimum by applying the usual numerical optimization techniques.
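
As a rough sketch of that last step, assuming one static feature `s` and one variable feature `v`, both scaled to [0, 1]: fit a quadratic regression model by least squares, then hold `s` fixed and search over `v` for the highest predicted reward. The data below is synthetic and made up for illustration, the quadratic form of the model is an assumption, and a dense grid search stands in for whatever numerical optimizer you prefer.

```python
import numpy as np

def design_matrix(s, v):
    """Quadratic model terms: 1, s, v, s*v, s^2, v^2."""
    s, v = np.broadcast_arrays(np.asarray(s, float), np.asarray(v, float))
    return np.column_stack([np.ones_like(s), s, v, s * v, s**2, v**2])

def fit(s, v, reward):
    """Ordinary least-squares fit of the quadratic model."""
    coef, *_ = np.linalg.lstsq(design_matrix(s, v), reward, rcond=None)
    return coef

def best_variable(coef, s_fixed, lo=0.0, hi=1.0, steps=1001):
    """Grid-search the variable feature for a fixed static feature."""
    grid = np.linspace(lo, hi, steps)
    pred = design_matrix(s_fixed, grid) @ coef
    i = int(np.argmax(pred))
    return float(grid[i]), float(pred[i])

# Synthetic past experiments (assumption): the reward peaks where
# v = 0.3 + 0.4 * s, and no noise is added.
rng = np.random.default_rng(0)
s = rng.uniform(0.0, 1.0, 50)
v = rng.uniform(0.0, 1.0, 50)
reward = 0.9 - (v - (0.3 + 0.4 * s)) ** 2

coef = fit(s, v, reward)
best_v, predicted = best_variable(coef, s_fixed=0.5)
print(best_v, predicted)
```

With real data the fit will be noisy, so the predicted optimum is only as trustworthy as the model; it is worth checking it against a held-out experiment before relying on it.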
