
Dataset for Apriori algorithm

I am going to develop an app for Market Basket Analysis (using the Apriori algorithm), and I found a dataset with more than 90,000 transaction records.

The problem is that this dataset doesn't contain the names of the items; it only contains their barcodes.

I have just started the project and am researching the Apriori algorithm. Can anyone help me with this case? What is the best way to implement this algorithm using the following dataset?


These kinds of datasets are considered critical information, and chain stores will not give this information out, but you can generate a sample dataset yourself using SQL Server.


The algorithm is defined independently of the identifiers used for the objects. Also, you didn't post the 'following data set' :P If your problem is that the algorithm expects your items to be numbered 0, 1, 2, ..., then just scan your data set and map each individual barcode to a number.
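To illustrate the barcode-to-number mapping, here is a minimal Python sketch. It assumes each transaction is already available as a list of barcode strings; the function names and the sample barcodes are made up for the example and are not taken from the asker's dataset.

```python
def build_barcode_index(transactions):
    """Assign each distinct barcode a dense integer ID (0, 1, 2, ...) in first-seen order."""
    barcode_to_id = {}
    for transaction in transactions:
        for barcode in transaction:
            if barcode not in barcode_to_id:
                barcode_to_id[barcode] = len(barcode_to_id)
    return barcode_to_id


def encode_transactions(transactions, barcode_to_id):
    """Replace barcodes by their integer IDs, dropping duplicates within a transaction."""
    return [sorted({barcode_to_id[b] for b in t}) for t in transactions]


# Hypothetical sample data: three transactions over three made-up barcodes.
sample_transactions = [
    ["5000112637922", "4006381333931"],
    ["4006381333931", "8410076472854", "5000112637922"],
    ["8410076472854"],
]

barcode_to_id = build_barcode_index(sample_transactions)
encoded = encode_transactions(sample_transactions, barcode_to_id)

# Reverse map, so results can be reported as barcodes again.
id_to_barcode = {i: b for b, i in barcode_to_id.items()}
```

Keeping the reverse map around means the frequent item sets can be translated back to barcodes (or to names, if a product table turns up later).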

If you're interested, there have been some papers on how to represent frequent item sets very efficiently: http://www.google.de/url?sa=t&source=web&cd=1&ved=0CB8QFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.163.4827%26rep%3Drep1%26type%3Dpdf&ei=QdVuTsn7Cc6WmQWD7sWVCg&usg=AFQjCNGDG8etNN2B4GQ52pSNIfQaTH7ajQ&sig2=7r3buh8AcfJmn2CwjjobAg


The algorithm does not need the name of the items.
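As a rough illustration of why names are unnecessary, here is a small, unoptimized Apriori sketch in Python that works directly on barcode strings. The function name, the sample baskets, and the choice of an absolute support count are assumptions made for the example; for 90,000 transactions you would probably want a tuned library implementation rather than this naive candidate counting.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(items): support_count} for every frequent item set.

    min_support is an absolute count (number of transactions), not a ratio.
    Items can be any hashable identifier, e.g. barcode strings.
    """
    baskets = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for basket in baskets:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-sets whose union has exactly k items.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over the data for this level.
        counts = {c: 0 for c in candidates}
        for basket in baskets:
            for candidate in candidates:
                if candidate <= basket:
                    counts[candidate] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical baskets; "111", "222", "333" stand in for real barcodes.
baskets = [
    ["111", "222", "333"],
    ["111", "222"],
    ["111", "333"],
    ["222", "333"],
    ["111", "222", "333"],
]
frequent = apriori(baskets, min_support=3)
```

With these sample baskets, every single barcode and every pair reaches the threshold of 3, while the full triple appears in only two baskets and is dropped.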
