Specify Hadoop mapreduce input keys directly (not from a file)
I'd like to generate some data using a mapreduce. I'd like to invoke the job with one parameter N, and get Map called with each integer from 1 to N, once.
Obviously I want a Mapper<IntWritable, NullWritable, <my output types>>
...that's easy. But I can't figure out how to generate the input data! Is there an InputFormat I'm not seeing that lets me pull keys and values directly from a collection?
Do you want each mapper to process all integers from 1 to N? Or do you want to distribute the processing of integers 1 to N across the concurrently running mappers?
If the former, I believe you'll need to create a custom InputFormat. If the latter, the easiest way is probably to generate a text file with the integers 1 to N, one integer per line, and use Hadoop's NLineInputFormat so the lines are split across mappers.
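For the second approach, the seed file can be generated with plain Java before submitting the job. This is a minimal sketch (the class and file names are placeholders, not from the original post); the file it produces would then be passed to the job as its input path:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class SeedInput {
    // Write the integers 1..n to the given path, one integer per line,
    // so each line becomes one record for the mappers to consume.
    static void writeSeedFile(Path path, int n) throws IOException {
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            lines.add(Integer.toString(i));
        }
        Files.write(path, lines);
    }

    public static void main(String[] args) throws IOException {
        Path seed = Paths.get("seed.txt");
        writeSeedFile(seed, 10);
        System.out.println(Files.readAllLines(seed).size() + " lines written");
    }
}
```

With NLineInputFormat you can also set `mapreduce.input.lineinputformat.linespermap` to control how many integers each mapper receives.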