Hadoop - Developers

Chaining multiple MapReduce jobs in Hadoop
In many real-life situations where you apply MapReduce, the final algorithms end up being several MapReduce steps.
Q&A views (4)
OpenStreetMap and Hadoop
I need some ideas for a weekend project about Hadoop and OpenStreetMap. I have access to an AWS EC2 instance with an OpenStreetMap snapshot on my EBS volume.
Q&A views (4)
How to pick random (small) data samples using Map/Reduce?
I want to write a map/reduce job to select a number of random samples from a large dataset based on a row-level condition. I want to minimize the number of intermediate keys.
Q&A views (9)
Does throwing an exception in an EvalFunc pig UDF skip just that line, or stop completely?
I have a User Defined Function (UDF) written in Java to parse lines in a log file and return information back to Pig, so it can do all the processing.
Q&A views (7)
Hadoop application development, and PHP
For Hadoop application development, are PHP frameworks less popular? If so, why? Else, please do point to literature/documentation/tutorials for a specific framework? (stuff for Symfony w
Q&A views (4)
What is a data serialization system?
According to the Apache Avro project, "Avro is a serialization system". By saying data serialization system, does it mean that Avro is a product or an API?
Q&A views (4)
Better to build or buy a compute grid platform?
I am looking to do some quite processor-intensive brute force processing for string matching. I have run my prototype in a multi-threaded environment and compared the performance to an implementation u
Q&A views (8)
Hadoop beginner's question
I've read some documentation about Hadoop and seen the impressive results. I get the bigger picture but am finding it hard to tell whether it would fit our setup. The question isn't programming related but I'm ea
Q&A views (5)
Handling multiple connections to the host simultaneously
How can I handle a number of connections to the host at the same time? From nutch-default.xml:
Q&A views (4)
Map Reduce Frameworks/Infrastructure
Map Reduce is a pattern that seems to have gained a lot of traction lately, and I am starting to see it manifest in one of my projects, which is focused on an event processing pipeline (iPhone Accelerometer and GPS da
Q&A views (2)
