
Riak Map Reduce In Js Returning Limited Data

So I have Riak running on 2 EC2 servers, using Python to run JavaScript MapReduce. They have been clustered. Mainly used for 'proof of concept'. There are 50 keys in the bucket, a

Solution 1:

To avoid loading all data from the preceding phase into memory on the coordinating node before the reduce phase starts (which would be problematic for large MapReduce jobs), Riak runs the reduce function multiple times. Every iteration gets a batch of results from the preceding phase together with any output from earlier reduce iterations. The default batch size is 20, but this is configurable. With 50 keys in the bucket, the reduce function is therefore called more than once, and a function that assumes it sees every input in a single call will return only partial data. Because the results of one reduce iteration are fed in as input to the next, reduce phase functions need to be designed to handle this, and some strategies are described here.
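To illustrate the effect, here is a minimal sketch that simulates batched reduce calls in plain JavaScript, outside Riak. The `simulateReduce` helper and both reducer names are hypothetical, and the simulation assumes each call receives the previous output followed by the next batch; the exact order and grouping inside Riak may differ.

```javascript
// Stand-in for the map phase output: one `1` per key (50 keys, as in the question).
const mapOutput = new Array(50).fill(1);

// Naive reducer: assumes it sees every map result in a single call.
function naiveCount(values) {
  return [values.length];
}

// Re-reduce-safe reducer: sums its inputs, so earlier partial sums
// fold in correctly when they come back as input.
function safeCount(values) {
  return [values.reduce(function (acc, v) { return acc + v; }, 0)];
}

// Hypothetical simulation of batching (not Riak code): each call gets the
// previous reduce output plus the next batch of map results.
function simulateReduce(reduceFn, inputs, batchSize) {
  let acc = [];
  for (let i = 0; i < inputs.length; i += batchSize) {
    acc = reduceFn(acc.concat(inputs.slice(i, i + batchSize)));
  }
  return acc;
}

const naiveResult = simulateReduce(naiveCount, mapOutput, 20);
const safeResult = simulateReduce(safeCount, mapOutput, 20);
console.log(naiveResult, safeResult);
```

In this simulation the naive reducer ends up with `[11]` (the last call sees one earlier result plus 10 new values), while the summing reducer returns `[50]`. The general idea is to make the reduce output have the same shape as its input, so that feeding results back in is harmless.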

It is also possible to force Riak to only run the reduce phase once for the entire input set by specifying the 'reduce_phase_only_1' parameter, but this is generally not recommended, especially for large jobs.
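As a rough sketch of where such settings might go, the job description below passes a phase argument to the reduce phase. The bucket name `my_bucket` and the function bodies are placeholders, and both the `arg` object shape and the `reduce_phase_batch_size` key are assumptions to check against the documentation for your Riak version; only `reduce_phase_only_1` is named in the answer above.

```javascript
// Sketch of a MapReduce job description; "my_bucket" and the
// function sources are placeholders, not taken from the question.
const job = {
  inputs: "my_bucket",
  query: [
    { map: { language: "javascript", source: "function(v) { return [1]; }" } },
    {
      reduce: {
        language: "javascript",
        source:
          "function(values) { return [values.reduce(function(a, b) { return a + b; }, 0)]; }",
        // Assumed shape of the phase argument: raise the batch size from the default of 20.
        arg: { reduce_phase_batch_size: 100 }
        // Assumed alternative (generally not recommended): arg: { reduce_phase_only_1: true }
      }
    }
  ]
};

// Serialized form that a client would submit as the job definition.
const body = JSON.stringify(job);
console.log(body);
```

Even with a larger batch size, writing the reduce function so that it tolerates its own output as input remains the safer choice.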
