Questions tagged [mapreduce]
MapReduce is a programming model for processing huge datasets on certain kinds of distributable problems using a large number of nodes.
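To make the tag description concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases, using word counting as the example problem. It is an illustration only: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework would run the map and reduce calls on many nodes rather than in one loop.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    """Run the three phases in sequence on a list of strings (assumed input shape)."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each `map_phase` call only looks at one document and each `reduce_phase` call only looks at one key, both phases could in principle be distributed without changing the result.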
28 questions
4
votes
3
answers
327
views
Given two sparse vectors, compute their dot product
Problem Statement:
Given two sparse vectors, compute their dot product.
Implement class SparseVector:
SparseVector(nums) Initializes the object with the vector nums
dotProduct(vec) Compute the dot ...
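One possible shape for the class described in this excerpt is sketched below. The excerpt is truncated, so this assumes `dotProduct` returns the dot product of the instance with another `SparseVector`; the `nonzero` attribute is a hypothetical name, and storing only non-zero entries in a dict is one common approach, not necessarily the one the question uses.

```python
class SparseVector:
    def __init__(self, nums):
        # Keep only the non-zero entries, keyed by their index.
        self.nonzero = {i: x for i, x in enumerate(nums) if x != 0}

    def dotProduct(self, vec):
        # Iterate over the vector with fewer non-zero entries and look
        # each index up in the other one.
        small, large = self.nonzero, vec.nonzero
        if len(small) > len(large):
            small, large = large, small
        return sum(value * large[i] for i, value in small.items() if i in large)
```

With this layout the work is proportional to the number of non-zero entries in the sparser vector rather than to the full vector length.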
4
votes
0
answers
478
views
Subclass of Python's multiprocessing.Pool which allows progress reporting
For context, the whole of the project code can be found here. This question was created specifically for the progress.py file.
The goal behind it is to allow ...
4
votes
2
answers
78
views
Put every object in a specified bucket
I have this array of objects:
...
0
votes
2
answers
5k
views
Javascript + Filter object of values
I have an object with values. I am trying to filter based on the values.
...
2
votes
1
answer
230
views
The best way for inserting multiple objects into array
I have a transformer helper function. It reduces over the array and transforms key/value pairs. At the end of the loop the key 'EXAMPLE1' exists, and I should insert two objects after the first ...
2
votes
1
answer
94
views
Efficiently aggregating nested data
Problem
Given the following data:
...
1
vote
2
answers
213
views
Extract unique words from given text and group by letter count
The task is a training exercise in Go. The idea is to extract unique words, sorted and grouped by length. It might be useful for learning new words. The program takes a command-line argument, assuming it's a file ...
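The grouping step described here can be sketched briefly. The question itself is about Go; this illustration is in Python and reads from a string instead of a file. The function name `group_unique_words` and the definition of a word (a lowercased run of ASCII letters) are assumptions, since the excerpt does not say how words are delimited.

```python
import re
from collections import defaultdict


def group_unique_words(text):
    """Return {length: sorted unique words of that length}, ordered by length."""
    # Assumption: a word is a run of ASCII letters, compared case-insensitively.
    words = set(re.findall(r"[a-z]+", text.lower()))
    groups = defaultdict(list)
    for word in sorted(words):
        groups[len(word)].append(word)
    return dict(sorted(groups.items()))
```

Sorting the set once before grouping keeps every per-length list in alphabetical order without a second sort.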
2
votes
1
answer
95
views
Looping over a bidimensional array and extract data to a new one
I have a bidimensional array like this:
...
2
votes
1
answer
104
views
Building objects in javascript, without "if(!a[k]) a[k] = []"
When building objects using reduce, I often have crappy code like this:
...
2
votes
0
answers
345
views
Groovy Map/Reduce for Jenkins DSL
Jenkins DSL doesn't support collect and inject from what I can tell (I get missing method exceptions when I try), so I ...
3
votes
2
answers
835
views
Summarizing the score of a personality quiz
This function takes a list of questions and list of answers provided by the user.
The list of answers is always a list of booleans (for true and false) and the list of questions takes the following ...
6
votes
0
answers
100
views
Pyspark Solver for Tiered Board Games
I've written a PySpark program that completely solves a tiered board game (no loops, each game position is a member of only one tier) and writes each tier to a file. It also determines the ...
8
votes
3
answers
1k
views
Find Top 10 IP out of more than 5GB data
I have a few files whose total size is more than 5 GB. Each line of the files starts with an IP address and looks like:
127.0.0.1 reset success
...
127.0.0.2 reset success
How can I find the top 10 ...
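A streaming count is one way to approach this on a single machine. The sketch below assumes "top 10" means the ten most frequent addresses and that the IP is the first whitespace-separated field on each line, as in the sample lines; `top_ips` is a hypothetical name. Memory use grows with the number of distinct IPs, not with the total file size, because lines are consumed one at a time.

```python
from collections import Counter


def top_ips(lines, n=10):
    """Count the first field of each line and return the n most common."""
    counts = Counter()
    for line in lines:
        fields = line.split(maxsplit=1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)
```

Any iterable of strings works as `lines`, so an open file object, or several chained with `itertools.chain`, can be passed in directly without loading the files into memory.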
3
votes
1
answer
133
views
Classifying and counting database entries using Scala map and flatMap
I am new to Spark and Scala and I have solved the following problem. I have a table in a database with the following structure:
...
5
votes
2
answers
670
views
Accepting user defined functions for custom map reduce functionality in C++
I am implementing map- and reduce-style functions for processing geospatial raster datasets.
I would like the ...