[Apache Spark][spark] is generating quite a buzz right now. [Databricks][databricks], the company founded to support Spark, [raised $14M from Andreessen Horowitz][investment], [Cloudera][cloudera] has decided to fully support Spark, and others chime in that it's the next [big][gigaom] [thing][strata]. So I thought it was high time I took a look to understand what all the buzz is about. I played around with the Scala API (Spark is written in Scala), and to be honest, at first I was pretty underwhelmed, because Spark looked, well, so small. The basic abstraction is the Resilient Distributed Dataset (RDD), basically a distributed immutable collection, which can...