Apache Spark Adoption Fuels Big Data Analytics Growth


1 - Apache Spark Adoption Fuels Big Data Analytics Growth

Rising demand for faster big data analytics apps is driving the adoption of Apache Spark, a core technology for modernizing data warehouses.

2 - Most Important Attributes of Apache Spark

A full 91% of respondents cited performance as the top attribute, followed by ease of programming, at 77%; ease of deployment, at 71%; and advanced analytics, at 64%.

3 - Other Important Apache Spark Features

Real-time streaming (52%) and DataFrames (47%) are additional important features. A full 64% of respondents are running the latest version of Apache Spark.

4 - Use Cases for Apache Spark

Business intelligence was ranked highest, at 68%, followed by data warehousing (52%), recommendation systems (44%) and log processing (40%).

5 - Most Common Apache Spark Deployment Models

Nearly half (48%) deploy Apache Spark in stand-alone mode, followed by YARN running on Hadoop, at 40%. Just over half of respondents (51%) are running Apache Spark on a public cloud.

6 - Most Common Apache Spark Platforms

A full 75% are running Apache Spark on a Linux/Unix platform, while 47% are running on OS X. The fastest growing platform is Windows (23%), which grew 17 percentage points from 2014.

7 - Most Used Apache Spark Components

Nearly seven in 10 (69%) are using Spark SQL, followed by DataFrames (62%), MLlib + GraphX (58%) and streaming (58%). Three-quarters (75%) are using two or more Apache Spark components.

8 - Programming Languages Used With Apache Spark

At 71%, Scala is the most widely used programming language, followed by Python (58%), SQL (36%) and Java (31%). Python use is up 49% year-over-year.

9 - Apache Spark Components Used in Production

Nearly a quarter (24%) cite SQL, followed by DataFrames and advanced analytics (at 15% each), and streaming (14%). SQL use grew 380% from 2014.

Michael Vizard
Michael Vizard is a seasoned IT journalist, with nearly 30 years of experience writing and editing about enterprise IT issues. He is a contributor to publications including Programmableweb, IT Business Edge, CIOinsight, Channel Insider and UBM Tech. He formerly was editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He also has held editorial positions at PC Week, Computerworld and Digital Review.

