We are excited to announce the release of the Cask Data Application Platform (CDAP) v3.1.0. This release adds support for MapR, giving users a wider choice of Hadoop distributions when using CDAP. It also expands our footprint to support CDH 5.4, HDP 2.2, and Apache Hadoop with HBase 1.0 and Hive 1.1.
In a previous release of CDAP, we introduced Spark integration as an experimental feature, with Spark programs running in standalone mode only. We are now proud to support Spark 1.2 and 1.3 for distributed CDAP. This means that CDAP users have a wider choice of processing paradigms, with the ability to run MapReduce, real-time, and Spark programs for production use cases.
In addition, we made a number of improvements to CDAP in v3.1, including:
- Enabling Workflow token persistence
- Custom and system metadata for fileset partitions
- Incremental processing in workflows for partitioned filesets
- Ability to consume existing files in HDFS as CDAP datasets
- A quick and easy way to create real-time and batch ETL pipelines via the UI
A complete list of the new features, improvements, and bug fixes in this release can be found in the Release Notes.
With the new UI for ETL pipelines in CDAP v3.1, you can set up and configure real-time or batch ETL pipelines without leaving the browser.