CDAP Blog

AeroCask – Real-time Flight Data Analytics using CDAP

One of the many things that I love about Cask are the hackathons before every release. It is not only a way for us to dog-food new features in the CDAP platform but it is also an opportunity to let your imagination run loose and implement an integration with another system; or develop an interesting … Read more


Scalable Distributed Transactional Queues on HBase

A real time stream processing framework usually involves two fundamental constructs: processors and queues. A processor reads events from a queue, executes user code to process them, and optionally writing events to another queue for additional downstream processors to consume. Queues are provided and managed by the framework. Queues transfer data and act as a … Read more


The Shift to Realtime Processing: An Easier Way to Build Hadoop Apps

I was inspired by the recent Google I/O talk on Cloud Dataflow, a data processing service used internally at Google, which evolved from a model based on MapReduce and successor stream processing technologies such as MillWheel and FlumeJava. Based on the premise of focusing on your application logic rather than the underlying infrastructure, I set … Read more