Through this rich set of APIs it is possible to automate the typical workflow of data ingestion, data-set reduction, and publishing of results to the serving layer. Once the data has been published to a serving layer such as Impala, it is easily accessible by any third-party tool: Excel, traditional BI tools, and so on.
Users can manipulate the data directly from their browser, and a system integrator can easily wire Harpoon's APIs into a process engine to fully automate the data collection, analysis, and publishing process.
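The automation described above can be sketched as an ordered sequence of REST calls that a process engine would replay. This is a minimal illustration only: the endpoint paths (`/ingest`, `/jobs/reduce`, `/publish`) and payload fields are assumptions for the sake of the example, not Harpoon's documented API.

```python
# Sketch of automating the ingest -> reduce -> publish workflow via REST.
# Endpoint names and payload fields below are hypothetical placeholders,
# not Harpoon's actual API surface.

def build_workflow(base_url, database, table):
    """Return the ordered (method, url, payload) calls a process engine
    would issue: ingest raw data, run a reduction job, publish to Impala."""
    return [
        ("POST", f"{base_url}/ingest",
         {"database": database, "table": table}),
        ("POST", f"{base_url}/jobs/reduce",
         {"database": database, "table": table,
          "output": f"{table}_summary"}),
        ("POST", f"{base_url}/publish",
         {"database": database, "table": f"{table}_summary",
          "target": "impala"}),
    ]

steps = build_workflow("https://harpoon.example.com/api/v1", "sales", "orders")
for method, url, payload in steps:
    print(method, url)
```

Expressing the pipeline as data rather than executing it inline keeps the orchestration logic in the process engine, where retries and scheduling belong.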
Harpoon provides a strict, controlled way to lay out data on top of HDFS. All data is also listed and registered in the Harpoon metadata repository, so users are required to organise their data following a well-specified pattern: everything is arranged in terms of databases and tables. Nevertheless, the data is stored using the standard Hadoop APIs, allowing native Hadoop applications to access the data inside Harpoon's workspace.
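The database/table convention above implies a deterministic mapping from a registered table to its HDFS location. The sketch below illustrates one such mapping; the `/harpoon/warehouse` root and the `<database>.db/<table>` layout are assumptions for illustration (modelled on common Hadoop warehouse conventions), not Harpoon's actual directory scheme.

```python
# Hypothetical mapping from a registered database/table pair to its
# HDFS directory. The warehouse root and ".db" suffix are assumed
# conventions, not Harpoon's documented layout.

def table_path(database, table, root="/harpoon/warehouse"):
    """Return the HDFS directory where a table's files would live."""
    return f"{root}/{database}.db/{table}"

print(table_path("sales", "orders"))
# -> /harpoon/warehouse/sales.db/orders
```

Because the layout is deterministic, a native Hadoop job can locate a table's files through the standard HDFS APIs without consulting the metadata repository first.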
Besides being a Hadoop-based BI tool, Harpoon, through its powerful REST APIs, dramatically reduces the time needed to integrate Hadoop into the enterprise, following the modern approach of the Lambda Architecture.