Apache NiFi

Apache NiFi is an open source project that automates the flow of data between systems, a task sometimes called “data logistics”. It is based on the concepts of flow-based programming and provides a web-based user interface for managing data flows in real time.

The project was created by the United States National Security Agency (NSA), where it was originally named NiagaraFiles. In 2014 the NSA released it as open-source software. Development of Apache NiFi continued at Onyara, Inc., which was subsequently acquired by Hortonworks.

The Hortonworks CEO has said that NiFi can be significant for managing data in the world of IoT: “The nature of NiFi also allows users to manage their bandwidth more efficiently, which is a significant issue given the potentially vast volumes of data generated by Internet of Things apps and the physical and economic restrictions on bandwidth that exist in many parts of the world (...) It's important to be able to prioritise the data that gets sent and maybe only send the summary level information and whether an anomaly is detected. Then from the central processing area you can go back and request more data from that particular unit. NiFi (...) allows only relevant data to be sent, the processing algorithm and a graphical user interface to help monitor and manage the bidirectional data flow.”

What Apache NiFi Does

Apache NiFi is an integrated data logistics platform for automating the movement of data between disparate systems. It is data source agnostic and supports sources with different formats, schemas, protocols, speeds and sizes. Common sources include geolocation devices, click streams, files, social feeds, log files and videos. NiFi provides a configurable plumbing platform for moving data and enables data to be traced in real time.

Apache NiFi is designed from the ground up to be enterprise ready: flexible, extensible and suitable for a range of devices, from network edge devices such as a Raspberry Pi to enterprise data clusters and the cloud. Apache NiFi can also adjust to fluctuating network connectivity that could affect the delivery of data.

Apache NiFi flow

Image source: https://hortonworks.com/apache/nifi/

Apache NiFi Features

NiFi supports directed graphs of data routing, transformation, and system mediation. Features include:

  • Web-based user interface - covering design, control, feedback, and monitoring.
  • Highly Configurable - enables a balance between loss tolerance and guaranteed delivery, and between low latency and high throughput. Enables dynamic prioritization of flows, modification of flows at runtime, and back pressure thresholds, which specify the amount of data that may exist in a queue, so that the system is not overrun with data.
  • Data Provenance - enables tracking data flows from beginning to end.
  • Extensible - enables users to build their own processors and more. Enables rapid development and effective testing.
  • Secure - supports SSL, SSH, HTTPS, encrypted content, and more. Provides multi-tenant authorization and internal policy management.
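To make the back pressure idea from the list above more concrete, here is a minimal Python sketch of a bounded queue that refuses new items once a threshold is reached. It is not NiFi code and does not use any NiFi API: the names `Connection`, `object_threshold`, `offer`, `poll` and `run_demo` are hypothetical and chosen only for illustration, and the threshold is counted in items rather than bytes to keep the example short.

```python
from collections import deque


class Connection:
    """Hypothetical stand-in for a queue between two processing steps.

    This is an illustrative model only, not part of Apache NiFi.
    """

    def __init__(self, object_threshold):
        # Maximum number of items allowed to wait in the queue (assumed unit: items).
        self.object_threshold = object_threshold
        self._queue = deque()

    def is_full(self):
        # Back pressure applies once the queue has reached the threshold.
        return len(self._queue) >= self.object_threshold

    def offer(self, item):
        """Enqueue the item unless back pressure applies; return True if accepted."""
        if self.is_full():
            return False
        self._queue.append(item)
        return True

    def poll(self):
        """Remove and return the oldest item, or None if the queue is empty."""
        return self._queue.popleft() if self._queue else None

    def __len__(self):
        return len(self._queue)


def run_demo():
    conn = Connection(object_threshold=3)
    pending = list(range(5))  # items the upstream step wants to send
    accepted = []

    # The upstream step sends until the connection pushes back.
    while pending and conn.offer(pending[0]):
        accepted.append(pending.pop(0))

    # The downstream step consumes one item, which frees a slot.
    drained = conn.poll()

    # The upstream step can now send one more item.
    if pending and conn.offer(pending[0]):
        accepted.append(pending.pop(0))

    return accepted, pending, drained, len(conn)


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the upstream step is never blocked or crashed; it simply holds on to items that were not accepted and retries after the downstream step has drained the queue. A real system would also need to decide how long to wait between retries and whether to bound the queue by data size as well as by item count.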
