Alooma is a data pipeline as a service that gives data teams a modern, cloud-based streaming ETL solution. It brings data from separate data silos together into a data warehouse in real time. Alooma can handle complex ETL workflows at scale without the extensive setup, integration, and testing work required by traditional on-premises tools.
Alooma provides several components that make it easy to build a data pipeline:
- Code Engine, which allows data teams to enrich data streams programmatically,
- Mapper, which performs data type conversion and management of schemas and schema changes,
- Live, which provides visualization and transparency for a data pipeline,
- Restream Queue, which facilitates exactly-once processing.
Alooma has built-in integrations with a variety of databases, SaaS applications and data sources.
Alooma was founded in 2013 and has raised $15 million in funding.
Alooma Code Engine
The Code Engine enables data teams to customize data exactly how they need it by writing real code to enable data uses such as real-time alerts, sessionization, and anomaly detection.
Capabilities of the Code Engine include:
- Geo-location enrichment - Events with IP information can be enriched with location using an integrated geo-location package.
- Sessionizing - Create custom sessionizing logic around timeouts, click paths, or other conditions.
- Mismatch cleansing - Make sure all your data arrives uniform and ready to query in your data warehouse by cleansing mismatches on the stream.
- Custom notifications - Emit a custom notification when a specific event arrives in the pipeline, or when you’ve identified a noteworthy trend.
- GitHub integrated - Write versioned and managed code in your own environment, updated automatically in Alooma as you commit.
- Testable and modular - Code Engine code can be written in a modular way, with unit and integration testing capabilities built right in.
- Custom packages - Use your own Python packages, or Alooma’s built-in packages, to run any business logic on the data stream.
- Stateful processing - Use Alooma’s stateful processing engine to unlock scenarios like sessionizing, alerting, anomaly detection, and more.
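As a rough illustration of the cleansing, sessionizing, and stateful-processing capabilities listed above, here is a minimal Python sketch. It is not Alooma's actual API: the `transform` function name, the `user_id` and `timestamp` field names, the 30-minute timeout, and the in-memory `_sessions` dictionary are all assumptions made for this example.

```python
from typing import Any, Dict, Optional, Tuple

# Assumed inactivity timeout; a real pipeline would make this configurable.
SESSION_TIMEOUT_SECONDS = 30 * 60

# Hypothetical in-memory state: user_id -> (session_number, last_seen_timestamp)
_sessions: Dict[str, Tuple[int, float]] = {}


def transform(event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Cleanse one event and tag it with a session id; return None to drop it."""
    user_id = event.get("user_id")
    if user_id is None:
        return None  # nothing to sessionize on

    # Mismatch cleansing: coerce the timestamp to a float, whatever form it arrived in.
    try:
        ts = float(event["timestamp"])
    except (KeyError, TypeError, ValueError):
        return None

    user_id = str(user_id)
    session_number, last_seen = _sessions.get(user_id, (0, None))

    # Start a new session on first sight or after a long enough gap.
    if last_seen is None or ts - last_seen > SESSION_TIMEOUT_SECONDS:
        session_number += 1
    _sessions[user_id] = (session_number, ts)

    enriched = dict(event)  # leave the caller's event untouched
    enriched["timestamp"] = ts
    enriched["session_id"] = f"{user_id}-{session_number}"
    return enriched
```

Calling `transform` on a stream of events for the same user keeps them in one session until the gap between two consecutive events exceeds the timeout, at which point the session number is incremented.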
Alooma Mapper
The Mapper helps data teams convert data of any type, from any source, exactly the way they need, as well as manage schema changes and infer schemas automatically.
- OneClick and Custom Modes - Whether data is structured or semi-structured, Alooma infers schemas and enables automatic mapping of integrations with its OneClick mode. Alternatively, data teams can use custom mappings for full control.
- Managed Schema Changes - In the real world, data changes happen and can even be unexpected. When data changes, Alooma responds in real time to make sure the stream never loses an event. Choose to manage changes automatically, or get notified and make changes on demand.
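To make schema inference and change detection concrete, the following Python sketch derives a flat column-to-type mapping from a semi-structured event and reports columns that differ from an already known schema. It only illustrates the idea and is not how the Mapper is implemented: the function names, the underscore flattening rule, and the type names (`BIGINT`, `VARCHAR`, and so on) are assumptions.

```python
from typing import Any, Dict


def infer_type(value: Any) -> str:
    """Map a Python value to an assumed warehouse column type."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "FLOAT"
    return "VARCHAR"  # fallback for strings, None, lists, and anything else


def infer_schema(event: Dict[str, Any], prefix: str = "") -> Dict[str, str]:
    """Flatten nested objects into column names and infer a type for each."""
    schema: Dict[str, str] = {}
    for key, value in event.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            schema.update(infer_schema(value, prefix=column + "_"))
        else:
            schema[column] = infer_type(value)
    return schema


def schema_changes(known: Dict[str, str], event: Dict[str, Any]) -> Dict[str, str]:
    """Return columns in the event that are new or whose inferred type differs."""
    inferred = infer_schema(event)
    return {col: typ for col, typ in inferred.items() if known.get(col) != typ}
```

A non-empty result from `schema_changes` is the point at which a pipeline could either apply the change automatically or notify someone and wait for a decision.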
Alooma Restream Queue
Never lose an event with Alooma’s safety net. The platform captures any event that fails, for any reason, and “restreams” it through the pipeline for exactly-once processing. Data is always safe with the Restream Queue.
- Safety Net - Code can be buggy, but the Restream Queue captures any errored events until the bug is fixed.
- Schema changes - Handle schema changes automatically, or have the affected events captured and retained until you choose how to map them.
- Data type mismatches - Data is rarely perfect, so when an event with a bad data value shows up, it will be stored for inspection and treatment.
- S3 Retention - Alooma provides an additional layer of protection with S3 Retention. Stream raw, untransformed events directly into an S3 bucket to create a backup of the data stream. S3 Retention enables replaying data through Alooma at any time, for any reason.
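The safety-net behavior described above can be sketched in a few lines of Python. This is a simplified, in-memory illustration under stated assumptions, not Alooma's implementation: the `RestreamQueue` class and its `process` and `restream` methods are hypothetical names, and a real system would persist held events durably.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, List

Event = Dict[str, Any]


class RestreamQueue:
    """Hold events whose processing failed so they can be replayed later."""

    def __init__(self) -> None:
        self.pending: Deque[Event] = deque()
        self.delivered: List[Event] = []

    def process(self, event: Event, transform: Callable[[Event], Event]) -> None:
        try:
            self.delivered.append(transform(event))
        except Exception:
            # Keep the original, untransformed event for a later replay.
            self.pending.append(event)

    def restream(self, transform: Callable[[Event], Event]) -> int:
        """Replay every held event once; events that fail again stay queued."""
        delivered_before = len(self.delivered)
        for _ in range(len(self.pending)):
            self.process(self.pending.popleft(), transform)
        return len(self.delivered) - delivered_before
```

In this sketch each event is either in `delivered` or in `pending`, never both, so fixing a buggy transform and calling `restream` delivers the held events without duplicating the ones that already succeeded.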
Alooma Integrations
Alooma functions as an Integration Platform as a Service, transparently connecting numerous data sources, both on-premises and in the cloud, into your data pipeline:
- Data Warehouses - Alooma supports Amazon Redshift, Google BigQuery, Snowflake, and more.
- Databases - MySQL, MemSQL, Amazon Aurora, MongoDB, Oracle, MS SQL Server, and more.
- SaaS and Cloud Services - Salesforce, Asana, Intercom, JIRA, NetSuite, Magento, Mixpanel, Zapier, Shopify, Trello, and many more.
- Marketing and Email Services - Google AdWords, Google Analytics, Facebook Ads, HubSpot, MailChimp and Mandrill, Marketo, SendGrid, Vero, and more.
- On-Premises and Cloud Storage - Azure Storage, Amazon S3, Box, FTP/SFTP, Google Cloud Storage, and Google Drive.
- BI and Analysis Tools - Chartio, Tableau, Redash, SQL Workbench, Looker, QuickSight, Mode Analytics, and more.
Alooma Live
Alooma Live is a real-time visualization tool that enables data scientists and engineers to monitor live data streams.
Alooma Live provides:
- Connectivity - Connects to either Kafka topics or Kinesis streams to show all events in real time.
- Visibility - Collect live data samples, measure real-time metrics, and filter streams on the fly.
- Control - Monitor data behavior, debug incorrect data events, and identify patterns as they form.