Overview

Securely store, process, and analyze all your structured and unstructured data at rest 

Hortonworks Data Platform (HDP) is an open source framework for distributed storage and processing of large, multi-source data sets. HDP modernizes your IT infrastructure and keeps your data secure—in the cloud or on-premises—while helping you drive new revenue streams, improve customer experience, and control costs. 

HDP enables agile application deployment, machine learning and deep learning workloads, real-time data warehousing, and security and governance. It is a key component of a modern data architecture for data at rest.

 

Why HDP?

The latest version of HDP delivers new enterprise capabilities for agile application deployment, machine learning and deep learning workloads, real-time data warehousing, and security and governance. It is a key component of the modern data architecture.

Key benefits & features

A container-based service makes it possible to build and roll out applications in minutes. Containerization lets you run multiple versions of an application, so you can rapidly create new features and develop and test new versions of services without disrupting old ones. HDP also supports third-party applications in Docker containers and native YARN containers. Erasure coding boosts storage efficiency by 50%, protecting data with less overhead than full replication and lowering TCO.
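One way to read the 50% storage figure is to compare raw disk usage under replication and under erasure coding. The sketch below assumes three-way replication as the baseline and a Reed-Solomon layout with 6 data units and 3 parity units; both settings are assumptions chosen for illustration, not values stated on this page, and the function names are hypothetical.

```python
def replication_overhead(replicas: int) -> float:
    """Extra raw storage per unit of user data when keeping N full copies."""
    return float(replicas - 1)


def erasure_coding_overhead(data_units: int, parity_units: int) -> float:
    """Extra raw storage per unit of user data for a data+parity layout."""
    return parity_units / data_units


def raw_storage_needed(logical_tb: float, overhead: float) -> float:
    """Raw capacity required to hold logical_tb of user data."""
    return logical_tb * (1 + overhead)


# Assumed settings: 3 full copies versus 6 data units + 3 parity units.
logical_tb = 100.0
with_replication = raw_storage_needed(logical_tb, replication_overhead(3))
with_erasure_coding = raw_storage_needed(logical_tb, erasure_coding_overhead(6, 3))

# Fraction of raw capacity saved by switching layouts.
saving = 1 - with_erasure_coding / with_replication

print(f"replication:    {with_replication:.0f} TB raw")
print(f"erasure coding: {with_erasure_coding:.0f} TB raw")
print(f"saving:         {saving:.0%}")
```

Under these assumptions, the same 100 TB of user data needs half as much raw capacity with erasure coding as with three full copies. Other layouts give different ratios.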

HDP provides the basis for supporting GPUs in Apache Hadoop clusters, enhancing the performance of computations required for data science and AI use cases. It enables GPU pooling for sharing of GPU resources with more workloads for cost effectiveness. It also supports GPU isolation, which dedicates a GPU to an application so that no other application has access to that GPU.

HDP includes a containerized TensorFlow tech preview that, combined with GPU pooling, delivers easier designing, building, and training for deep learning models.

HDP gives you the freedom to deploy big data workloads in hybrid and multi-cloud environments without vendor lock-in to a particular cloud architecture. Customers are able to seamlessly create and manage big data clusters in any cloud setting. HDP is cloud agnostic and automates provisioning to simplify big data deployments while optimizing the use of cloud resources.

HDP supports cloud storage for keeping virtually unlimited amounts of data in its native format, including Microsoft ADLS, WASB, AWS S3, and Google Cloud Storage. Cloudbreak simplifies provisioning of clusters in the cloud by deploying HDP to your cloud provider of choice.

HDP improves query performance. Hive LLAP, the fastest Apache Hive engine, runs in a multi-tenant environment without causing resource competition. It drastically speeds up queries commonly used in business intelligence scenarios, such as join and aggregation queries. In addition to query optimization, Hive also allows the creation of resource pools for fine-grained resource allocation.

HDP enables ACID transactions by default, making it easier to apply updates to Hive tables and support GDPR requirements. Hive, as a real-time database, eliminates the performance gap between low-latency and high-throughput workloads to process more data at a faster rate.

HDP continues to provide comprehensive security and governance. HDP’s security is integrated in layers and includes features for authentication, authorization, accountability, and data protection. The integration of security and governance allows security professionals to set classification-based security policies. In addition, data governance tools empower organizations to apply consistent data classification across the data ecosystem.

Additional features make event auditing more fine-grained and detailed, making it easier for auditors to do their job. Auditors and users can see the full chain of custody as data moves through the ecosystem. Tag propagation allows auditors and users to see where data is going across the enterprise and to retain the context of sensitive data. Time-based policies allow temporary access for a given user.
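The idea behind a time-based policy can be sketched as a grant that is only valid inside a fixed window. This is a conceptual illustration only: the `TimeBoundGrant` class, its fields, and the user and resource names are hypothetical and are not the API of HDP or any of its components.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TimeBoundGrant:
    """Hypothetical grant giving one user access to one resource for a limited time."""
    user: str
    resource: str
    not_before: datetime
    not_after: datetime

    def allows(self, user: str, resource: str, at: datetime) -> bool:
        """Return True only for the named user and resource, inside the window."""
        return (
            user == self.user
            and resource == self.resource
            and self.not_before <= at <= self.not_after
        )


# Hypothetical example: a contractor may read one table during a single week.
grant = TimeBoundGrant(
    user="contractor_a",
    resource="sales.orders",
    not_before=datetime(2024, 3, 1, tzinfo=timezone.utc),
    not_after=datetime(2024, 3, 8, tzinfo=timezone.utc),
)

during = datetime(2024, 3, 4, 12, 0, tzinfo=timezone.utc)
after = datetime(2024, 3, 9, 12, 0, tzinfo=timezone.utc)

print(grant.allows("contractor_a", "sales.orders", during))
print(grant.allows("contractor_a", "sales.orders", after))
```

Because the window is part of the grant itself, access lapses without anyone having to remember to revoke it.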

Data Hub facilitates management, monitoring, and orchestration of all services from a single pane of glass across all environments.
