For Cloudera, ensuring data security is critical because we have large customers in highly regulated industries such as financial services and healthcare, where security is paramount. Organizations in other industries, such as retail, telecom, or the public sector, deal with large amounts of customer data and operate multi-tenant environments, sometimes with end users who are outside of their company, so securing all the data can be a very time-intensive process. At Cloudera, we want to help all customers spend more time analyzing data than protecting it. Cloudera secures your data by providing encryption at rest and in transit, multi-factor authentication, single sign-on, robust authorization policies, and network security.
Cloudera Data Warehouse (CDW) is a cloud native data warehouse service that runs Cloudera’s powerful query engines on a containerized architecture to do analytics on any type of data. It is part of the Cloudera Data Platform, or CDP, which runs on Azure and AWS, as well as in the private cloud. The CDW service helps you:
This post explains how CDW helps you maximize the security of your cloud data warehousing platform when running in Azure.
CDW has long had many pieces of this security puzzle solved, including private load balancers, support for Private Link, and firewalls. As of a recent release, it also supports Private Azure Kubernetes Service (AKS) clusters. Private AKS ensures private communication between the Kubernetes control plane and the Kubernetes nodes, which run in the user’s Virtual Network (VNET). As a result, it is now possible to run a private CDW environment in Azure.
For the most security-conscious customers, it is a requirement that all network access take place over private networks. This reduces the attack surface and rules out many of the most common attack vectors, which rely on public access to the customer’s systems. When using AKS there are two types of network access:
For network access type #1, Cloudera has already released the ability to use a private load balancer. This ensures that your users who are interacting with the services running within the AKS cluster – such as HUE, or Impala and Hive via JDBC/ODBC – can only do so when using a private network. The image below shows the relevant network communication when using a private (or internal) load balancer and only private IP addresses.
For network access type #2, CDW originally only supported communication over public endpoints, which meant that your CDW environment was not completely walled off within a private network. However, now that CDW supports Private AKS, all communication with the Kubernetes control plane remains on a private network.
We can now create a private CDW environment in Azure, so customers can run their analytics without having to worry about securing the data. The following sections provide additional details on how this is implemented, as well as the steps to take to set it up for yourself.
CDW uses various Azure services to provide the infrastructure it requires. In addition to AKS and the load balancers mentioned above, this includes VNET, Data Lake Storage, Azure Database for PostgreSQL, and more. We are careful to ensure that each of these is also used in a secure manner, as explained below.
CDP provides a component called Cluster Connectivity Manager version 2 (or CCMv2) which enables the CDP Control Plane to communicate with the Kubernetes control plane and other resources in your network, such as virtual machines, using an inverting proxy solution. This ensures that all traffic goes through a secured HTTPS tunnel. In addition, you can use the Azure Private Link service to ensure that the CDP Control Plane can only be accessed through private endpoints.
For network egress from the AKS cluster running in your environment, a transparent proxy controls which traffic can pass. Rules are added for the required CDP control plane services, for the AKS service, and for storage account endpoints, so that this outbound traffic is permitted and all other outbound traffic is blocked.
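The allowlist behavior of such an egress proxy can be illustrated with a minimal Python sketch. It is not the proxy CDW uses, and the patterns are assumptions made for illustration: `*.cdp.example.com` is a hypothetical stand-in for the CDP control plane services, and the Azure suffixes are only examples of the kinds of endpoints the rules would cover. The actual rules are defined in the Cloudera documentation.

```python
from fnmatch import fnmatch

# Illustrative allowlist only. These patterns are assumptions for this sketch,
# not the actual rules that CDW installs on the egress proxy.
ALLOWED_EGRESS_PATTERNS = [
    "*.cdp.example.com",        # hypothetical CDP control plane services
    "*.azmk8s.io",              # example suffix for AKS API server endpoints
    "*.dfs.core.windows.net",   # example suffix for Data Lake Storage endpoints
    "*.blob.core.windows.net",  # example suffix for Blob storage endpoints
]


def is_egress_allowed(hostname, patterns=ALLOWED_EGRESS_PATTERNS):
    """Return True only if the destination host matches an allowlist pattern."""
    # Normalize case and drop a trailing dot from fully qualified names.
    host = hostname.lower().rstrip(".")
    # Default deny: anything that matches no pattern is rejected.
    return any(fnmatch(host, pattern) for pattern in patterns)
```

The design point is the default-deny posture: a destination is permitted only when it matches an explicit rule, which is what keeps unexpected outbound traffic from leaving the cluster.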
By default, Azure Data Lake Storage, PostgreSQL Database, and Virtual Machines are accessible over public endpoints. Private CDW environments, however, are required to use private endpoints. With private endpoints in place, communication among these resources, and between them and the CDW services running within the AKS cluster, takes place over private networks. This uses the Azure Private Link service.
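One way to sanity-check that a resource is being reached through a private endpoint is to confirm, from a machine inside the VNET, that its hostname resolves only to private IP addresses. The following is a minimal sketch of that check using the Python standard library. The function names are made up for this example, and the check is an illustration rather than part of the CDW setup.

```python
import ipaddress
import socket
from typing import List


def resolve_addresses(hostname: str) -> List[str]:
    """Resolve a hostname using the DNS servers configured on this machine."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry's sockaddr starts with the IP address; drop any IPv6 zone id.
    return sorted({info[4][0].split("%")[0] for info in infos})


def all_private(addresses: List[str]) -> bool:
    """True if there is at least one address and every address is private.

    Note: is_private is broader than RFC 1918 and also covers ranges such as
    loopback and link-local addresses.
    """
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )
```

For example, running `all_private(resolve_addresses("<storage-account>.dfs.core.windows.net"))` on a virtual machine in the VNET, with your own storage account name substituted, would be expected to return `True` once the private endpoint and its DNS records are in place.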
Custom DNS is configured on the VNET to resolve Azure Private DNS zones. To resolve private endpoint DNS records, the VNET DNS servers must be capable of resolving Azure DNS records. Additionally, user-defined routing (UDR) is configured to forward all traffic to an egress firewall, and the route table is linked to the subnet.
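To see why a user-defined default route sends outbound traffic to the firewall, recall that Azure selects the route with the longest matching prefix for each destination. The sketch below models that selection in Python. The route table, address space, and next-hop labels are hypothetical values chosen for illustration, not the configuration CDW creates.

```python
import ipaddress

# Hypothetical route table: (address prefix, next hop label).
ROUTES = [
    ("10.10.0.0/16", "VnetLocal"),    # assumed VNET address space
    ("0.0.0.0/0", "EgressFirewall"),  # user-defined default route
]


def next_hop(destination, routes=ROUTES):
    """Return the next hop of the longest-prefix route matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, hop in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, hop))
    if not matches:
        raise LookupError("no route for " + destination)
    # The most specific (longest) prefix wins.
    return max(matches, key=lambda match: match[0])[1]
```

With this table, traffic to addresses inside the VNET stays local, while any other destination falls through to the `0.0.0.0/0` route and is handed to the egress firewall.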
The image below shows a representative architecture diagram for how a private CDW environment on Azure looks.
CDW support for Private AKS and the other aspects required for a private CDW environment is currently offered as a Technical Preview, and is under entitlement. In order to try this out, please contact your Cloudera representative.
In the meantime, the setup steps are summarized below at a high level, so you can get a sense of how easy it is to get this up and running. The full steps are included in our public documentation.
With support for Private AKS, as well as a host of other network security enhancements, CDW can now run in fully private mode within Azure. This helps bring the benefits of CDW to the most security-conscious customers. Please try CDW out and let us know how it works for you.
The post Create your Private Data Warehousing Environment Using Azure Kubernetes Service appeared first on Cloudera Blog.