You are viewing documentation for Immuta version 2.8.

Immuta CDH Integration Installation

Audience: System Administrators

Content Summary: The Immuta CDH integration installation consists of the following components:

  • Immuta NameNode plugin
  • Immuta Hadoop Filesystem plugin
  • Immuta Spark 1.6 Partition Service (DEPRECATED)
  • Immuta Spark 2 Partition Service

This page outlines the installation steps required to successfully deploy these components on your CDH cluster.

Prerequisites

Follow the Immuta CDH Integration Prerequisites to prepare for installation.

Installation

Begin installation by transferring the Immuta .parcel file and its associated .parcel.sha file to your Cloudera Manager node and placing them in /opt/cloudera/parcel-repo. Once copied, ensure that both the owner and group of the files are set to cloudera-scm:

chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
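
Before distributing the parcel, it can be worth confirming that the copy was not corrupted in transit. The sketch below assumes the .parcel.sha file holds the parcel's SHA-1 hex digest as its first field, which is the usual Cloudera parcel convention; the function name verify_parcel_sha is hypothetical.

```shell
# Compare a parcel's SHA-1 digest with the contents of its .sha file.
# Assumption: the .sha file holds the hex digest as its first field.
verify_parcel_sha() {
    parcel="$1"
    sha_file="${parcel}.sha"
    [ -f "$parcel" ] && [ -f "$sha_file" ] || {
        echo "missing $parcel or $sha_file" >&2
        return 2
    }
    # Normalize the expected digest to lowercase, as sha1sum prints lowercase hex.
    expected=$(awk 'NR == 1 { print tolower($1) }' "$sha_file")
    actual=$(sha1sum "$parcel" | awk '{ print $1 }')
    if [ "$expected" = "$actual" ]; then
        echo "OK: $parcel"
    else
        echo "MISMATCH: $parcel" >&2
        return 1
    fi
}
```

Call it with the full path to the parcel in /opt/cloudera/parcel-repo; a non-zero exit status means the file should be transferred again.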

Next, transfer the Immuta CSD (.jar file) to /opt/cloudera/csd, and ensure that both its owner and group are set to cloudera-scm as well:

chown -R cloudera-scm:cloudera-scm /opt/cloudera/csd

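You will need to restart the Cloudera Manager server for the CSD to be picked up. Use the command that matches your host's init system.

On hosts that use systemd:

systemctl restart cloudera-scm-server

On hosts that use SysV init:

service cloudera-scm-server restart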
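
If you script the installation across hosts with different init systems, a small helper can pick the appropriate restart command. This is a minimal sketch: restart_command is a hypothetical name, and it relies on /run/systemd/system being present only when systemd is the running init.

```shell
# Print the restart command that fits the host's init system.
# /run/systemd/system exists only when systemd is the running init.
restart_command() {
    if [ -d /run/systemd/system ]; then
        echo "systemctl restart cloudera-scm-server"
    else
        echo "service cloudera-scm-server restart"
    fi
}
```

The function only prints the command, so you can review it before running it as root.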

Follow Cloudera's instructions for distributing and activating the IMMUTA parcel.

Once the parcel has been successfully activated, you can add the IMMUTA service:

  1. From Cloudera Manager, select Add Service.
  2. Choose Immuta.
  3. Click Continue.
  4. Select the nodes to install the services on. Your options are:
    • For maximum redundancy, choose all.
    • Choose a single node.
    • Choose a few nodes. Set up a Load Balancer in front of the instances to distribute load. Contact Immuta support for more details.
  5. Proceed to the end of the workflow.

Configuring HDFS

After adding the Immuta service to your CDH cluster, complete the configuration described below.

If your cluster is configured with Kerberos, note that the default configuration expects to run Immuta services using the immuta principal. If you need to use a different Kerberos principal, see Running as a Non-Default User for detailed instructions on how to configure that. After running through these steps, note that you may need to manually run the Create Immuta User Home Directory command from the Actions menu for the Immuta service.

For more details on Immuta's HDFS configuration, please see Hadoop Cluster Configuration for Immuta.

NameNode-Only Configuration

Warning

The following settings should only be written to the configuration on the NameNode. Setting these values on DataNodes will have security implications, so be sure that they are set in the NameNode only section of your Hadoop configuration tool. For example:

Under the HDFS service of Cloudera Manager, Configuration tab, search for key:

NameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml

and, using "View as XML", add/set the value(s) similar to:

<property>
    <name>dfs.namenode.authorization.provider.class</name>
    <value>com.immuta.hadoop.ImmutaAuthorizationProvider</value>
    <final>true</final>
</property>
<property>
    <name>immuta.permission.fallback.class</name>
    <value>org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider</value>
    <final>true</final>
</property>
<property>
    <name>immuta.permission.allow.fallback</name>
    <value>false</value>
    <final>true</final>
</property>
<property>
    <name>immuta.system.api.key</name>
    <value>0ec28d3f-a8a2-4960-b653-d7ccfe4803b3</value>
    <final>true</final>
</property>
<property>
    <name>immuta.permission.users.to.ignore</name>
    <value>hdfs,yarn,hive,impala,llama,mapred,spark,oozie,hue,hbase,livy,immuta</value>
    <final>true</final>
</property>

Mark Immuta values final

We recommend that all Immuta configuration values be marked final.

Detailed Explanation:

  • dfs.namenode.authorization.provider.class
    • Configures Hadoop to use the Immuta Authorization Provider.
    • Default: com.immuta.hadoop.ImmutaAuthorizationProvider
  • immuta.permission.fallback.class
    • This class will be used as a fallback authorization/permission checker if Immuta is not protecting the target directory. This will also be used if fallback is explicitly enabled. If the deployment also requires Sentry, this should be set to org.apache.sentry.hdfs.SentryAuthorizationProvider.
    • Default: org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider
  • immuta.permission.allow.fallback
    • Set to true if a user's access should be determined by the permission fallback class even if they are explicitly denied access by Immuta. WARNING! Setting this to true is DANGEROUS in that a user may be forbidden from seeing data through Immuta but still able to see the data in HDFS.
    • Default: false
  • immuta.system.api.key
    • This must be set to the value of the hdfsSystemToken configuration item in Immuta. This API key is used to create user API keys in Immuta, so it is important that it can be trusted and cannot be accessed by users. This must be set when using the Immuta FileSystem. Use the value of HDFS_SYSTEM_TOKEN generated earlier.
    • Example: 0ec28d3f-a8a2-4960-b653-d7ccfe4803b3
  • immuta.permission.users.to.ignore
    • Comma-separated list of HDFS user accounts that will bypass the Immuta authorization provider. If the principal being used as the Immuta system user is anything other than "immuta", that user should be appended to this list. This should match the principal in the username configuration mentioned below under Immuta Web Service configuration.
    • Default: hdfs,yarn,hive,impala,llama,mapred,spark,oozie,hue,hbase,livy,immuta
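
When the Immuta system user is not "immuta", the value of immuta.permission.users.to.ignore has to be extended. The sketch below shows one way to build the new value without duplicating an entry; append_user and the principal svc_immuta are hypothetical names used only for illustration.

```shell
# Append a user to a comma-separated list unless it is already present.
append_user() {
    list="$1"
    user="$2"
    case ",$list," in
        *",$user,"*) printf '%s\n' "$list" ;;                   # already listed
        *)           printf '%s\n' "${list:+$list,}$user" ;;    # add at the end
    esac
}

# Example (hypothetical principal name):
#   append_user "hdfs,yarn,hive,impala,llama,mapred,spark,oozie,hue,hbase,livy,immuta" svc_immuta
```

Paste the printed value into the immuta.permission.users.to.ignore property in the NameNode safety valve.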

Shared Configuration

The following configuration items should be configured for both the NameNode processes and the DataNode processes. These configurations are used both by the Immuta FileSystem and the Immuta NameNode plugin. For example:

Under the HDFS service of Cloudera Manager, Configuration tab, search for key:

Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml

and, using "View as XML", add/set the value(s) similar to:

<property>
    <name>immuta.base.url</name>
    <value>https://immuta.hostname</value>
    <final>true</final>
</property>
<property>
    <name>immuta.spark.partition.generator.user</name>
    <value>immuta</value>
    <final>true</final>
</property>
<property>
    <name>immuta.credentials.dir</name>
    <value>/user</value>
    <final>true</final>
</property>
<property>
    <name>immuta.visibility.cache.timeout.seconds</name>
    <value>600</value>
    <final>true</final>
</property>
<property>
    <name>fs.immuta.impl</name>
    <value>com.immuta.hadoop.ImmutaFileSystem</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.proxyuser.immuta.hosts</name>
    <value>*</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.proxyuser.immuta.users</name>
    <value>*</value>
    <final>true</final>
</property>
<property>
    <name>hadoop.proxyuser.immuta.groups</name>
    <value>*</value>
    <final>true</final>
</property>

Mark Immuta values final

We recommend that all Immuta configuration values be marked final.

Detailed Explanation:

  • immuta.base.url
    • Specifies the base URL of the Immuta API.
    • Example: https://immuta.hostname
  • immuta.spark.partition.generator.user
    • Specifies the system user and/or Kerberos principal that the Immuta Partition Service will run as. By default, configuration and other support files will be placed in this user's HDFS directory.
    • Default: immuta
  • immuta.credentials.dir
    • This directory must contain a subdirectory named after each user's username. Each subdirectory must be owned by its user and readable only by that user. An Immuta credential file will be written that is readable only by the owning user.
    • Default: /user
  • immuta.visibility.cache.timeout.seconds
    • Specifies the amount of time the user's visibility report is cached in the Spark context.
    • Default: 600
  • fs.immuta.impl
    • Specifies the class used for the immuta file system protocol.
    • Default: com.immuta.hadoop.ImmutaFileSystem
  • hadoop.proxyuser.<IMMUTA_SERVICE_PRINCIPAL>.hosts
    • Specifies the hosts from which the Immuta service principal is allowed to proxy other users (either a comma-separated list or *).
    • Default: *
  • hadoop.proxyuser.<IMMUTA_SERVICE_PRINCIPAL>.users
    • Specifies the end users the Immuta service principal is allowed to proxy (either a comma-separated list or *).
    • Default: *
  • hadoop.proxyuser.<IMMUTA_SERVICE_PRINCIPAL>.groups
    • Specifies the user groups the Immuta service principal is allowed to proxy (either a comma-separated list or *).
    • Default: *
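
If Immuta runs as a principal other than immuta, the three hadoop.proxyuser.* property names change with it. The following sketch prints the matching property blocks for a given principal; proxyuser_properties and svc_immuta are hypothetical names, and the "*" values simply mirror the defaults listed above.

```shell
# Print the three hadoop.proxyuser.* properties for a given service principal.
# Values are emitted as "*", matching the defaults above; narrow them as needed.
proxyuser_properties() {
    principal="$1"
    for suffix in hosts users groups; do
        printf '<property>\n    <name>hadoop.proxyuser.%s.%s</name>\n    <value>*</value>\n    <final>true</final>\n</property>\n' \
            "$principal" "$suffix"
    done
}

# Example (hypothetical principal name):
#   proxyuser_properties svc_immuta
```

The output can be pasted into the core-site.xml safety valve in place of the hadoop.proxyuser.immuta.* entries.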

Make sure that user directories underneath immuta.credentials.dir are readable only by the owner of the directory. If a user's directory does not exist, Immuta creates it with permissions set to 700.
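
To prepare a user's directory ahead of time, the standard hdfs dfs subcommands can be used. The sketch below only prints the commands unless RUN=1 is set; ensure_credentials_dir is a hypothetical helper, it assumes paths and usernames contain no whitespace, and the chown step generally needs to be run as an HDFS superuser.

```shell
# Print (or run) the HDFS commands that create a private per-user directory.
# Set RUN=1 to execute; by default the commands are only printed.
ensure_credentials_dir() {
    base="$1"   # value of immuta.credentials.dir, e.g. /user
    user="$2"
    for cmd in \
        "hdfs dfs -mkdir -p $base/$user" \
        "hdfs dfs -chown $user $base/$user" \
        "hdfs dfs -chmod 700 $base/$user"
    do
        if [ "${RUN:-0}" = "1" ]; then
            # Unquoted on purpose so the string splits into command and arguments.
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}
```

For example, ensure_credentials_dir /user alice prints the three commands for a user named alice.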

Enabling TLS for the Immuta Partition Service

You can enable TLS on the Immuta Partition Service by configuring it to use a keystore in JKS format.
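
If you do not already have a keystore, the JDK's keytool can create one in JKS format. The following is a sketch for test environments only: it generates a self-signed certificate, the alias and function name are hypothetical, and it assumes that the keystore password and KeyManager password described below correspond to keytool's -storepass and -keypass values respectively. Passing passwords on the command line exposes them to other local users, so prefer an interactive prompt on shared hosts.

```shell
# Create a JKS keystore holding a self-signed RSA key pair.
# All argument values are placeholders chosen by the caller.
create_partition_keystore() {
    keystore="$1"     # e.g. /etc/immuta/keystore.jks
    store_pass="$2"   # assumed to map to ...keystore.password
    key_pass="$3"     # assumed to map to ...keymanager.password
    host="$4"         # hostname placed in the certificate subject
    keytool -genkeypair \
        -alias immuta-partition-service \
        -keyalg RSA -keysize 2048 -validity 365 \
        -dname "CN=$host" \
        -keystore "$keystore" -storetype JKS \
        -storepass "$store_pass" -keypass "$key_pass"
}
```

keytool requires passwords of at least six characters. For production use, import a certificate issued by your organization's certificate authority instead of relying on a self-signed one.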

Server-side TLS Configuration

These settings need to be set up for both the Spark 2 and Spark 1.6 (DEPRECATED) Partition Servers.

Under the Immuta service of Cloudera Manager, Configuration tab, search for key:

Immuta Spark 2 Partition Server Advanced Configuration Snippet (Safety Valve) for context/generator.xml

and, using "View as XML", add/set the value(s) similar to:

<property>
    <name>immuta.secure.partition.generator.keystore</name>
    <value>/etc/immuta/keystore.jks</value>
    <final>true</final>
</property>
<property>
    <name>immuta.secure.partition.generator.keystore.password</name>
    <value>secure_password</value>
    <final>true</final>
</property>
<property>
    <name>immuta.secure.partition.generator.keymanager.password</name>
    <value>secure_password</value>
    <final>true</final>
</property>

Under the Immuta service of Cloudera Manager, Configuration tab, search for key:

Immuta Spark 1.6 Partition Server Advanced Configuration Snippet (Safety Valve) for context/generator.xml

and, using "View as XML", add/set the value(s) similar to:

<property>
    <name>immuta.secure.partition.generator.keystore</name>
    <value>/etc/immuta/keystore.jks</value>
    <final>true</final>
</property>
<property>
    <name>immuta.secure.partition.generator.keystore.password</name>
    <value>secure_password</value>
    <final>true</final>
</property>
<property>
    <name>immuta.secure.partition.generator.keymanager.password</name>
    <value>secure_password</value>
    <final>true</final>
</property>

Mark Immuta values final

We recommend that all Immuta configuration values be marked final.

Detailed Explanation:

  • immuta.secure.partition.generator.keystore
    • Specifies the path to the Immuta Partition Service keystore.
    • Example: /etc/immuta/keystore.jks
  • immuta.secure.partition.generator.keystore.password
    • Specifies the password for the Immuta Partition Service keystore. This password will be a publicly available piece of information, but file permissions should be used to make sure that only the user running the service can read the keystore file.
    • Example: secure_password
  • immuta.secure.partition.generator.keymanager.password
    • Specifies the KeyManager password for the Immuta Partition Service keystore. This password will be a publicly available piece of information, but file permissions should be used to make sure that only the user running the service can read the keystore file. This is not always necessary.
    • Example: secure_password

We recommend using file permissions to secure the keystore from improper access:

chown immuta:immuta /etc/immuta/keystore.jks
chmod 600 /etc/immuta/keystore.jks
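
After applying the commands above, you may want to confirm the result, for example from a configuration-management check. The sketch below uses GNU stat (standard on Linux hosts); check_keystore_access is a hypothetical helper name.

```shell
# Succeed only if the file has the expected owner:group and octal mode.
# Uses GNU stat: %U = owner name, %G = group name, %a = octal permissions.
check_keystore_access() {
    file="$1"
    owner_group="$2"   # e.g. immuta:immuta
    mode="$3"          # e.g. 600
    actual=$(stat -c '%U:%G %a' "$file") || return 2
    [ "$actual" = "$owner_group $mode" ]
}

# Example:
#   check_keystore_access /etc/immuta/keystore.jks immuta:immuta 600 || echo "fix keystore permissions"
```

A non-zero exit status indicates that the owner, group, or mode differs from what you passed in.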

Client-side TLS Configuration

You must also set the following property in each of the client configuration sections below:

For Spark 2, under the Immuta service of Cloudera Manager, Configuration tab, search for key:

Immuta Client Advanced Configuration Snippet (Safety Valve) for immuta-conf/session/generator.xml

and, using "View as XML", add/set the value(s) similar to:

<property>
    <name>immuta.secure.partition.generator.keystore</name>
    <value>true</value>
    <final>true</final>
</property>

For Spark 1.6 (DEPRECATED), under the Immuta service of Cloudera Manager, Configuration tab, search for key:

Immuta Client Advanced Configuration Snippet (Safety Valve) for immuta-conf/context/generator.xml

and, using "View as XML", add/set the value(s) similar to:

<property>
    <name>immuta.secure.partition.generator.keystore</name>
    <value>true</value>
    <final>true</final>
</property>

Mark Immuta values final

We recommend that all Immuta configuration values be marked final.

Detailed Explanation:

  • immuta.secure.partition.generator.keystore
    • Set to true to enable TLS.
    • Default: true

Impala Configuration

You must give the service principal that the Immuta Web Service is configured to use permission to delegate in Impala. To accomplish this, add the Immuta Web Service principal to authorized_proxy_user_config in the Impala daemon command line arguments.

Under the Impala service of Cloudera Manager, Configuration tab, search for key:

Impala Daemon Command Line Argument Advanced Configuration Snippet (Safety Valve)

and add/set the value(s) similar to:

-authorized_proxy_user_config=<IMMUTA_SERVICE_PRINCIPAL>=*

Note

If the authorized_proxy_user_config parameter is already present for other services, append the Immuta configuration value to the end:

-authorized_proxy_user_config=hue=*;<IMMUTA_SERVICE_PRINCIPAL>=*
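
Both cases above (no existing value, or an existing value that must be preserved) can be handled by one small helper. This is a sketch; impala_proxy_flag is a hypothetical name, and the existing value is whatever is currently set in your Impala safety valve.

```shell
# Build the -authorized_proxy_user_config flag, preserving any existing entries.
# $1: existing value (may be empty), $2: Immuta service principal.
impala_proxy_flag() {
    existing="$1"
    principal="$2"
    # Entries are separated by ';' and the Immuta entry is appended last.
    printf '%s%s%s=*\n' '-authorized_proxy_user_config=' "${existing:+$existing;}" "$principal"
}

# Example:
#   impala_proxy_flag 'hue=*' immuta
```

Quote the existing value when calling the function so that the shell does not expand the * or interpret the ; characters.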

Spark 2 Configuration

No additional configuration is required.

Note: Immuta will work with any Spark 2 version you may have already installed on your cluster.

Spark 1.6 Configuration (DEPRECATED)

Deprecated

Spark 1.6 support is deprecated as of Immuta v2.7.0 and is slated for removal in Immuta v2.8.0.

Caution

Enabling Immuta's Spark Access Pattern in spark-defaults.conf will prevent Spark-based tools, such as Hive on Spark, from functioning properly. Skip this step if you are using such tools.

Under the Spark service of Cloudera Manager, Configuration tab, search for key:

Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-defaults.conf

and add/set the value(s) similar to:

spark.broadcast.factory=org.apache.spark.broadcast.ImmutaSerializableBroadcastFactory
spark.executor.extraClassPath=file:///etc/immuta/conf.cloudera.immuta_partition_service/context/generator.xml:/opt/cloudera/parcels/IMMUTA/lib/immuta-hadoop-filesystem.jar:/opt/cloudera/parcels/IMMUTA/lib/immuta-spark-context.jar
spark.driver.extraClassPath=file:///etc/immuta/conf.cloudera.immuta_partition_service/context/generator.xml:/opt/cloudera/parcels/IMMUTA/lib/immuta-hadoop-filesystem.jar:/opt/cloudera/parcels/IMMUTA/lib/immuta-spark-context.jar
spark.driver.extraJavaOptions=-Djava.security.manager=com.immuta.security.ImmutaSecurityManager -Dimmuta.security.manager.classes.config=file:///etc/immuta/conf/allowedCallingClasses.json -Dimmuta.spark.encryption.fpe.class=com.immuta.spark.encryption.ff1.ImmutaFF1Service
spark.executor.extraJavaOptions=-Djava.security.manager=com.immuta.security.ImmutaSecurityManager -Dimmuta.security.manager.classes.config=file:///etc/immuta/conf/allowedCallingClasses.json -Dimmuta.spark.encryption.fpe.class=com.immuta.spark.encryption.ff1.ImmutaFF1Service
spark.hadoop.fs.hdfs.impl=org.apache.hadoop.hdfs.ImmutaSparkTokenFileSystem

Under the Spark service of Cloudera Manager, Configuration tab, search for key:

Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh

and add/set the value(s) similar to:

export PYTHONPATH=/opt/cloudera/parcels/IMMUTA/python/context
export PYTHONSTARTUP=/opt/cloudera/parcels/IMMUTA/python/context/initialize-context.py

Immuta Partition Service configuration

The Immuta Partition Service requires the same system API key that is configured for the Immuta NameNode plugin. Be sure that the value of immuta.system.api.key is consistent across your configuration.

For Spark 2, under the IMMUTA service of Cloudera Manager, Configuration section, search for key:

Immuta Spark 2 Partition Server Advanced Configuration Snippet (Safety Valve) for session/generator.xml

and, using "View as XML", add/set the value(s) similar to:

<property>
    <name>immuta.system.api.key</name>
    <value>0ec28d3f-a8a2-4960-b653-d7ccfe4803b3</value>
    <final>true</final>
</property>

For Spark 1.6 (DEPRECATED), under the IMMUTA service of Cloudera Manager, Configuration section, search for key:

Immuta Spark 1.6 Partition Server Advanced Configuration Snippet (Safety Valve) for context/generator.xml

and, using "View as XML", add/set the value(s) similar to:

<property>
    <name>immuta.system.api.key</name>
    <value>0ec28d3f-a8a2-4960-b653-d7ccfe4803b3</value>
    <final>true</final>
</property>

Mark Immuta values final

We recommend that all Immuta configuration values be marked final.
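
Because immuta.system.api.key appears in several places, a mismatch is easy to introduce. The sketch below compares the key across XML snippets that you have saved to local files; the function names are hypothetical, and it assumes each <value> line directly follows its <name> line, as in the examples on this page.

```shell
# Print the value of immuta.system.api.key found in a Hadoop-style XML snippet.
# Assumes <name> and <value> are on consecutive lines.
system_api_key() {
    grep -A1 '<name>immuta.system.api.key</name>' "$1" |
        sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Succeed only if every given file yields the same, non-empty key.
keys_match() {
    first=$(system_api_key "$1")
    [ -n "$first" ] || return 1
    for f in "$@"; do
        [ "$(system_api_key "$f")" = "$first" ] || return 1
    done
}

# Example (hypothetical file names):
#   keys_match namenode-hdfs-site.xml spark2-generator.xml || echo "API keys differ"
```

Treat any saved snippets as secrets, since they contain the system API key.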

Immuta Web Service configuration

The Immuta Web Service needs to be configured to support the HDFS plugin. You can set this configuration using the Immuta Configuration UI.

Configuring these values through the Application Settings of the Web UI is generally sufficient. If recommended by an Immuta representative, however, the example YAML snippet below can be used as an alternative to the Immuta Configuration UI.

client:
    kerberosRealm: YOURCOMPANY.COM
plugins:
    hdfsHandler:
        hdfsSystemToken: 0ec28d3f-a8a2-4960-b653-d7ccfe4803b3
kerberos:
    ticketRefreshInterval: 43200000
    username: immuta
    keyTabPath: /etc/immuta/immuta.keytab
    krbConfigPath: /etc/krb5.conf
    krbBinPath: /usr/bin/

Detailed Explanation:

  • client
    • kerberosRealm
      • Specifies the default realm to use for Kerberos authentication.
      • Example: YOURCOMPANY.COM
  • plugins
    • hdfsHandler
      • hdfsSystemToken
        • Token used by NameNode plugin to authenticate with the Immuta REST API. This must equal the value set in immuta.system.api.key. Use the value of HDFS_SYSTEM_TOKEN generated earlier.
        • Example: 0ec28d3f-a8a2-4960-b653-d7ccfe4803b3
  • kerberos
    • ticketRefreshInterval
      • Time in milliseconds to wait between kinit executions. This should be lower than the ticket refresh interval required by the Kerberos server.
      • Default: 43200000
    • username
      • User principal used for kinit.
      • Default: immuta
    • keyTabPath
      • The path to the keytab file on disk to be used for kinit.
      • Default: /etc/immuta/immuta.keytab
    • krbConfigPath
      • The path to the krb5 configuration file on disk.
      • Default: /etc/krb5.conf
    • krbBinPath
      • The path to the Kerberos installation binary directory.
      • Default: /usr/bin/
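
Since ticketRefreshInterval is expressed in milliseconds, it is easy to misjudge by a factor of 1000. The helpers below (hypothetical names) convert between hours and milliseconds; the default of 43200000 corresponds to 12 hours.

```shell
# Convert between hours and the millisecond values used by ticketRefreshInterval.
ms_to_hours() { echo $(( $1 / 3600000 )); }
hours_to_ms() { echo $(( $1 * 3600000 )); }

# Examples:
#   ms_to_hours 43200000   # prints 12
#   hours_to_ms 8          # prints 28800000
```

Choose a value below the ticket lifetime enforced by your Kerberos server, then convert it with hours_to_ms before placing it in the configuration.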

Additionally, you must upload a keytab for the immuta user as well as a krb5.conf configuration file to the Immuta Web Service. This can also be done via the Immuta Configuration UI.

Native Workspace Configuration

If you want users to be able to create derived data sources and/or native Hive or Impala tables within Immuta's native project workspaces, you will need to grant a Sentry admin role to the immuta user. This requires adding the immuta user to Admin Groups and Allowed Connecting Users under Sentry's configuration in Cloudera Manager.

You should also create a new Sentry role for immuta, with all privileges granted. Run the SQL snippet below in beeline or impala-shell as either the immuta user or as any user with Sentry admin privileges.

CREATE ROLE immuta;
GRANT ALL ON SERVER <server name> TO ROLE immuta WITH GRANT OPTION;
GRANT ROLE immuta TO GROUP immuta;
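
The <server name> placeholder must be replaced with the Sentry server name configured for your cluster. The sketch below writes the statements with the name filled in so they can be passed to beeline or impala-shell as a file; sentry_role_sql is a hypothetical helper, and the server name and JDBC URL in the usage comment are placeholders for your own values.

```shell
# Print the Sentry role statements with the server name filled in.
sentry_role_sql() {
    server="$1"
    cat <<EOF
CREATE ROLE immuta;
GRANT ALL ON SERVER $server TO ROLE immuta WITH GRANT OPTION;
GRANT ROLE immuta TO GROUP immuta;
EOF
}

# Example (placeholder server name and connection string):
#   sentry_role_sql my_sentry_server > immuta_role.sql
#   beeline -u "<JDBC URL>" -f immuta_role.sql
```

Review the generated file before running it, since GRANT ALL ON SERVER gives the role full privileges.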

You will also need to enable the ImmutaGroupsMapping service in Hive and/or Impala's configuration to allow Immuta to manage Sentry permissions for Immuta users. For instructions on how to do this, please see Enabling ImmutaGroupsMapping.