Setting up a multi-tenant Zeppelin environment on Amazon EMR - ZEPL

We’re seeing more and more people trying to use Zeppelin on EMR with YARN but having some challenges with the setup. Here’s a tried and true set of instructions:

1. Create EMR cluster with Hadoop / Spark / Zeppelin enabled.

Create the EMR cluster using the AWS CLI or the web console, with Hadoop, Spark, and Zeppelin enabled.

– Set up your VPC with a subnet that has “Auto-assign public IP” set to yes

– Set up a key pair so you can SSH to the cluster

– Set up your security group to allow inbound traffic on port 8890
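If you prefer the CLI, the setup above might look roughly like the following sketch. The cluster name, release label, key pair name, subnet ID, security group ID, instance type and count, and CIDR range are all placeholder assumptions rather than values from this post. The run helper is a hypothetical wrapper that only prints each command unless DRY_RUN=0 is set, so you can review the commands first.

```shell
#!/bin/sh
# All values below are placeholders (assumptions); replace them with your own.
CLUSTER_NAME="zeppelin-multi-tenant"
RELEASE_LABEL="emr-5.36.0"             # assumed release; use one that ships Zeppelin
KEY_NAME="my-key-pair"                 # hypothetical EC2 key pair name
SUBNET_ID="subnet-0123456789abcdef0"   # hypothetical subnet with auto-assign public IP

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create a cluster with Hadoop, Spark, and Zeppelin enabled.
create_cluster() {
  run aws emr create-cluster \
    --name "$CLUSTER_NAME" \
    --release-label "$RELEASE_LABEL" \
    --applications Name=Hadoop Name=Spark Name=Zeppelin \
    --ec2-attributes "KeyName=$KEY_NAME,SubnetId=$SUBNET_ID" \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --use-default-roles
}

# open_zeppelin_port GROUP_ID CIDR: allow inbound TCP 8890 from CIDR.
open_zeppelin_port() {
  run aws ec2 authorize-security-group-ingress \
    --group-id "$1" --protocol tcp --port 8890 --cidr "$2"
}

create_cluster
open_zeppelin_port "sg-0123456789abcdef0" "203.0.113.10/32"
```

Restricting the CIDR to your own address range, rather than opening port 8890 to everyone, is a sensible default while authentication is not yet enabled.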

After the cluster status becomes “Waiting”, you’ll be able to browse to

http://[Master public DNS]:8890

You’ll see Zeppelin running.
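If the page does not load, a quick reachability check from your workstation can help separate a security group problem from a Zeppelin problem. This is a minimal sketch: the function names are hypothetical, and the host you pass in is your own master public DNS name.

```shell
#!/bin/sh
# zeppelin_url HOST: print the Zeppelin web UI address for an EMR master node.
zeppelin_url() {
  printf 'http://%s:8890/\n' "$1"
}

# zeppelin_status HOST: print the HTTP status code returned by the UI.
# curl prints 000 when it cannot connect (for example, port 8890 is blocked).
zeppelin_status() {
  curl -s -o /dev/null -m 10 -w '%{http_code}\n' "$(zeppelin_url "$1")"
}
```

For example, calling zeppelin_status with your master public DNS name should print 200 once Zeppelin is up and the port is open.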

2. Enable authentication

SSH to the master node and enable authentication by copying the Shiro template:

sudo cp /etc/zeppelin/conf/shiro.ini.template /etc/zeppelin/conf/shiro.ini

Open /etc/zeppelin/conf/shiro.ini with a text editor and configure users and passwords in the [users] section.
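Entries in the [users] section of a Shiro INI file take the form "username = password, role". If you would rather script the change than edit by hand, something like the sketch below could work. The add_shiro_user function is a hypothetical helper, and the user name, password, and role in the usage note are made-up examples.

```shell
#!/bin/sh
# add_shiro_user FILE USER PASSWORD ROLE
# Inserts "USER = PASSWORD, ROLE" directly under the [users] header of FILE.
# Note: awk -v interprets backslash escapes, so avoid backslashes in values.
add_shiro_user() {
  file=$1 user=$2 password=$3 role=$4
  tmp=$(mktemp)
  awk -v line="$user = $password, $role" '
    { print }                        # copy every existing line unchanged
    /^\[users\]/ { print line }      # add the new entry right after [users]
  ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps owner and permissions
  rm -f "$tmp"
}
```

Run from a root shell on the master node, a call such as add_shiro_user /etc/zeppelin/conf/shiro.ini alice 'change-me' admin would add one user. Passwords in this form are stored in plain text, so treat the file accordingly.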

Alternatively, you can set up a different authentication method such as LDAP, AD, or ZeppelinHub. You can find more details on integrating with ZeppelinHub here.

Once that’s done, restart the Zeppelin daemon:

sudo -u zeppelin /usr/lib/zeppelin/bin/ restart

If you browse to the URL again, you’ll now see a login button.

3. Configure Interpreter

By default, one interpreter session is shared by all users and all notebooks. If you’d prefer to give each user a separate interpreter session, open the interpreter menu (http://[Master public DNS]:8890/interpreter) and press the ‘edit’ button on the ‘Spark’ interpreter.

Set the option to “The interpreter will be instantiated ‘Per User’ in ‘scoped’ process” and click “Save”.

Alternatively, you can choose other combinations. This article may help you understand the shared, scoped, and isolated modes.

4. Configure ACL of your notebook

When you create a notebook, it is shared with all users by default. (To make new notebooks private by default, set the zeppelin.notebook.public property to ‘false’ in /etc/zeppelin/conf/zeppelin-site.xml.)
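As a rough sketch of that property change, the helpers below print the element to add and check whether a file already declares it. Both function names are hypothetical. The snippet assumes the usual Hadoop-style property layout used in zeppelin-site.xml and does not modify the file itself.

```shell
#!/bin/sh
# Print the <property> element that makes new notebooks private by default.
notebook_private_property() {
  cat <<'EOF'
<property>
  <name>zeppelin.notebook.public</name>
  <value>false</value>
</property>
EOF
}

# has_property FILE NAME: succeeds if FILE already declares a property NAME.
has_property() {
  grep -q "<name>$2</name>" "$1"
}
```

If has_property reports that the property already exists, change its value in place; otherwise paste the printed element inside the <configuration> element. A Zeppelin restart is most likely needed for the change to take effect.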

In the top right corner of each notebook there’s a small ‘lock’ icon, which is where you configure the notebook’s ACL.

For each notebook you can configure owners, writers, and readers.

For further collaboration and sharing of notebooks among different users, you can try ZeppelinHub as well. Hope this was helpful.