I recently deployed HortonWorks in a small lab environment: four virtual machines on an Intel NUC barebone with 16 GB of memory. I saved my command-line history and took several screenshots along the way. If you want to deploy HortonWorks yourself, this might be useful.
Because the VMs run on VMware ESXi, I first had to install open-vm-tools, see below.
sudo apt-get install open-vm-tools
The installation requires the root account, or an account with sufficient privileges. I went with the root account. On Ubuntu, root login over SSH is disabled by default, so you need to enable it first:
sudo nano /etc/ssh/sshd_config
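The relevant setting is PermitRootLogin; on Ubuntu the default is typically commented out or set to prohibit-password. Change it so the line reads:

```
PermitRootLogin yes
```

If you don't want root reachable over SSH permanently, revert this once the cluster is installed.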
After changing sshd_config, restart ssh and set the root password:
sudo service ssh restart
sudo passwd root
Next, I added all the servers to the hosts file on the first machine, which I used to drive the installation. You can use DNS for this instead of the hosts file; in my small lab environment I went with the hosts file.
Add the following lines:
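For illustration, my entries looked roughly like this. The first address matches the Ambari server used later in this post; the other three addresses (and the hadoop01 name for the first node) are assumptions, so substitute your own IPs and hostnames:

```
192.168.0.162   hadoop01
192.168.0.163   hadoop02
192.168.0.164   hadoop03
192.168.0.165   hadoop04
```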
Next, we need to be able to access all machines from the first node. Create an SSH key for this first:
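A minimal sketch of that key generation, assuming a passphrase-less key is acceptable for lab use:

```shell
# Generate an RSA key pair for passwordless SSH from the first node.
# -N "" sets an empty passphrase; the keys land in ~/.ssh/id_rsa(.pub).
ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa
```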
Now append the public key to the authorized_keys file on each node:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa.pub | ssh root@hadoop02 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh root@hadoop03 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh root@hadoop04 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
HortonWorks requires transparent huge pages to be disabled on each server. Do so as follows:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
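Note that this echo does not survive a reboot. One way to make it stick, which is my own addition rather than part of the original walkthrough, is to repeat the command in /etc/rc.local above the final exit 0:

```
# /etc/rc.local (keep this above the closing "exit 0")
echo never > /sys/kernel/mm/transparent_hugepage/enabled
```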
We also need NTP on every machine. Install it with:
sudo apt-get install ntp
Now we’re ready to start the installation. Add the HortonWorks Ambari repository:
wget -nv http://public-repo-1.hortonworks.com/ambari/ubuntu14/2.x/updates/184.108.40.206/ambari.list -O /etc/apt/sources.list.d/ambari.list
apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
Update the package lists and install the Ambari server:
apt-get update
apt-get install ambari-server
Now you can run the initial setup:
ambari-server setup
Here is how I answered the questions it asks:
Customize user account for ambari-server daemon [y/n] -> n
Checking JDK… ->  Oracle JDK 1.8 + Java Cryptography Extension (JCE) Policy Files 8
Do you accept the Oracle Binary Code License Agreement [y/n] -> y
Enter advanced database configuration [y/n] (n)? -> n
You must accept the Oracle JDK license when prompted in order to download the necessary JDK from Oracle; the JDK itself is installed during the deploy phase.
Answering n to the advanced database configuration question uses the default embedded PostgreSQL database for Ambari. The default database name is ambari, and the default user name and password are ambari/bigdata. Answer y only if you want to use an existing PostgreSQL, MySQL or Oracle database instead.
Now you are ready to start the server. Use the following command:
ambari-server start
Navigate to port 8080 on the Ambari server. In my case that was http://192.168.0.162:8080/
Log in with the default admin/admin credentials, see below:
The next step is to launch the install wizard; use the Launch Install Wizard button.
Give the cluster a name. In my case I used the name “hadoop”.
Select the distribution version. I used the latest, HDP 2.4.
Select the nodes you want to install on. I used all four nodes, see below. For the host registration, Ambari needs the SSH private key of the first node. Print it with the command below and paste the entire key, including the BEGIN and END lines, into this field:
cat ~/.ssh/id_rsa
-----BEGIN RSA PRIVATE KEY-----
Confirm and start to install:
When ready, select the services you want to use.
Next, assign the masters. On the first node I placed the NameNode, ZooKeeper, Atlas and Grafana; on the next, the SecondaryNameNode, History Server, App Timeline Server, and so on. You can distribute them however you like, but make sure each server has enough memory.
The next step is to assign the slaves and clients. I made every host a DataNode and NodeManager. You might want to make an exception for the first node.
The next step is for Hive to create a new MySQL database. This is where the Hive metastore, its management information, will be stored.
I had to enter a password for Grafana in order to complete the installation:
Review and finish the installation:
All the packages will be deployed:
After the installation, the admin user was not able to use the HDFS client because it had no home directory in HDFS yet. To create one, switch to the hdfs system account:
su - hdfs
hadoop fs -mkdir /user/admin
Set the ownership on the newly created directory:
hadoop fs -chown admin:hadoop /user/admin