Express setup for the world's most advanced PostgreSQL 9.3 server on a CentOS 6.5 x64 instance

Set up a PostgreSQL 9.3 database on a CentOS 6.5 server

"The world's most advanced database" is a killer punchline, though.

This assumes you are the root user and are looking for just the commands that will get you through the installation.
Prerequisites
  - CentOS 6.5 x64 ISO, minimal installation
    Download link: https://www.centos.org/download
  - I prefer to install most of the dev dependencies beforehand via
    $ sudo yum groupinstall 'Development Tools'

1. Create a User
   $ adduser postgres
   $ passwd postgres

2. Add this user to the sudoers list (visudo is the safe way to edit this file)
   $ visudo
     postgres    ALL=(ALL)    ALL
 
3. Get the PostgreSQL repo for installation
   $ yum localinstall http://yum.postgresql.org/9.3/redhat/rhel-6-x86_64/pgdg-centos93-9.3-1.noarch.rpm
     (on CentOS/RHEL 7 use instead:
      yum localinstall https://yum.postgresql.org/9.3/redhat/rhel-7-x86_64/pgdg-redhat93-9.3-3.noarch.rpm)
   $ yum list "postgres*"
   $ yum install postgresql93-server

4. The data directory is at
   $ ls /var/lib/pgsql/9.3/
  
5. Initialize the database cluster and enable the service
   $ service postgresql-9.3 initdb
     OR
     /usr/pgsql-9.3/bin/pg_ctl init -D /var/lib/pgsql/9.3/data/ -o --pwprompt -U postgres
     OR
     /usr/pgsql-9.3/bin/initdb --pgdata=/var/lib/pgsql/9.3/data/ -W -U postgres
   $ chkconfig postgresql-9.3 on
  
6. Control commands for the server
   $ service postgresql-9.3 start
     /usr/pgsql-9.3/bin/pg_ctl -D /var/lib/pgsql/9.3/data -l logfile start
   $ service postgresql-9.3 stop
     /usr/pgsql-9.3/bin/pg_ctl -D /var/lib/pgsql/9.3/data -l logfile stop
   $ service postgresql-9.3 restart
  
7. Switch to the postgres user to access the database through the psql client
   $ su postgres
   $ psql

8. Modify the IPv4 client access to this machine
   by editing "/var/lib/pgsql/9.3/data/pg_hba.conf"

   host    all             all             127.0.0.1/32            md5
   host    all             all             192.168.1.0/24          md5
   host    all             all             0.0.0.0/0               ident

   (Note: the ident method requires an identd lookup on the client side;
   change the last line to md5 if you want password logins from arbitrary hosts.)
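Each pg_hba.conf line above pairs a database, a user and a client CIDR address with an auth method. As a small sketch (the `add_hba_rule` helper is my own invention, not part of PostgreSQL), such a line can be generated and appended:

```shell
# Hypothetical helper: format a pg_hba.conf host rule from
# database, user, CIDR address and auth method.
add_hba_rule() {
  printf 'host    %-15s %-15s %-23s %s\n' "$1" "$2" "$3" "$4"
}

# Example: allow password (md5) logins from the local subnet.
rule=$(add_hba_rule all all 192.168.1.0/24 md5)
echo "$rule"

# To apply for real, append to pg_hba.conf and reload:
#   add_hba_rule all all 192.168.1.0/24 md5 >> /var/lib/pgsql/9.3/data/pg_hba.conf
#   service postgresql-9.3 reload
```

Remember that pg_hba.conf is read top to bottom and the first matching line wins.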

9. Modify the service host to listen for all external clients
   by editing "/var/lib/pgsql/9.3/data/postgresql.conf"

   listen_addresses = '*'

10. Update the iptables policy for database server access
   $ vim /etc/sysconfig/iptables
     -A INPUT -m state --state NEW -m tcp -p tcp --dport 5432 -j ACCEPT
   $ service iptables restart

   Diagnostics: if the DB server is still unreachable, confirm it is listening,
   and as a last resort switch the firewall off
   $ netstat -nlp | grep 5432
   $ sudo service iptables save
   $ sudo service iptables stop
   $ chkconfig iptables off

11. Update the password of the default user postgres.
   This is done by logging in via the localhost psql client, as above, as the default postgres user.
   postgres=# GRANT ALL PRIVILEGES ON DATABASE postgres TO postgres;
   postgres=# ALTER USER postgres WITH PASSWORD 'postgres';

   This will allow password-based authentication for the user postgres.
   Verify by attempting access as below:
   $ psql -h 127.0.0.1 -U postgres -d postgres
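The same credentials can also be packed into a libpq connection URI, which psql and GUI clients accept. A sketch of assembling one (the values are the illustrative defaults from this guide, not anything you must use):

```shell
# Assemble a libpq connection URI from its parts.
PG_USER="postgres"
PG_PASS="postgres"
PG_HOST="127.0.0.1"
PG_PORT="5432"
PG_DB="postgres"
PG_URI="postgresql://${PG_USER}:${PG_PASS}@${PG_HOST}:${PG_PORT}/${PG_DB}"
echo "$PG_URI"

# psql accepts the URI directly:
#   psql "$PG_URI"
```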

Direct Reference : https://wiki.postgresql.org/wiki/YUM_Installation

Universal Database Manager: http://dbeaver.jkiss.org

If the YUM repository is not set up properly, use the relevant EPEL repo:

## RHEL/CentOS 7 64-Bit ##
# wget http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-8.noarch.rpm
# rpm -ivh epel-release-7-8.noarch.rpm
## RHEL/CentOS 6 64-Bit ##
# wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
# rpm -ivh epel-release-6-8.noarch.rpm

 

_______________________________________________________________

 

A quick way to have a portable version running locally on Windows is described below.

Download: http://www.enterprisedb.com/products-services-training/pgbindownload

Extract the archive into a folder at "C:/Dev/pgsql"

Init:    "C:/Dev/pgsql/bin/pg_ctl" -D "C:/Dev/pgsql/data" initdb
Start:   "C:/Dev/pgsql/bin/pg_ctl" -D "C:/Dev/pgsql/data" -l logfile start
Stop:    "C:/Dev/pgsql/bin/pg_ctl" -D "C:/Dev/pgsql/data" -l logfile stop
Connect: "C:/Dev/pgsql/bin/psql" -U postgres

You can initially connect to the database postgres, and later use the following to create a new user with privileges on a database:

CREATE USER postgres WITH PASSWORD 'postgres';
CREATE DATABASE sample;
GRANT ALL PRIVILEGES ON DATABASE "sample" to postgres;

HP Vertica Community Edition setup for dummies. Not really !!

PHEW !!!! Here I have noted the steps involved in bringing a single node of HP Vertica alive on a local VirtualBox VM. I hope it covers all the prerequisites that may ruin the fun.

A. Register and download the Vertica 7 RPMs from https://my.vertica.com/community/
B. Configure VirtualBox so that we don't lose the IP.
Memory provisioned is 4GB RAM, 2 processors and 50GB of disk storage.
I have two network adapters enabled in the VirtualBox VM settings:
the first is configured as a Bridged Adapter and can connect to the internet;
the second is attached to a Host-only Adapter, which yields a local IP like 192.168.56.103.
C. Install a CentOS 6.5 x64 machine to serve as the host for Vertica 7. Download CentOS: http://isoredirect.centos.org/centos/6/isos/x86_64/
LVM is not supported by Vertica 7; you need EXT3 or EXT4 partitions.
Create a non-LVM partition with ext4 mounted at root / and a swap of 2GB.
D. Add the relevant proxy and environment variables if needed in your development environment
 export http_proxy=http://web-proxy.xxxxxxxx.com:8080
 export HTTP_PROXY=http://web-proxy.xxxxxxxx.com:8080
 export proxy=http://web-proxy.xxxxxxxxx.com:8080
 export TZ="Asia/Kolkata"

In case the bridged and local IPs are not reachable, use dhclient for ethernet address discovery by configuring DHCP.
E. Set up a SAMBA share for file transfer to and from this virtual machine.
I had to set up the EPEL repository so that the latest updates were available for download (http://www.rackspace.com/knowledge_center/article/installing-rhel-epel-repo-on-centos-5x-or-6x)
I had to configure the proxy in ~/.bashrc and yum.conf to make sure my VM could connect to the internet.
$ sudo yum install samba
$ mkdir /home/dbadmin/share
Configure as in http://www.techotopia.com/index.php/Sharing_Files_between_CentOS_6_and_Windows_Systems_with_Samba
$ vim /etc/samba/smb.conf
Add the following at the end of the conf file, where the Share Definitions are:
[share]
        path = /home/dbadmin/share
        writeable = yes
        browseable = yes
        valid users = dbadmin
 
Any user that needs to connect to this share must be given a Samba password
$ sudo smbpasswd -a dbadmin
 
Start the samba server
$ sudo service smb restart
$ sudo service nmb restart
 
 From Windows, access the share (using the Samba user created above) at:
 \\192.168.56.103\share
F. Install the Vertica database (vertica-7.0.1-0.x86_64.RHEL5.rpm). The installation is done as a superuser here.

PREFACE: Just to scare the administrator, I am starting with the horror I witnessed when I just went ahead with the usual rpm and install_vertica route.

If the prerequisites are not adhered to, you are served these beautiful errors:

Failures during local (OS) configuration for verify-10.0.3.15.xml:
 HINT (S0151): https://my.vertica.com/docs/7.0.x/HTML/index.htm#cshid=S0151 
 These disks do not have known IO schedulers: '/dev/mapper/VolGroup-lv_root' ('') = ''
 HINT (S0231): https://my.vertica.com/docs/7.0.x/HTML/index.htm#cshid=S0231
 dbadmin user ('dbadmin') primary group is not verticadba ('verticadba')
 HINT (S0305): https://my.vertica.com/docs/7.0.x/HTML/index.htm#cshid=S0305
 TZ is unset for dbadmin. Consider updating .profile or .bashrc
 WARN (S0170): https://my.vertica.com/docs/7.0.x/HTML/index.htm#cshid=S0170
 lvscan (LVM utility) indicates some active volumes.
 FAIL (S0150): https://my.vertica.com/docs/7.0.x/HTML/index.htm#cshid=S0150
 These disks do not have 'deadline' or 'noop' IO scheduling: '/dev/sda1'
 ('sda') = 'cfq'
 FAIL (S0020): https://my.vertica.com/docs/7.0.x/HTML/index.htm#cshid=S0020
 Readahead size of sda (/dev/sda1) is too low for typical systems: 256 <
 2048 Readahead size of (/dev/mapper/VolGroup-lv_root) is too low for
 typical systems: 256 < 2048
 FAIL (S0030): https://my.vertica.com/docs/7.0.x/HTML/index.htm#cshid=S0030
 ntpd process is not running: ['ntpd', 'ntp']
 FAIL (S0310): https://my.vertica.com/docs/7.0.x/HTML/index.htm#cshid=S0310
 Transparent hugepages is set to 'always'. Must be 'never' or 'madvise'.
1. Create a 2GB swap space in case a swap partition was not created during OS installation
$ grep SwapTotal /proc/meminfo
$ dd if=/dev/zero of=/mnt/swapfile bs=1024 count=2097152 
$ mkswap /mnt/swapfile 
$ swapon /mnt/swapfile
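The dd count above follows from the target size: with a 1024-byte block size, a 2GB file needs 2 × 1024 × 1024 blocks. A small sketch of the arithmetic (the SWAP_MB variable is illustrative):

```shell
# Derive the dd block count for a swap file of a given size.
SWAP_MB=2048                  # illustrative target: a 2GB swap file
BLOCK_SIZE=1024               # the bs= argument to dd, in bytes
COUNT=$(( SWAP_MB * 1024 * 1024 / BLOCK_SIZE ))
echo "dd if=/dev/zero of=/mnt/swapfile bs=${BLOCK_SIZE} count=${COUNT}"
```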
2. Network Time Protocol (NTP) 
$ runlevel 
$ chkconfig --list ntpd
$ /sbin/service ntpd restart 
$ /sbin/chkconfig ntpd on 
$ /usr/sbin/ntpq -c rv | grep stratum
3. Disable SELinux by editing /etc/selinux/config and setting SELINUX=disabled
$ setenforce 0

4. Set CPU scaling to performance.
Configure the performance governor at /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do [ -f $CPUFREQ ] || continue; echo -n performance > $CPUFREQ; done

5. Vertica doesn’t support transparent hugepages. 
$ cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
$ sysctl -a | grep hugepage
$ vim /etc/grub.conf and add transparent_hugepage=never to the kernel line between root and initrd
 
Also, if the above solution is a threat to your kernel, then add the following to your /etc/rc.local file
  if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
     echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
  fi
6. Disk readahead: Vertica requires that disk readahead be set to at least 2048.
• Do it for sda (the block device):
$ /sbin/blockdev --setra 2048 /dev/sda
$ echo '/sbin/blockdev --setra 2048 /dev/sda' >> /etc/rc.local

• Do it for sda1 (the partition):
$ /sbin/blockdev --setra 2048 /dev/sda1
$ echo '/sbin/blockdev --setra 2048 /dev/sda1' >> /etc/rc.local
Verify with: cat /etc/rc.local
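For context, blockdev's readahead value is expressed in 512-byte sectors, so the required 2048 corresponds to 1 MiB, while the default 256 flagged by the installer is only 128 KiB. A quick sanity check of that arithmetic:

```shell
# Readahead is measured in 512-byte sectors.
SECTORS=2048
BYTES=$(( SECTORS * 512 ))                 # required readahead in bytes
DEFAULT_KIB=$(( 256 * 512 / 1024 ))        # the too-low default, in KiB
echo "required: $(( BYTES / 1024 )) KiB, default: ${DEFAULT_KIB} KiB"
```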
7. Vertica requires either the deadline or noop I/O scheduler; CFQ creates performance bottlenecks.

A permanent setting can be made by adding the following to the kernel command line in /etc/grub.conf:
 elevator=deadline
Alternatively, configure the I/O scheduler by changing it through the /sys directory:
• Check the current status: 
$ cat /sys/block/sda/queue/scheduler 
  noop anticipatory deadline [cfq] 
 
• Change for the current run: 
$ echo deadline > /sys/block/sda/queue/scheduler 
 
• Add the change to rc.local so it survives reboot: 
$ echo 'echo deadline > /sys/block/sda/queue/scheduler' >> /etc/rc.local 
 
• Check the status: 
$ cat /sys/block/sda/queue/scheduler 
  noop anticipatory [deadline] cfq 
 
$ find / -iname "scheduler"
OK, we have the sda device. You can confirm it with the blkid command or fdisk -l.
NOTE: There should not be any LVM partitions associated with Vertica.
Make the change for /dev/mapper/VolGroup-lv_root as well,
so check the device name for the partition (e.g. /dev/mapper/VolGroup-lv_root) with lvdisplay -v or ll /dev/VolGroup/.
In my case it is -> Block device 253:0
I was unable to resolve this one, as LVM was in use, and had to reinstall my OS with EXT4.
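The active scheduler is the bracketed entry in the sysfs file shown by the status checks above. A sketch of extracting it in a script (a literal string stands in for the real /sys file here):

```shell
# Extract the bracketed (active) scheduler from a sysfs scheduler line.
# The literal stands in for: cat /sys/block/sda/queue/scheduler
line='noop anticipatory [deadline] cfq'
active=$(echo "$line" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p')
echo "active scheduler: $active"
```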
8. TZ Environment Variable
$ yum update tzdata
$ cat /etc/sysconfig/clock
$ export TZ="Asia/Kolkata"
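Exporting TZ takes effect immediately for new processes; a quick way to confirm the zone is honored (assuming tzdata is installed):

```shell
# Verify that the exported TZ is honoured by date(1).
export TZ="Asia/Kolkata"
date                      # current time in the configured zone
OFFSET=$(date +%z)        # numeric UTC offset; Asia/Kolkata has no DST
echo "UTC offset: $OFFSET"
```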
9. Add dbadmin to the verticadba group as its primary group.
Set up Vertica with dbadmin@192.168.56.103
$ adduser dbadmin
$ passwd dbadmin
Add this user to the /etc/sudoers list, then create a verticadba group and add dbadmin to it:
$ id dbadmin
$ groupadd -f verticadba
# usermod -a -G verticadba dbadmin   (alternative: add verticadba as a supplementary group)
$ usermod -g verticadba dbadmin
$ usermod -m -d /home/dbadmin/ dbadmin
$ chown -R dbadmin:verticadba /home/dbadmin
10. Disable the firewall for now; add specific port forwarding later
$ service iptables save
$ service iptables stop
11. Append the host name to /etc/hosts 
  192.168.56.103 verticalocalhost
12. Finally we arrive at the actual installation of Vertica 7. I was so happy with Vertica 6, where my installation notes amounted to only about 10 lines.
Install vertica RPM
$ rpm -ivh vertica-7.0.1-0.x86_64.RHEL5.rpm
Check the installation with ls /opt/vertica/ and initialize the Vertica nodes using the installer
$ /opt/vertica/sbin/install_vertica --hosts verticalocalhost --rpm vertica-7.0.1-0.x86_64.RHEL5.rpm -u dbadmin
13. Initialize the node and create the database
Use the configuration menu of the admin tool to create a new database.
You will have to accept the license for the community edition as well.
Here you can also associate a password with the database through the configuration menu.
$ /opt/vertica/bin/adminTools

14. Connect to the database through the VSQL Vertica client
$ /opt/vertica/bin/vsql -U dbadmin

Here is a snapshot of the Vertica adminTools interface. It works great for me, as I prefer PuTTY and this interface is smooth there.

[Screenshot: Vertica adminTools interface]

Now, that is some nasty list of commands for the setup. I just poured it all out here; I hope it helps somebody. Next I will be laying my hands on UDTF functions, which are the core of analytics there. I hope to cover how beautifully Vertica integrates with the R language to provide extreme big-data statistical analytics at its heart.

There is a great database client called DBeaver, also billed as a free universal database manager, as it can connect to most known SQL databases easily over JDBC. Make the connection as per the settings below and kickstart your Vertica experience.

[Screenshot: DBeaver Vertica connection settings]

Here is a bash script to initialize and start the database when the Linux instance boots. It takes the burden off the simple db initialization, start and stop operations.

#!/bin/bash
# description: Vertica service
export VERTICA_HOME=/opt/vertica
export VERTICA_USER=dbadmin
export VERTICA_DB=XXXDB
export VERTICA_DBPASS=xxxdb
export VERTICA_PORT=5433
export VERTICA_HOST=verticalocalhost
echo "------------------------------------STATE---------------------------------------"
# Pre-check of the running process (grep -v grep keeps the pipeline itself out)
vertica_ps=$(ps -eaf | grep vertica | grep -v grep | grep "$VERTICA_DB" | grep "$VERTICA_PORT")
echo "$vertica_ps"
# Run awk directly; the original bash -c "... awk '{print $2}'" let the outer
# shell expand $2 to this script's second argument instead of awk's field.
vertica_pid=$(echo "$vertica_ps" | awk '{print $2}')
#echo "PID: $vertica_pid"
echo "--------------------------------------------------------------------------------"
case $1 in

initdb)
  sudo service iptables stop
  if sudo /opt/vertica/sbin/install_vertica --hosts "$VERTICA_HOST" --rpm /home/dbadmin/share/vertica-7.0.1-0.x86_64.RHEL5.rpm -u "$VERTICA_USER"; then
    echo -n "Vertica $1 successful"
    $0 start
  else
    echo -n "Error in Vertica $1: $?"
  fi
  ;;

stop)
  if /opt/vertica/bin/admintools --tool stop_db -d "$VERTICA_DB" -p "$VERTICA_DBPASS"; then
    echo -n "Vertica $1 $VERTICA_DB successful"
  else
    echo -n "Error in Vertica $1: $?"
  fi
  ;;

restart)
  $0 stop
  $0 start
  ;;

start)
  if /opt/vertica/bin/admintools --tool start_db -d "$VERTICA_DB" -p "$VERTICA_DBPASS"; then
    echo -n "Vertica $1 $VERTICA_DB successful"
  else
    echo -n "Error in Vertica $1: $?"
  fi
  ;;

status)
  echo "To be done"
  ;;
*)
  echo $"Usage: $0 {start|stop|restart|initdb|status}"
  exit 1
esac

 

15. Change in IP address of the server

If you are a developer like me who works on machines with dynamically assigned IPs, it is a pain to start the server with a new IP. Here I have tried to provide a fix by going into the internal configuration files to get this right.

  1. Make sure the /etc/hosts file is updated with the new IP
  2. Edit the /opt/vertica/config/admintools.conf and replace all the older IPs with the new ones.
  3. Edit the DB spread configuration file with the new IP.  /home/dbadmin/XXXDB/v_teamdb_node0001_catalog/spread.conf
  4. Edit the DB catalog configuration file with the new IP.  /home/dbadmin/XXXDB/v_teamdb_node0001_catalog/Catalog/config.cat
  5. Go to the admintools and start the database with your fingers crossed.
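Steps 2-4 above are mostly plain-text substitutions, so a scripted pass with sed can do the bulk of the work. A sketch rehearsed on a scratch file (the OLD_IP/NEW_IP values are illustrative; back up each real file first, and note that config.cat is partly binary, so verify the result carefully):

```shell
# Sketch: swap the old node IP for the new one in a config file.
OLD_IP="192.168.56.103"            # illustrative: previously assigned IP
NEW_IP="192.168.56.110"            # illustrative: newly assigned IP

# On the real files you would run, per file:
#   sed -i.bak "s/${OLD_IP//./\\.}/${NEW_IP}/g" /opt/vertica/config/admintools.conf
conf=$(mktemp)
echo "node0001 = ${OLD_IP},/home/dbadmin,/home/dbadmin" > "$conf"
sed -i "s/${OLD_IP//./\\.}/${NEW_IP}/g" "$conf"   # dots escaped for sed
result=$(cat "$conf")
echo "$result"
rm -f "$conf"
```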

If your database started with this configuration, then it's your lucky day.

 

16. Back up the server data

This is the most important section. Now that you have a database in production, it is prudent to have a mechanism for backup and restore. Here I am adding a mechanism to back up the data.

1. Create your configuration file with the needed parameters. 
   $ vbr.py  --setupconfig
     Snapshot name (backup_snapshot): XXXDB_2016_08_10
     Number of restore points (1): 1
     Specify objects (no default):
     Object restore mode (coexist, createOrReplace or create) (createOrReplace):
     Vertica user name (dbadmin):
     Save password to avoid runtime prompt? (n) [y/n]: y
     Database user password to save in vbr password config file (no default):
     Node v_teamdb_node0001
     Backup host name (no default): verticalocalhost
     Backup directory (no default): /home/dbadmin/backup
     Change advanced settings? (n) [y/n]: n
     Password file name (no default): 123
     Saved vbr password to 123.
     Config file name (XXXDB_2016_08_10.ini):
     Saved vbr config to XXXDB_2016_08_10.ini.

2. Initialize a backup folder
     vbr.py --task init --config-file XXXDB_2016_08_10.ini

3. Backup your data 
     vbr.py --task backup --config-file XXXDB_2016_08_10.ini

You will find a manifest and a snapshot folder, apart from the backed-up node data, in the /home/dbadmin/backup directory.
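Since the snapshot name embeds a date, a small wrapper can regenerate the name and drive the backup on a schedule, e.g. from cron (the cron idea and variable names are my own; vbr.py itself is the documented Vertica tool):

```shell
# Build a dated snapshot/config name like the XXXDB_2016_08_10 used above.
DB_NAME="XXXDB"                           # illustrative database name
SNAPSHOT="${DB_NAME}_$(date +%Y_%m_%d)"   # e.g. XXXDB_2016_08_10
CONFIG_FILE="${SNAPSHOT}.ini"
echo "snapshot: $SNAPSHOT  config: $CONFIG_FILE"

# With the config file in place, the scheduled backup itself would be:
#   vbr.py --task backup --config-file "$CONFIG_FILE"
```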

17. Restore the server's data

This can be achieved using the same vbr config file, but it requires the target database to be down.