Adding and implementing R statistical analytics capability in Vertica using User Defined Transform Function (UDTF) as Vertica-R-UDX

Illustrated in the vertica forums. I am adding my own examples that got me kickstarted in the R-UDX area.

We shall start simple with a UDF that does multiplication operation on the data provided in the query. The rhlTestAlgebra.R is the UDF that will take two inputs and return their product.

1. Write the R-UDF code for the multiplication operation

#--------------------*.R---------------#
# Write a UDTF named rhlTestAlgebra.R #
#--------------------------------------#
# A test UDTF for testing
rhlTestAlgebraFactory <- function()
{
# Takes i
list(
name = multiplyValues,
udxtype = c("transform"),
intype = c("int","int"),
outtype = c("int"),
outnames = c("product")
)
}

# INPUT num1, num2
multiplyValues <- function(input)
{
tryCatch(
{
# Initialize libraries
# initialize the inputs
num1 <- input[,1]
num2 <- input[,2]
num3 <- (num1 * num2)
dataResult <- data.frame(num3)
return(num3)
}, error = function(errStr)
{
# Format and log error in DB
}, finally =
{
# Close DB connection
})
}

# Function to load necessary packages
{
tryCatch(
{
# Use RJDBC for connecting to database
library(RJDBC)
# For logging time
library(lubridate)
# Use R2JAGS for calling JAGS from R
library(R2jags)
# binomial library
library(binom)
# plot library
library(ggplot2)
# Set the random number generator seed
set.seed(seed="12345")
}, error = function(e)
{
stop(paste("INIT_LIB_ERROR:",e))
})
}

2. Compile and install this library within Vertica

DROP LIBRARY algebralib CASCADE;
CREATE LIBRARY algebralib AS '/home/dbadmin/share/workspace/rhlTestAlgebra.R' LANGUAGE 'R';
CREATE TRANSFORM FUNCTION multiplyValues AS LANGUAGE 'R' NAME 'rhlTestAlgebraFactory' LIBRARY algebralib;

3. Execute the newly written UDTF for the results

SELECT multiplyValues(9,74) OVER();