function
Definition
A function
in Veezoo allows users to create custom SQL functions or transformations that may not be readily available in Veezoo off-the-shelf. It's a useful feature for adding specialized functionality that is available in your database, e.g. linear regression functions.
Usage
A function
is defined with several properties:
name
: A name for the function, used when calling it.synonym
: Alternative ways to refer to this function in a question.dynamic_display_name
: The name of the function as it will be displayed in Veezoo, including placeholders for the arguments.dynamic_autocomplete
: (Optional) The pattern that should be displayed in the autocomplete, when a user is typing a question.tag
: The type of function. Can be one of the following:CB_Many2One
for aggregation functions. These functions take multiple values and return a single value, likesum
oraverage
.CB_One2One
for transformation functions. These functions transform one value into another, like converting temperatures or changing data types.CB_One2Maybe
for filter functions. These functions return true or false.
arguments
: The types of arguments the function accepts.return
: The return type of the function. Left out forCB_One2Maybe
.sql_select
: The SQL equivalent of the function for the SELECT clause. It uses placeholders ($1
,$2
, etc.) for the arguments.sql_where
: Especially forCB_One2Maybe
, the SQL equivalent of the function for the WHERE clause. It uses placeholders ($1
,$2
, etc.) for the arguments.
Examples
CB_Many2One: Correlation
Let's consider a real-world example where we define a function to calculate the Correlation
between two number series:
function Correlation {
name: "Correlation"
// That's what gets displayed to the end user
dynamic_display_name: "Correlation of $1 and $2"
// It is an aggregation
tag: CB_Many2One
// It takes 2 series of numbers
arguments: [
number,
number
]
// It returns another number
return: number
sql_select: "CORR($1, $2)"
}
In this example:
- The
name
of the function is "Correlation". - The
dynamic_display_name
is "Correlation of $1 and $2", where$1
and$2
are placeholders that will be replaced with the actual arguments when the function is called. - The
tag
indicates this is an aggregation function (CB_Many2One
), meaning it takes multiple values (in this case, two series of numbers) and returns a single value (the correlation coefficient). - The
arguments
indicate that this function expects two arguments, both of which should be numbers. - The
return
type isnumber
, meaning the function will return a numeric value. - The
sql_select
contains the actual SQL code that will be executed when the function is called, with placeholders for the arguments. In this case, it's a call to theCORR
function in SQL, which calculates the correlation coefficient between two series of numbers.
One caveat here is that the arguments will only be number
and therefore not integer
. To account for this, you have to adapt it like:
arguments: [
{ type: [ number, integer ] },
{ type: [ number, integer ] }
]
This means that each argument accepts multiple types.
CB_One2One: Length
Let's say we want to implement a function that outputs the length of a string attribute.
function Length {
name: "Length"
synonym: "long"
// The name with the argument.
dynamic_display_name: "Length of $1"
// In this case we map each string value to an integer
// Therefore we use CB_One2One
tag: CB_One2One
arguments: [ string ]
return: integer
sql_select: "LENGTH($1)"
}
Now you can ask for instance "What is the average length of an email?", if you have an string Email
in your Knowledge Graph.
CB_One2Maybe: Leap Year
In PostgreSQL, we can extract the year from a date using EXTRACT(YEAR FROM $1)
and then find out if a year is a Leap Year by
doing (EXTRACT(DOY FROM DATE (EXTRACT(YEAR FROM $1)::text || '-12-31')) = 366)
.
Using this, we can define a general function
in Veezoo that takes an arbitrary date attribute and filters for only dates in a leap year.
function Leap_Year {
name: "Leap Year"
// The name with the argument.
dynamic_display_name: "$1 is in leap year"
// To display this function in autocomplete
dynamic_autocomplete: "$1 in leap year"
tag: CB_One2Maybe
arguments: [
date
]
sql_where: "(EXTRACT(DOY FROM DATE (EXTRACT(YEAR FROM $1)::text || '-12-31')) = 366)"
}
Note here that we used sql_where
instead of sql_select
, because it's a CB_One2Maybe
. Another difference is that we don't have a return
.
Finally, we need to explicitly add dynamic_autocomplete
, so it appears in the AutoComplete when typing a question. For instance, when you type something like 'How many customers with birthdate in', you will see 'birthdate in leap year' as a suggestion.
CB_One2Maybe with single_value_only: Regex match
There are cases where you want to specify that one of the arguments is not a column, but rather a literal value. For that, we will use single_value_only: true
.
As an example, let's look into the POSIX Regular Expressions operator '~' in PostgreSQL.
function matches_regex {
name: "matches regex"
synonym: [
"matches",
"regex"
]
dynamic_display_name: "$1 matches regex $2"
dynamic_autocomplete: "$1 matches regex $2"
tag: CB_One2Maybe
arguments: [
{ type: [string, class] },
{ type: string, single_value_only: true}
]
sql_where: "$1 ~ $2"
}
This allows you to ask questions like: "Show me all customer with an email that matches '.*?@gmail.com'"
Another thing to notice here is that the first argument can take either a string
or a class
. In the case of the class
, the resulting SQL uses only the sql
in the class definition, not name_sql
, nor any renamings on entities that you may have done in the Knowledge Graph.