Skip to main content

function

Definition

A function in Veezoo allows users to create custom SQL functions or transformations that may not be readily available in Veezoo off-the-shelf. It's a useful feature for adding specialized functionality that is available in your database, e.g. linear regression functions.

Usage

A function is defined with several properties:

  • name: A name for the function, used when calling it.
  • synonym: Alternative ways to refer to this function in a question.
  • dynamic_display_name: The name of the function as it will be displayed in Veezoo, including placeholders for the arguments.
  • dynamic_autocomplete: (Optional) The pattern that should be displayed in the autocomplete, when a user is typing a question.
  • tag: The type of function. Can be one of the following:
    • CB_Many2One for aggregation functions. These functions take multiple values and return a single value, like sum or average.
    • CB_One2One for transformation functions. These functions transform one value into another, like converting temperatures or changing data types.
    • CB_One2Maybe for filter functions. These functions return true or false.
  • arguments: The types of arguments the function accepts.
  • return: The return type of the function. Left out for CB_One2Maybe.
  • sql_select: The SQL equivalent of the function for the SELECT clause. It uses placeholders ($1, $2, etc.) for the arguments.
  • sql_where: Especially for CB_One2Maybe, the SQL equivalent of the function for the WHERE clause. It uses placeholders ($1, $2, etc.) for the arguments.

Examples

CB_Many2One: Correlation

Let's consider a real-world example where we define a function to calculate the Correlation between two number series:

function Correlation {
name: "Correlation"

// That's what gets displayed to the end user
dynamic_display_name: "Correlation of $1 and $2"

// It is an aggregation
tag: CB_Many2One

// It takes 2 series of numbers
arguments: [
number,
number
]

// It returns another number
return: number

sql_select: "CORR($1, $2)"
}

In this example:

  • The name of the function is "Correlation".
  • The dynamic_display_name is "Correlation of $1 and $2", where $1 and $2 are placeholders that will be replaced with the actual arguments when the function is called.
  • The tag indicates this is an aggregation function (CB_Many2One), meaning it takes multiple values (in this case, two series of numbers) and returns a single value (the correlation coefficient).
  • The arguments indicate that this function expects two arguments, both of which should be numbers.
  • The return type is number, meaning the function will return a numeric value.
  • The sql_select contains the actual SQL code that will be executed when the function is called, with placeholders for the arguments. In this case, it's a call to the CORR function in SQL, which calculates the correlation coefficient between two series of numbers.

One caveat here is that the arguments will only be number and therefore not integer. To account for this, you have to adapt it like:

    arguments: [
{ type: [ number, integer ] },
{ type: [ number, integer ] }
]

This means that each argument accepts multiple types.

CB_One2One: Length

Let's say we want to implement a function that outputs the length of a string attribute.

function Length {
name: "Length"

synonym: "long"

// The name with the argument.
dynamic_display_name: "Length of $1"

// In this case we map each string value to an integer
// Therefore we use CB_One2One
tag: CB_One2One

arguments: [ string ]

return: integer

sql_select: "LENGTH($1)"
}

Now you can ask for instance "What is the average length of an email?", if you have an string Email in your Knowledge Graph.

CB_One2Maybe: Leap Year

In PostgreSQL, we can extract the year from a date using EXTRACT(YEAR FROM $1) and then find out if a year is a Leap Year by doing (EXTRACT(DOY FROM DATE (EXTRACT(YEAR FROM $1)::text || '-12-31')) = 366).

Using this, we can define a general function in Veezoo that takes an arbitrary date attribute and filters for only dates in a leap year.

function Leap_Year {
name: "Leap Year"

// The name with the argument.
dynamic_display_name: "$1 is in leap year"

// To display this function in autocomplete
dynamic_autocomplete: "$1 in leap year"

tag: CB_One2Maybe

arguments: [
date
]

sql_where: "(EXTRACT(DOY FROM DATE (EXTRACT(YEAR FROM $1)::text || '-12-31')) = 366)"
}

Note here that we used sql_where instead of sql_select, because it's a CB_One2Maybe. Another difference is that we don't have a return.

Finally, we need to explicitly add dynamic_autocomplete, so it appears in the AutoComplete when typing a question. For instance, when you type something like 'How many customers with birthdate in', you will see 'birthdate in leap year' as a suggestion.

CB_One2Maybe with single_value_only: Regex match

There are cases where you want to specify that one of the arguments is not a column, but rather a literal value. For that, we will use single_value_only: true.

As an example, let's look into the POSIX Regular Expressions operator '~' in PostgreSQL.

function matches_regex {
name: "matches regex"
synonym: [
"matches",
"regex"
]

dynamic_display_name: "$1 matches regex $2"
dynamic_autocomplete: "$1 matches regex $2"

tag: CB_One2Maybe

arguments: [
{ type: [string, class] },
{ type: string, single_value_only: true}
]

sql_where: "$1 ~ $2"
}

This allows you to ask questions like: "Show me all customer with an email that matches '.*?@gmail.com'"

Another thing to notice here is that the first argument can take either a string or a class. In the case of the class, the resulting SQL uses only the sql in the class definition, not name_sql, nor any renamings on entities that you may have done in the Knowledge Graph.