VQL (Veezoo Query Language)
The Veezoo Query Language (VQL) acts as a bridge between natural language and SQL. When users pose questions to Veezoo, the intelligent Veezoo model translates these queries into VQL, which is then deterministically transformed into SQL.
VQL offers the same expressiveness as SQL but operates at a higher abstraction level by utilizing the classes and attributes defined in the Veezoo Knowledge Graph instead of directly working on underlying tables and columns.
While viewer-users don't need access to VQL for everyday tasks, creator-users can benefit significantly from understanding VQL, particularly because:
When Veezoo answers a question, it provides both the SQL and the corresponding VQL used. VQL is often shorter and more human-readable.
Measures, functions, and rules are defined using VQL, making it an essential tool for creators looking to leverage Veezoo's advanced capabilities.
Variables and Select
Let's take a look at a simple VQL query that returns a list of all products:
var product: kb.Product
singleton(product)
select(product)
A typical VQL query starts by defining variables. Similar to many programming languages, variables in VQL can represent different types, such as:
- a number, string, or boolean
- a date
- a specific entity of a class
- a class itself, as in this case
var product: kb.Product
In VQL, a variable like product
doesn’t represent a single instance of the class but instead refers to all possible instances. Think of a class in VQL as corresponding to an entire column or table in SQL.
The singleton()
function is used to indicate that the variable has no relationships with other variables. This is mostly a legacy feature and is only necessary when working with a class in isolation.
Finally, select(product)
specifies what the query will return. This is similar to the SELECT
statement in SQL. However, unlike SQL, the VQL select()
function can only appear once in a query (while SQL SELECT
can be used multiple times, such as in subqueries).
The VQL shown to users when answers are generated is not directly translated to SQL but undergoes several rewriting steps: rewriting rules are applied, functions are replaced by their definitions, and display_with
is added.
The rewriting happens after the VQL is shown, keeping the shown VQL easier to understand. However, it can make it harder to see how a particular VQL translates to SQL.
Showing Attributes
In VQL, if you want to display additional attributes, you can do it like this:
var product: kb.Product
var category: kb.Product.Category
category = product.Category
select(product, category)
Here, we define a second variable called category
, which is of type kb.Product.Category
. In the third line, we assign category
to represent the categories associated with product
using the dot-operator. The expression product.Category
indicates that the class represented by product
has an attribute or a relationship called Category
.
Make sure to explicitly connect the different variables in the VQL. Veezoo won't do that automatically for you.
Note that the dot-operator used here is different from the one in the first two lines, where it is part of the type's path.
This VQL can be simplified to:
var product: kb.Product
var category = product.Category
select(product, category)
In this case, the variable type is inferred automatically. The code can be further shortened (as Veezoo would generate it):
var product: kb.Product
select(product, product.Category)
Notice that the singleton()
function is not needed here, as there is no single isolated candidate.
Filters
In most cases, we don't want to just return one or more attributes; we want to filter them based on certain criteria. For example, the following VQL will be generated when asking: Show me Orders from the Region West:
var order: kb.Order
order.Region = kb.Order.Region.West
select(order)
The second line here performs the filtering. Instead of overwriting something (as one might expect from other programming languages), we restrict the variable order
to only those with a Region
equal to the entity kb.Order.Region.West
.
Filtering for numeric conditions can be done using mathematical comparisons (e.g., Show orders with a quantity greater than 5):
var order: kb.Order
var quantity = order.Quantity
quantity > 5
select(order, quantity)
Negation
Use the not{}
function to negate filters:
var customer: kb.Customer
not {
var order: kb.Order
order.Customer = customer
}
select(customer)
Here, all customers without any orders are returned. It filters for no possible value within the curly brackets.
String Filters
For attributes of type string, Veezoo provides the following filter functions:
starts_with_string()
contains_string()
ends_with_string()
These functions take two arguments: the string to be filtered and the value to compare it to:
var customer: kb.Customer
var name = customer.fullName
starts_with_string(name, "Lukas")
select(name, customer)
Here, we select customers whose full name starts with Lukas.
All of these string filters are case-insensitive.
Joins
Connecting two classes in VQL, which corresponds to joining two tables in SQL or accessing a column of a class, works similarly to filtering entities. For example, here is the VQL for the question Show me all orders from customers with segment consumer:
var order: kb.Order
var customer = order.with_Customer
customer.Segment = kb.Customer.Segment.Consumer
select(order, customer)
In the second line, we connect the order and customer classes, which will join the order and customer tables in SQL. We define the customer
variable to be the customers from the orders, i.e. order.with_Customer
, which will be mapped to the customer
foreign key column in the order table in SQL.
In the next line, we specify that we only want customers with the segment Consumer
. Since the orders and customers are already connected, this filter also affects the orders.
Finally, we return the orders and their customers.
Dates
VQL has rich support for dates, allowing you to filter and convert them to different granularities.
Date Filters
If a class has a date
field, filtering for dates can be done similarly to filtering for other numeric values:
var order: kb.Order
var orderDate = order.Order_Date
orderDate > date("2022-05-01")
select(order, orderDate)
The expression date("2022-05-01")
returns a date object for May 1st, 2022.
To filter for dates within a specific period, use the date_in()
function:
var order: kb.Order
var orderDate = order.Order_Date
date_in(orderDate, date("2023"))
select(order, orderDate)
Similarly, the date_in_range()
function can be used to filter for dates between a start and end point:
date_in_range(date("1995-10-24"), orderDate, date("2008-05-13"))
Dynamic Dates
Besides static dates (which refer to a fixed point in time) VQL also offers support for dynamic dates like this year or last month. This is helpful in keeping dashboards up to date. If a widget is created by asking the question How many orders last week, the widget will show data for the week prior to the current date, instead of always keeping to show data for the week before the question was asked.
There are two types of dynamic dates. The first option is using the date()
function with certain special arguments:
var order: kb.Order
var orderDate = order.Order_Date
date_in(orderDate, date("#thisYear"))
select(order, orderDate)
These values include:
#now
#today
#yesterday
#thisYear
#lastYear
#nextYear
#thisQuarter
#thisMonth
#lastMonth
#thisWeek
The other option is to use the dynamic_date()
function, which takes a phrase and a language as input:
var order: kb.Order
var orderDate = order.Order_Date
date_in(orderDate, dynamic_date("last month", "en"))
select(order, orderDate)
When both are able to express the same dynamic date (e.g. date("#lastMonth")
and dynamic_date("last month", "en")
), these two methods are equivalent,dynamic_date()
does offer more possible arguments though.
Aggregation Functions
VQL aggregation functions work similarly to SQL:
var order: kb.Order
var profit = order.Profit
var sumProfit = sum(profit)
var averageProfit = average(profit)
var countOrder = count(order)
select(sumProfit, averageProfit, countOrder)
The aggregation functions can also be used directly in the select
statement:
var order: kb.Order
var profit = order.Profit
select(sum(profit), average(profit), count(order))
As in SQL, aggregation functions can be grouped by one or more attributes. For example, if we want the Average Profit of Orders by Shipping Mode, instead of calculating the average profit for all orders, we divide the orders into groups based on their shipping mode and then calculate the average per group:
var order: kb.Order
var shippingMode = order.Shipping_Mode
select(average(order.Profit) by (shippingMode), shippingMode)
The shipping mode must be included in the select
statement (not just in the by
clause) so that users can see which case the aggregated numbers represent. Let's look at another example, Average Profit of Orders by Shipping Mode and Region:
var order: kb.Order
var region = order.Region
var shippingMode = order.Shipping_Mode
select(average(order.Profit) by (shippingMode, region), shippingMode, region)
Aggregation Over Time
A common form of aggregation is over a certain timespan, such as day, week, or year. This involves some extra steps. The following example calculates the Average Discount of Orders by Week:
var order: kb.Order
var week = periodicity("P1W")
var orderDate = order.Order_Date
project_date(orderDate, week)
select(average(sum(order.Discount) by (week)))
The line var week = periodicity("P1W")
defines the time duration for aggregation. The variable week
is then passed into the project_date()
function, where the week
periodicity is assigned the week of the Order_Date
. Finally, we can aggregate by week.
Arithmetic Functions
VQL supports arithmetic functions like +
, -
, /
, or *
to perform calculations. For example, to calculate the total order value minus the value-added tax:
var order: kb.Order
select(sum(order.value) - sum(order.VAT))
Arithmetic functions can also be used with grouped aggregations. Here, we calculate the same value but grouped by month:
var order: kb.Order
var month = periodicity("P1M")
project_date(order.order_date, month)
select(sum(order.value) by (month) - sum(order.VAT) by (month), month)
In such cases, both operands need to be aggregated at the same level of granularity.
Percentages
To calculate percentages, VQL provides the proportion() {}
function. For example, to calculate the percentage of customers who live in Zurich:
var customer: kb.Customer
var proportion = proportion(customer) {
customer.lives_in = kb.City.Zurich
}
select(proportion)
The proportion()
function works by calculating the overall count or sum as the denominator, and then applying the filter inside the curly braces to calculate a restricted count or sum as the numerator.
This means that, for example, if an order
has a value
and a VAT
, it is NOT possible to use this function to calculate what percentage of the value the VAT represents, since order.VAT
is not a subset of order.value
.
The proportion() {}
function can also be used with grouped data. For instance, to calculate the same percentage but grouped by customer segment:
var customer: kb.Customer
var segment = customer.customer_segment
var proportion = proportion(customer) {
customer.lives_in = kb.City.Zurich
} by (segment)
select(segment, proportion)
Union
In many cases, users want to filter for multiple possible options. For example, Show customers from Zurich and Ljubljana or Show premium customers and customers with high cancellation risk. To achieve this, Veezoo provides two versions of the union
function.
In simpler scenarios, where you want to combine different entities of the same class (as in the first example), you can do it like this:
var customer: kb.Customer
var city = union(kb.City.Zurich, kb.City.Ljubljana)
customer.lives_in = city
select(city, customer)
Here, var city
represents the union of Zurich and Ljubljana, and the following line filters out all customers who do not live in either city. The union
function can include any number of entities.
For more complex cases, where you want to combine different conditions, the syntax changes. In the following case we want to select customers, which are either premium customers, or have a high cancellation risk associated with them:
var customer: kb.Customer
var caseIdentifier = {
customer.is_premium = true
} union {
customer.cancellation_risk = kb.Customer.cancellation_risk.high
}
select(caseIdentifier, customer)
This version of the union
function does two things:
- It restricts
customer
to those who are either premium OR have a high cancellation risk. - It creates
var caseIdentifier
, which contains labels for all the possible cases that are part of the union. This is also selected, allowing the user to see why a particular customer is shown (e.g., Yanick Brecher is shown because he is a premium customer).