class
Definition
A class can be an attribute of another class. Like its outer class, it is also a category of things.
The main difference is that, as an attribute, it doesn't carry much information of its own and therefore doesn't "deserve" to be modelled more prominently in a separate file.
Let's look at an example to understand this better and to explain the difference between a string and a class.
Example
Let's say we have a dataset with a table CITY(id, name, lat, long, country), where country is just a text column containing the name of the country the city is in.
In ER modelling, you would model City as an entity type, but Country would simply be a string attribute of City. That's because we don't have any other information about the country, such as its population or continent.
Now let's say the data model suddenly changes because the needs of the business change, and this information becomes available:
CITY(id, name, lat, long, country_id)
COUNTRY(id, name, population, gdp_in_usd, continent)
Instead of a string attribute country, the ER model would now have an entity type Country.
Now, in Veezoo, we believe that the knowledge-base layer should:
- model the reality as closely as possible: a Country is not a String
- be as independent as possible from the underlying physical representation of the data: who cares if a country is in a column or a separate table
So in the first example Country is a class (not a string), and in the second example it's still a class.
Here is the VKL for the first example:
kb {
    class City {
        name.en: "City"
        sql: "${CITY.id}"
        ...
        class Country {
            name.en: "Country"
            extends: onto.Country
            sql: "${CITY.country}"
        }
    }
}
And after we extended the data model:
File: knowledge-base/classes/City.vkl
kb {
    class City {
        name.en: "City"
        sql: "${CITY.id}"
        ...
        relationship located_in {
            display_name.en: "located in"
            to: kb.Country
            sql: "${CITY.country_id}"
        }
    }
}
File: knowledge-base/classes/Country.vkl
kb {
    class Country {
        name.en: "Country"
        extends: onto.Country
        sql: "${COUNTRY.id}"
        name_sql: "${COUNTRY.name}"
        ...
    }
}
You may have noticed that Country is still a class, but it now lives in a different file and has a different sql/name_sql mapping.
Apart from this, there is now a relationship defined that goes from kb.City to kb.Country.
"Didn't you say that the knowledge-base layer wouldn't change? Why is there now a relationship?"
The truth is that a class defined as an attribute is just syntactic sugar for a class plus a relationship. In fact, if you want, you can even modify the properties of that implicit relationship, e.g. display_name, like this:
kb {
    class City {
        name.en: "City"
        sql: "${CITY.id}"
        ...
        class Country {
            name.en: "Country"
            extends: onto.Country
            // defined inside the class attribute itself without an identifier
            relationship {
                // here you can override any relationship attribute,
                // e.g. display_name from its default value "with" to something better like 'located in'
                display_name.en: "located in"
                // no sql defined here (Don't-Repeat-Yourself principle)
            }
            sql: "${CITY.country}"
        }
    }
}
Normalizing your KG
What if you have country in many tables:
CITY(id, name, country)
CUSTOMER(id, name, country)
and you want a single class representing Country, even if your database doesn't have a table for it?
Using what we just learned above, we can extract the class attribute class Country from the class City into a separate file (in Veezoo Studio, click on the icon to add a new file and then select the type Class).
Now you can copy the definition and put it in this new file.
File: knowledge-base/classes/Country.vkl
kb {
    class Country {
        name.en: "Country"
        extends: onto.Country
        sql: "${CITY.country}"
    }
}
Then replace the class Country in the other classes (City and Customer) with a relationship.
File: knowledge-base/classes/City.vkl
kb {
    class City {
        name.en: "City"
        sql: "${CITY.id}"
        ...
        relationship located_in {
            display_name.en: "located in"
            to: kb.Country
            // defaults to the sql of the kb.Country class (DRY principle)
        }
    }
}
File: knowledge-base/classes/Customer.vkl
kb {
    class Customer {
        name.en: "Customer"
        sql: "${CUSTOMER.id}"
        ...
        relationship lives_in {
            display_name.en: "lives in"
            to: kb.Country
            sql: "${CUSTOMER.country}"
        }
    }
}
Note that Veezoo will not try to join the two tables (CITY and CUSTOMER), since no join is needed.
When should I model something as a string vs. a class?
This is a common question when you first learn VKL. In most other BI tools, a country is just a string.
The way to think about it is: is it actual text, e.g. something you would want to search inside? If not, it's probably a class.
A free-text comment field? It's a string in Veezoo.
A customer? It's a class.
The first name of a customer? If this is relevant as a standalone concept for your business users, then sure, it's a string. Otherwise, just use it as part of the name_sql, e.g. "${CUSTOMER.first_name} || ' ' || ${CUSTOMER.last_name}".
By modelling something as a class with entities, you can refer to its individual values (entities) in a question, because Veezoo indexes them in order to understand you. This way you can ask 'How many Corporate customers do we have?' and Veezoo will recognize the entity 'Corporate' of class 'Segment', while helping you with AutoComplete.
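The three cases above can be put side by side in one class. This is only a sketch: the columns CUSTOMER.comment and CUSTOMER.segment are hypothetical, and the string block is assumed to follow the same shape as the integer and class attributes shown elsewhere on this page.

```vkl
kb {
    class Customer {
        name.en: "Customer"
        sql: "${CUSTOMER.id}"
        // first and last name are not modelled separately here;
        // they only make up the display name of each customer
        name_sql: "${CUSTOMER.first_name} || ' ' || ${CUSTOMER.last_name}"

        // actual text that users may want to search inside: a string
        // (assumed syntax, hypothetical column CUSTOMER.comment)
        string Comment {
            name.en: "Comment"
            sql: "${CUSTOMER.comment}"
        }

        // a category with individual values such as 'Corporate': a class
        // (hypothetical column CUSTOMER.segment)
        class Segment {
            name.en: "Segment"
            sql: "${CUSTOMER.segment}"
        }
    }
}
```

With a model along these lines, 'Corporate' would be an entity of Segment that can be referred to in a question, while the comment text stays plain text.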
Advanced Class Definition
As with other attributes, we can use the sql property to define whatever we can express in SQL.
For instance, let's say we have a table COUNTRY(id, name, population) and we want to segment countries according to their population.
We can define a class attribute Population_Group like this:
kb {
    class Country {
        name.en: "Country"
        ...
        integer Population {
            name.en: "Population"
            sql: "${COUNTRY.population}"
        }
        // our new segmentation
        class Population_Group {
            name.en: "Population Group"
            synonym.en: "Size"
            description.en: "A segmentation of countries according to their population."
            sql: """
                CASE
                    WHEN ${kb.Country.Population} < 100000 THEN 'Very Small'
                    WHEN ${kb.Country.Population} < 1000000 THEN 'Small'
                    WHEN ${kb.Country.Population} < 10000000 THEN 'Medium'
                    WHEN ${kb.Country.Population} < 100000000 THEN 'Big'
                    ELSE 'Very Big'
                END
            """
        }
    }
}
Once this is defined and saved, you can click on the Sync button in the Editor and it will generate the entities for the Population_Group class.
Now you can ask questions like "How many countries are there in each population group?" or "Show me all small countries."
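The same pattern should carry over to other SQL expressions. As a further sketch (not taken from an existing model), the CITY table from the first example could get a class attribute derived from its lat column. The attribute name Hemisphere and its two labels are hypothetical.

```vkl
kb {
    class City {
        name.en: "City"
        sql: "${CITY.id}"

        // hypothetical class attribute derived from the latitude column
        class Hemisphere {
            name.en: "Hemisphere"
            sql: """
                CASE
                    WHEN ${CITY.lat} >= 0 THEN 'Northern'
                    -- rows with a NULL lat also end up in this branch
                    ELSE 'Southern'
                END
            """
        }
    }
}
```

As with Population_Group, the values would only become entities after a Sync. If rows without a latitude should not be labelled 'Southern', the CASE expression needs an explicit branch for NULL.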