What are the factors to consider when adapting to the evolution of knowledge graphs?

A knowledge graph models a dynamic world, so its existing data, relations, and nodes change over time. There are four factors we can consider when looking to adapt to these changes:

  1. What’s changing?

  2. Is there data already?

  3. Who’s using the data?

  4. How complex / brittle is the ecosystem?

In terms of what’s changing, we can look at the following:

  1. Scope expansion – for example, an independent model becoming a submodel

  2. Generalisation – for example, from book to publication

  3. Changing assumptions – adding extra properties or qualifying property assertions

In terms of whether the required data is readily available, this affects the migration: we may have to deal with incomplete data, synchronisation, and retractions. In addition, which sources does the data come from?

We also need to consider who’s using the data. Who needs to know about the changes? And can we easily switch applications to a new version without breaking too much existing work?

Lastly, we need to consider the complexity of the ecosystem. How would changes propagate throughout the schema and data? Would the changes affect the accuracy of inference? There are many factors to balance when assessing the optimal solution. You can improve the correctness and consistency of the knowledge graph but this might come at the cost of performance.

How can we use NLP to convert text into a conceptual map?
  1. Sentence and Word tokenisation

  2. Morphological analysis to understand language forms

  3. Sentence / logical / grammatical analysis to understand how words relate to other words

  4. Semantic analysis / disambiguation to understand sentences and texts as a whole, taking into account synonyms, context, plausibility, etc.
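The four steps above can be sketched as a minimal, stdlib-only pipeline. Everything here is a toy simplification: the tokenisation rules, the suffix-based morphology, and the hard-coded sense inventory (standing in for a knowledge graph) are all illustrative, and a real system would use a proper NLP library.

```python
import re

def sentence_tokenize(text):
    # Step 1a: naive sentence split on terminal punctuation.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def word_tokenize(sentence):
    # Step 1b: split into words and punctuation tokens.
    return re.findall(r"\w+|[^\w\s]", sentence)

def morph_analyse(token):
    # Step 2: toy morphology -- strip a plural "s" to recover a lemma.
    lower = token.lower()
    if lower.endswith("s") and len(lower) > 3:
        return {"lemma": lower[:-1], "number": "plural"}
    return {"lemma": lower, "number": "singular"}

def disambiguate(lemma, context_lemmas):
    # Step 4: crude semantic disambiguation by context overlap,
    # using a hard-coded sense inventory for the ambiguous word "bank".
    if lemma == "bank":
        return "river_bank" if "river" in context_lemmas else "financial_bank"
    return lemma

def pipeline(text):
    concepts = []
    for sent in sentence_tokenize(text):
        tokens = word_tokenize(sent)                                   # step 1
        analyses = [morph_analyse(t) for t in tokens if t.isalpha()]   # step 2
        lemmas = [a["lemma"] for a in analyses]   # step 3 would add grammar here
        concepts.append([disambiguate(l, lemmas) for l in lemmas])     # step 4
    return concepts

print(pipeline("The banks of the river flooded."))
```

Step 3 (grammatical analysis) is left as a comment because a faithful parser is beyond a sketch; the point is that each stage consumes the previous stage’s output.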

How can transformers and KGs come together?

A knowledge graph can be used to convert input text into word-sense-disambiguated text, which is then fed into a language model to retrieve contextualised word embeddings. These contextualised word embeddings can be used to extend the coverage of the existing knowledge graph, which in turn improves its ability to disambiguate entities.

The language model focuses on human language and how sentences are built, whereas the knowledge graph is human-engineered and highly interpretable.
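One half of this loop can be sketched as follows: given a contextual embedding for an ambiguous word (which a language model would produce), pick the knowledge-graph sense whose embedding is closest by cosine similarity. The vectors and sense names below are toy values, not outputs of any real model.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy sense embeddings that a knowledge graph might associate with "jaguar".
SENSE_EMBEDDINGS = {
    "jaguar_animal": [0.9, 0.1, 0.0],
    "jaguar_car":    [0.1, 0.9, 0.2],
}

def disambiguate(contextual_embedding, sense_embeddings):
    # Choose the KG sense whose embedding is closest to the contextual one.
    return max(sense_embeddings,
               key=lambda s: cosine(contextual_embedding, sense_embeddings[s]))

# A made-up contextual embedding for "jaguar" in "the jaguar prowled the forest".
ctx = [0.8, 0.2, 0.1]
print(disambiguate(ctx, SENSE_EMBEDDINGS))  # the animal sense wins
```

The same similarity machinery can run in the other direction: a new phrase whose embedding sits close to an existing KG entity is a candidate for extending the graph’s coverage.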

Is a schema important for a knowledge graph?

Many people in the industry prefer not to have a schema because building one is complicated and labour intensive. You can build a knowledge graph without a schema, but you won’t be able to express meaning, which defeats the original purpose of an ontology-driven approach. The schema is important as it allows you to better understand your data and also allows you to reuse things. The takeaway here is to always use a schema, and the earlier the better.

What’s the purpose of SHACL?

One of the key purposes of SHACL is to separate the meaning of the subject matter from the needs of a particular application. This means that one ontology can be the basis for many triple stores and applications by using different SHACL constraints, which allows us to avoid silos.
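As a sketch of what this looks like in practice, the same ontology class can be constrained differently per application: one application might require every book to carry an ISBN while another only needs a title. The namespaces, class, and property names below (`ex:Book`, `ex:isbn`, `ex:title`) are illustrative, not from any real ontology.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .

# Application A: every book must have exactly one ISBN.
ex:BookShapeStrict
    a sh:NodeShape ;
    sh:targetClass ex:Book ;
    sh:property [
        sh:path ex:isbn ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] .

# Application B: ISBN is optional; only a title is required.
ex:BookShapeLoose
    a sh:NodeShape ;
    sh:targetClass ex:Book ;
    sh:property [
        sh:path ex:title ;
        sh:minCount 1 ;
    ] .
```

Both shape graphs validate data against the same underlying ontology, so the meaning of `ex:Book` stays in one place while each application enforces only what it needs.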

What does the agile creation of an enterprise ontology look like?
  1. Identify questions you want answers to as initial requirements

  2. Build the ontology and triple store that meets the initial requirements

  3. Build out the application that uses the data
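For step 1, a competency question such as “which publications did a given author write?” can be translated directly into a SPARQL query that the ontology and triple store must be able to answer. The namespace and names below (`ex:Publication`, `ex:author`, `ex:JaneDoe`) are illustrative:

```sparql
PREFIX ex: <http://example.org/>

SELECT ?publication ?title
WHERE {
  ?publication a ex:Publication ;
               ex:author ex:JaneDoe ;
               ex:title  ?title .
}
```

If the ontology cannot support this query, that gap becomes a concrete requirement for step 2.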

Once you have built an initial application, it’s time to do the second iteration (and so on):

  1. Broaden the scope by identifying another set of questions as requirements

  2. Extend the ontology to meet those requirements

  3. Coordinate with other ontology authors

  4. Convert data and ontology into triples

  5. Extend the existing or build out the application

You can use an ontology to combine different knowledge graphs! This is common in big corporates where teams build different knowledge graphs in silos, each specific to their own domain.
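A minimal sketch of such a bridging ontology, assuming two teams publish graphs under hypothetical `hr:` and `sales:` namespaces, uses standard OWL mapping axioms to align their terms:

```turtle
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix hr:    <http://example.org/hr#> .
@prefix sales: <http://example.org/sales#> .

# The two teams model the same concept under different names.
hr:Employee   owl:equivalentClass    sales:StaffMember .
hr:worksFor   owl:equivalentProperty sales:employedBy .

# The same individual appears in both graphs under different IRIs.
hr:emp42      owl:sameAs             sales:person_0042 .
```

With these axioms loaded alongside both graphs, a reasoner can answer queries that span the two silos as if they were one knowledge graph.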


