What is a knowledge graph?

A knowledge graph (KG) describes entities and the relations between them. It is defined by a schema and can cover many topics and domains. Formally, a KG is a directed multi-relational graph made up of (subject, predicate, object) triples.
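As a rough sketch of that definition, a handful of triples can be stored as plain tuples and indexed by subject to get the directed, labelled edges. The entity and predicate names below are made up for illustration only.

```python
from collections import defaultdict

# Hypothetical example triples: (subject, predicate, object)
TRIPLES = [
    ("Apple Inc.", "founded_by", "Steve Jobs"),
    ("Apple Inc.", "headquartered_in", "Cupertino"),
    ("Cupertino", "located_in", "California"),
]

def build_graph(triples):
    """Index triples as subject -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph

graph = build_graph(TRIPLES)
# Outgoing edges of one node, each labelled with its predicate
print(graph["Apple Inc."])
```

Because each edge carries its own label, the same pair of nodes can be connected by several different relations, which is what makes the graph multi-relational.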

What are knowledge graphs being used for?
  • Web search

  • Answering questions

  • Data integration

Knowledge that exists only as natural-language text can easily remain hidden.

What is Wikidata?

It is an open knowledge graph, launched in 2012, that provides structured linked data. It can be used to disambiguate entities: the company Apple and the fruit apple each have their own unique identifier. Every Wikidata page represents an entity, which is a node in the knowledge graph, and lists all the connections and relations (edges) it has to other entities. Wikidata as a whole has 80 million nodes and more than 1 billion edges.
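A minimal sketch of why unique identifiers help with disambiguation: the same label can point to several entities, and only the identifier tells them apart. The identifiers `E1` and `E2` and the `candidates` helper are placeholders for illustration, not real Wikidata IDs or part of any Wikidata API.

```python
# Placeholder identifiers, not real Wikidata IDs.
ENTITIES = {
    "E1": {"label": "Apple", "description": "technology company"},
    "E2": {"label": "apple", "description": "fruit"},
}

def candidates(label):
    """Return all entity ids whose label matches, ignoring case."""
    wanted = label.lower()
    return sorted(
        eid for eid, entity in ENTITIES.items()
        if entity["label"].lower() == wanted
    )

# One surface form, two distinct entities
for eid in candidates("apple"):
    print(eid, ENTITIES[eid]["description"])
```

Once a mention has been resolved to one identifier, every edge attached to it refers to exactly one entity, regardless of how the label is spelled.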

What is RDF?

It’s a standard way of sharing knowledge graphs. It represents a knowledge graph as a set of (subject, relation, object) triples. There are many other ways to represent a knowledge graph, and once graphs are in a shareable form, you can start merging different knowledge graphs together.
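To make the sharing and merging idea concrete, here is a small sketch that writes triples in the line-based N-Triples syntax (one RDF serialization) and merges two graphs with a set union. It assumes every term is an IRI; the `http://example.org/` names and the `to_ntriples` helper are hypothetical, and a real project would more likely rely on an RDF library.

```python
def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples lines, sorted for stable output."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples))

EX = "http://example.org/"  # hypothetical namespace

graph_a = {(EX + "Apple_Inc", EX + "foundedBy", EX + "Steve_Jobs")}
graph_b = {
    (EX + "Apple_Inc", EX + "foundedBy", EX + "Steve_Jobs"),
    (EX + "Apple_Inc", EX + "headquarteredIn", EX + "Cupertino"),
}

# Because both graphs use the same identifiers, merging is a set union
# and the duplicated triple collapses into one.
merged = graph_a | graph_b
print(to_ntriples(merged))
```

The merge only works this cleanly because both graphs name the same entity with the same IRI, which is the point of agreeing on shared identifiers.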

What is schema.org?

It provides a shared vocabulary for the labels of edges, so that more data can be published in a consistent form.

Why use knowledge graphs?

You can integrate different knowledge graphs and then form complex queries that answer questions (inferences) requiring multiple chains of reasoning.
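A multi-step question such as "which country is this company based in?" can be sketched as following a chain of predicates through the graph. The triples and the `follow` helper are illustrative assumptions, not a real query engine.

```python
# Hypothetical triples, as if merged from several sources
TRIPLES = {
    ("Apple Inc.", "headquartered_in", "Cupertino"),
    ("Cupertino", "located_in", "California"),
    ("California", "located_in", "United States"),
}

def objects(subject, predicate):
    """All objects reachable from subject over one edge with this predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}

def follow(start, path):
    """Follow a chain of predicates from start; return the set of end nodes."""
    frontier = {start}
    for predicate in path:
        frontier = {o for node in frontier for o in objects(node, predicate)}
    return frontier

# Three hops: company -> city -> state -> country
print(follow("Apple Inc.", ["headquartered_in", "located_in", "located_in"]))
```

No single triple states the answer; it only appears once the three edges are chained together.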

What language is used to query knowledge graphs?

For RDF-based graphs such as Wikidata, the standard query language is SPARQL.

What are the two main types of KG?
  1. Document and NLP based

  2. Entity/Event based

Every knowledge graph should start from ontologies and taxonomies, so that the graph you build can be integrated with other knowledge graphs.

What are the two ways to represent knowledge graphs?
  1. Symbolic – Logic and databases

  2. Vector – NLP and CV

Symbolic representation refers to the triples of (subject, predicate, object), which can be encoded with logic programming. Vector representation refers to graph embeddings.
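To give a feel for the vector side, here is a sketch of a translation-style scoring rule in the spirit of TransE, where a triple is considered plausible when head + relation lands close to tail. TransE is named here as one possible embedding approach, not something these notes prescribe, and the 2-D vectors are hand-picked for illustration rather than learned.

```python
import math

# Hand-picked 2-D vectors for illustration; real embeddings are learned.
ENTITY = {
    "Cupertino": (1.0, 0.0),
    "California": (1.0, 1.0),
    "Steve Jobs": (-2.0, 3.0),
}
RELATION = {"located_in": (0.0, 1.0)}

def distance(head, relation, tail):
    """Euclidean norm of (head + relation - tail); smaller means more plausible."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

print(distance("Cupertino", "located_in", "California"))
print(distance("Cupertino", "located_in", "Steve Jobs"))
```

The symbolic view stores the triple as a fact that is either present or absent, while the vector view gives every candidate triple a graded score.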

Which NLP areas are needed to build knowledge graphs?
  1. Named entity recognition (NER)

  2. Relation Extraction

  3. Question Answering

  4. Language Modelling
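As a toy illustration of how NER and relation extraction feed a knowledge graph, the sketch below pulls one triple out of a sentence with a hand-written pattern. This is an assumption-heavy simplification: the pattern, the `founded_by` predicate, and the `extract_founded_by` helper are hypothetical, and real systems use trained models rather than a single regular expression.

```python
import re

# One hand-written pattern: capitalized name(s), "was founded by", capitalized name(s)
PATTERN = re.compile(
    r"(?P<subject>[A-Z][\w.]*(?: [A-Z][\w.]*)*) was founded by "
    r"(?P<object>[A-Z]\w*(?: [A-Z]\w*)*)"
)

def extract_founded_by(text):
    """Return (subject, 'founded_by', object) triples found in the text."""
    return [
        (m.group("subject"), "founded_by", m.group("object"))
        for m in PATTERN.finditer(text)
    ]

print(extract_founded_by("Apple Inc. was founded by Steve Jobs."))
```

The capitalized spans play the role of recognized entities, and the fixed phrase between them plays the role of the extracted relation.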


