What is the enterprise knowledge graph (EKG) hierarchy of needs?

From the foundation to the top of the pyramid:

  1. Infrastructure – Scalable storage, fault tolerance, networking, monitoring, etc.

  2. Data – High-quality primary data and well-defined domain schemas

  3. Graph – ETL and queries with entity mappings

  4. Knowledge – Ontologies, modality, and provenance

  5. Logic – Inference proofs
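
As a rough illustration of the Graph layer (ETL and queries with entity mappings), the sketch below loads rows from two sources into one graph. The source systems, field names, and the email-based `entity_key` mapping are all hypothetical and are not taken from the talk.

```python
from collections import defaultdict

# Hypothetical rows from two source systems describing the same person.
crm_rows = [{"crm_id": "c1", "email": "Ann@Example.com", "city": "Oslo"}]
billing_rows = [{"acct": "b9", "email": "ann@example.com", "plan": "pro"}]


def entity_key(email: str) -> str:
    """Entity mapping (assumed rule): normalise an email into a shared node id."""
    return "person:" + email.strip().lower()


def to_triples(rows, source):
    """Transform rows into (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        subject = entity_key(row["email"])
        for field, value in row.items():
            if field == "email":
                continue  # the email is already encoded in the subject id
            triples.append((subject, f"{source}:{field}", value))
    return triples


# Load: merge triples from both sources into one property map per node.
graph = defaultdict(dict)
for s, p, o in to_triples(crm_rows, "crm") + to_triples(billing_rows, "billing"):
    graph[s][p] = o


def query(graph, predicate):
    """Return {node: value} for every node that carries the predicate."""
    return {s: props[predicate] for s, props in graph.items() if predicate in props}
```

Because both rows map to the same key, the CRM and billing facts end up on a single node, and `query(graph, "billing:plan")` can answer questions that span both sources.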

What are the challenges of building an EKG?
  1. Real data is messy

  2. Ontologies are hard to build

  3. Building a KG around a business case does not necessarily scale

What are the potential solutions?
  1. Invest in shared vocabulary

  2. Fit the tooling to the infrastructure

  3. Fit the data model to the data

  4. Prioritise the lower levels of the hierarchy

What does data standardisation mean in knowledge graphs?
  1. Controlled vocabularies, which involve basic type aliases, predefined entities and relationships, metadata vocabularies, and structured types for other kinds of data

  2. Elevating domain-specific schemas into ontologies

  3. Tooling that carries schemas between data representation languages
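
To make points 1 and 3 concrete, here is a minimal sketch of a schema written once against a controlled vocabulary of type aliases and then carried into two representation languages. The vocabulary (`TYPE_ALIASES`), the `Trip` schema, and both renderers are assumptions made for illustration; they do not describe any particular tool.

```python
# Hypothetical controlled vocabulary: basic type aliases shared by all schemas.
TYPE_ALIASES = {
    "UserId": "string",
    "Timestamp": "integer",
    "Distance": "number",
}

# Hypothetical domain schema, written once against the vocabulary.
TRIP_SCHEMA = {
    "name": "Trip",
    "fields": {"rider": "UserId", "started_at": "Timestamp", "length_km": "Distance"},
}

# Assumed mapping from the base types to SQL column types.
SQL_TYPES = {"string": "TEXT", "integer": "BIGINT", "number": "DOUBLE PRECISION"}


def check_vocabulary(schema):
    """Return the fields whose type is not in the controlled vocabulary."""
    return [f for f, t in schema["fields"].items() if t not in TYPE_ALIASES]


def to_json_schema(schema):
    """Render the schema as a JSON Schema style dictionary."""
    return {
        "title": schema["name"],
        "type": "object",
        "properties": {
            f: {"type": TYPE_ALIASES[t]} for f, t in schema["fields"].items()
        },
        "required": list(schema["fields"]),
    }


def to_sql_ddl(schema):
    """Render the same schema as a CREATE TABLE statement."""
    cols = ", ".join(
        f"{f} {SQL_TYPES[TYPE_ALIASES[t]]}" for f, t in schema["fields"].items()
    )
    return f"CREATE TABLE {schema['name'].lower()} ({cols});"
```

The schema is defined in one place, so a change to an alias such as `Distance` shows up in every generated representation instead of being edited by hand in each one.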

What is a metadata graph?

At Uber, the metadata graph covers hundreds of thousands of structured datasets. This is important for data protection and user trust. The metadata graph requires manual annotation effort, but the effort pays off because it allows us to standardise and compose schemas. With this in place, we can start investigating efficient reasoning.
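
One way to picture how manual annotations could support reasoning is the sketch below, which pushes tags from annotated datasets to everything derived from them. The dataset names, the `personal_data` tag, and the lineage edges are invented for illustration; this is not a description of Uber's system.

```python
from collections import deque

# Hypothetical manual annotations: dataset -> set of vocabulary tags.
annotations = {
    "raw_trips": {"personal_data"},
    "city_stats": set(),
    "rider_report": set(),
}

# Hypothetical lineage edges: upstream dataset -> downstream datasets.
derived_from = {
    "raw_trips": ["rider_report"],
    "rider_report": ["weekly_email"],
    "city_stats": [],
}


def propagate(annotations, edges):
    """Infer tags for downstream datasets by walking the lineage edges."""
    # Copy the tag sets so the manual annotations stay untouched.
    result = {d: set(tags) for d, tags in annotations.items()}
    queue = deque(d for d, tags in result.items() if tags)
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            child_tags = result.setdefault(child, set())
            # Revisit a child only when it gains a new tag, so cycles terminate.
            if not result[current] <= child_tags:
                child_tags |= result[current]
                queue.append(child)
    return result


inferred = propagate(annotations, derived_from)
```

Here only `raw_trips` is annotated by hand, yet `rider_report` and `weekly_email` both inherit the tag through lineage, while `city_stats` stays untagged.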


