I went through another 46 NLP interview questions, and below are the ones I learned from or consolidated 🙂

What is latent semantic indexing (LSI) and where can it be applied?

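LSI is an indexing and retrieval method that uses singular value decomposition (SVD) to detect relationships between terms and concepts in text documents. It is commonly applied in information retrieval (search), document clustering and document classification, where it helps match texts that are related in meaning even when they do not share the exact same words.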
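
To make the SVD step concrete, here is a minimal sketch using NumPy. The four toy documents, the choice of k = 2 and the helper names (cosine, raw_sim, lsi_sim) are my own assumptions for illustration, not part of any standard LSI recipe. It builds a term-document count matrix, keeps the two largest singular values, and compares terms in the reduced space.

```python
import numpy as np

# Toy corpus (made up for illustration).
docs = [
    "car engine repair",
    "automobile engine repair",
    "banana fruit smoothie",
    "fruit banana recipe",
]

vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Term-document count matrix: rows are terms, columns are documents.
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1

# SVD, then keep only the k largest singular values (the latent "concepts").
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
term_vecs = U[:, :k] * s[:k]  # one k-dimensional vector per term


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def raw_sim(w1, w2):
    """Cosine similarity of two terms using raw document counts."""
    return cosine(X[index[w1]], X[index[w2]])


def lsi_sim(w1, w2):
    """Cosine similarity of two terms in the k-dimensional latent space."""
    return cosine(term_vecs[index[w1]], term_vecs[index[w2]])


print("raw car/automobile:", raw_sim("car", "automobile"))
print("LSI car/automobile:", lsi_sim("car", "automobile"))
print("LSI car/banana:    ", lsi_sim("car", "banana"))
```

In this toy corpus "car" and "automobile" never appear in the same document, so their raw similarity is zero, but both co-occur with "engine" and "repair", so they end up pointing the same way in the latent space. A real system would typically apply TF-IDF weighting first and use a much larger k.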

What is pragmatic analysis?

Pragmatic analysis is the task of extracting information from text using external knowledge that lies outside the documents or queries themselves.

Briefly describe word2vec.

Word2Vec is an example of distributional word embeddings that are fixed: each word gets a single vector regardless of the sentence it appears in. It encodes words into a lower-dimensional vector space using a shallow neural network. There are two versions of this model, namely skip-gram and CBOW (continuous bag of words). Skip-gram takes in a target word and predicts the surrounding context words, whereas CBOW takes in the surrounding context words and predicts the target word. The idea is to encode words as vectors such that words with similar meanings have vectors that are close to each other.
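
The difference between the two variants is easiest to see in the training examples each one is built from. Below is a rough sketch in plain Python. The function names skipgram_pairs and cbow_pairs and the window parameter are my own naming for illustration, and the sketch only generates the (input, output) examples; it does not train the network.

```python
def skipgram_pairs(tokens, window=2):
    """(target, context) pairs: the target word predicts each neighbour."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def cbow_pairs(tokens, window=2):
    """(context, target) pairs: the neighbours jointly predict the target."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        pairs.append((context, target))
    return pairs


tokens = "the quick brown fox jumps".split()
print(skipgram_pairs(tokens, window=1))
print(cbow_pairs(tokens, window=1))
```

Skip-gram produces one example per (target, neighbour) pair, while CBOW produces one example per position with all neighbours grouped as the input. In practice you would hand tokenized sentences to a library implementation rather than build these pairs by hand.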

Ryan

Data Scientist
