What is Natural Language Processing (NLP)?
A subfield of linguistics, computer science, information engineering and AI
Concerned with the interaction between humans and computers, in particular how computers can process and analyze natural language
Improve communication between computers and humans
Improve the communication between humans –> Babel Fish, Google Translate
Gain knowledge from text
What is Eliza and how does it work and what was its impact in the field of AI?
It is a program by Joseph Weizenbaum that pretends to be a therapist
Scans input sentences for keywords
Analyzes input sentences according to transformation rules
e.g.: I feel lonely –> do you often feel lonely?
Responds according to reassembly rules
In case it is stuck, it has some stock answers
Eliza is one of the best-known chatbots to this day
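The keyword/transformation/reassembly steps above can be sketched in a few lines. The rules and stock answers below are illustrative assumptions, not Weizenbaum's originals:

```python
import re

# Minimal ELIZA-style sketch: scan the input for a keyword pattern,
# apply a transformation rule, else fall back to a stock answer.
RULES = [
    (re.compile(r"\bI feel (?P<x>.+)", re.IGNORECASE),
     "Do you often feel {x}?"),
    (re.compile(r"\bI am (?P<x>.+)", re.IGNORECASE),
     "How long have you been {x}?"),
]
STOCK_ANSWERS = ["Please tell me more.", "I see."]

def respond(sentence: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(sentence)
        if match:
            # Reassembly rule: plug the captured text back into the template.
            return template.format(x=match.group("x").rstrip(".!?"))
    return STOCK_ANSWERS[0]  # stock answer when no keyword matches

print(respond("I feel lonely"))  # Do you often feel lonely?
```

The example reproduces the "I feel lonely –> Do you often feel lonely?" transformation from the notes.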
What is PARRY?
PARRY was also an NLP program; instead of pretending to be a therapist, it simulates a patient with paranoid schizophrenia
What was STUDENT able to do?
STUDENT - also an NLP program - was able to solve simple high-school math problems
What could SHRDLU do?
SHRDLU was an NLP program that was able to answer questions, but only in a certain environment, which happens to be the blocks world
Why is NLP so hard?
Natural language is highly ambiguous, in terms of:
Lexical ambiguity –> the same word has different meanings depending on the context (e.g. "bank")
Syntactic ambiguity –> the same sentence may have different grammatical interpretations
Semantic ambiguity –> the interpretation of a sentence may depend on the context of the sentence and may require a deep understanding of our world
Discourse –> the meaning of a sentence also depends on the previous conversation
What are traditional NLP Tasks?
How is word segmentation working?
It divides the input text into small semantic entities
It identifies basic entities, e.g. words
New York –> two words but one token
Finlands, Finland’s –> two different spellings but the same meaning
Different notations of dates or abbreviations
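A toy word segmenter illustrating the points above. The multiword list and the rules are illustrative assumptions, not a real tokenizer:

```python
import re

# Sketch of a rule-based word segmenter (toy rules only).
MULTIWORD_TOKENS = {"New York"}  # two words, but one token

def tokenize(text: str):
    # Protect known multiword tokens by joining them with an underscore.
    for mwe in MULTIWORD_TOKENS:
        text = text.replace(mwe, mwe.replace(" ", "_"))
    # Extract word-like units; punctuation such as the final period is dropped.
    tokens = re.findall(r"[\w']+", text)
    return [t.replace("_", " ") for t in tokens]

print(tokenize("She moved to New York."))  # ['She', 'moved', 'to', 'New York']
```

A real tokenizer would also normalize spelling variants (Finlands vs. Finland's) and handle dates and abbreviations.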
How does part of speech (POS) tagging work?
It assigns each word its most probable role in the sentence
POS tagging classifies a word into a category (noun, verb, adjective, etc.) depending on its definition and its context
How does syntactic analysis work?
It tries to find the most likely grammatical interpretation of the sentence
Breaks a sentence down to its grammatical structure (noun phrase, verb phrase, prepositional phrase)
Marx Brothers joke: the meaning of a sentence cannot always be determined by its grammatical structure alone ("I shot an elephant in my pajamas. How he got in my pajamas, I don't know.")
How does semantic analysis work?
It tries to find the most likely meaning of a sentence and the referent of each word
Mostly implemented with so-called word embeddings –> in an n-dimensional vector space, the distance between two vectors reflects the similarity of the corresponding words
Traditional approach, e.g.: Bag of words:
each word has its own dimension
State of the art, e.g. Word2Vec:
How does the Bag-of-words model work?
A text is represented as a vector in an n-dimensional vector space
Each subpart of the text (each word) has its own dimension
The value of each word is determined by its number of occurrences in the text
The text is then a linear combination of the vectors and computations can be done with linear algebra
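A minimal sketch of the bag-of-words vectorization described above, with a toy vocabulary:

```python
from collections import Counter

# Bag-of-words: each vocabulary word gets one dimension; a document's
# value in that dimension is how often the word occurs in the document.
def bow_vector(text: str, vocabulary: list):
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

vocab = ["nlp", "is", "fun"]
print(bow_vector("NLP is fun and NLP is hard", vocab))  # [2, 2, 1]
```

Words outside the vocabulary ("and", "hard") are simply dropped, which is exactly the information the model discards: word order and unseen words.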
How does Word2Vec work and what are its two variants?
Each word is a dense vector and similarities can be computed via cosine similarity
A DL model is trained on the data in a supervised fashion, using the context of the word as an additional input
Variant 1: continuous bag of words –> predict the current word from a window of surrounding words
Variant 2: Skip gram –> use the current word to predict the context window
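The cosine-similarity check mentioned above can be sketched with tiny made-up 3-d vectors (hypothetical values, not trained Word2Vec output):

```python
import math

# Toy "embeddings" for illustration only.
EMBEDDINGS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.85, 0.75, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    # Cosine of the angle between two vectors: dot product over norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sim_kq = cosine(EMBEDDINGS["king"], EMBEDDINGS["queen"])
sim_ka = cosine(EMBEDDINGS["king"], EMBEDDINGS["apple"])
print(sim_kq > sim_ka)  # True: "king" is closer to "queen" than to "apple"
```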
What are NLP tasks?
Information retrieval –> find documents relevant for a question
Information extraction –> extract relevant information from these documents
Text Categorization –> put texts into categories
Text Summarization –> summarize given documents
Machine Translation –> translate text from one language to another
Text Generation –> generate coherent text on a topic
What is text categorization?
Text categorization assigns labels to each document, labels might be:
How does the bag of words model work?
Basic idea: If we repeatedly draw from a “bag of words” this results in a sequence of randomly chosen words, which may be seen as a document. Some words may occur more often than others
Probabilistic Document Model: we can assign probabilities to each word depending on how often it is drawn
Class-conditional Probabilities: different classes have different bags of words, so the probability of a certain word differs from bag to bag
Probabilistic Text Classification: answers the question from which bag a document was generated
How does the naive Bayes Classifier work?
It predicts the class that maximizes the class prior times the product of the class-conditional term probabilities: c = argmax_c P(c) * Π_i P(t_i | c)
Where c is the predicted class
d is the given document
t_1 … t_n are the terms of the document d
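A small multinomial naive Bayes sketch with made-up training documents and add-one smoothing; it computes in log space to avoid underflow:

```python
import math
from collections import Counter

# Toy training data: (class, document) pairs; each class is a "bag of words".
train = [("spam", "win money now"), ("spam", "win win prize"),
         ("ham", "meeting at noon"), ("ham", "lunch at noon")]

classes = {c for c, _ in train}
priors = {c: sum(1 for cl, _ in train if cl == c) / len(train) for c in classes}
bags = {c: Counter(w for cl, doc in train if cl == c for w in doc.split())
        for c in classes}
vocab = {w for _, doc in train for w in doc.split()}

def predict(document: str) -> str:
    def log_score(c):
        total = sum(bags[c].values())
        # log P(c) + sum of log P(t_i | c), with add-one (Laplace) smoothing
        return math.log(priors[c]) + sum(
            math.log((bags[c][w] + 1) / (total + len(vocab)))
            for w in document.split())
    return max(classes, key=log_score)

print(predict("win money"))  # spam
```

This directly answers "from which bag was the document generated?": the class with the highest posterior score wins.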
How does machine translation work today?
Machine translators use an encoder for the input language and a decoder for the output language –> end-to-end learning
The encoder produces a signature of the sentence, from which the decoder generates the sentence in the target language
What are language models and what do they do?
Language models try to estimate the probability of each word given the prior context and can generate text
The number of parameters increases exponentially with the number of words forming the context
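A bigram (context of one word) language model sketch on a toy corpus. With vocabulary size V, an n-gram model needs on the order of V**n parameters, which is the exponential growth mentioned above:

```python
from collections import Counter, defaultdict

# Toy corpus; a real model would be trained on far more text.
corpus = "the cat sat on the mat the cat ate".split()

# Count bigrams: how often each word follows each previous word.
bigram_counts = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    bigram_counts[prev][word] += 1

def next_word(prev: str) -> str:
    # Most probable continuation given the one-word context.
    return bigram_counts[prev].most_common(1)[0][0]

print(next_word("the"))  # "cat": follows "the" twice, vs. "mat" once
```

Repeatedly calling `next_word` on its own output is the simplest form of text generation with such a model.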
How does GPT-3 work?
GPT-3 is a language model trained on a huge dataset containing books, Wikipedia and website content
It can generate coherent text, and it is challenging to tell it apart from human-generated text
But it also falls into traps:
e.g.: "How many eyes does my foot have?"