Def. Document
Can be anything: a web page, a Word file, a text file, etc.
Def. Collection
A set of documents
Def. Relevance
Does a document satisfy the information need of the user or does it help complete the user’s task?
Document frequency
How many documents contain the term
Term frequency per document
How often the term appears in a given document
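The difference between the two counts can be illustrated with a minimal Python sketch. The toy collection, the `tokenize` helper and the whitespace tokenization are assumptions made only for illustration; they are not part of the notes.

```python
from collections import Counter

# Hypothetical toy collection: document id -> text.
collection = {
    "d1": "the cat sat on the mat",
    "d2": "the dog chased the cat",
    "d3": "birds sing",
}

def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()

# Term frequency per document: how often each term appears in that document.
term_freq = {doc_id: Counter(tokenize(text)) for doc_id, text in collection.items()}

# Document frequency: how many documents contain the term at least once.
doc_freq = Counter()
for counts in term_freq.values():
    doc_freq.update(counts.keys())

print(term_freq["d1"]["the"])  # 2 (appears twice in d1)
print(doc_freq["the"])         # 2 (contained in d1 and d2)
print(doc_freq["birds"])       # 1 (contained in d3 only)
```

Note that "the" occurs four times in the whole collection, but its document frequency is 2, because document frequency counts documents, not occurrences.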
4 Types of queries
Exact matching (query terms concatenated with OR)
Boolean queries (AND / OR / NOT operators)
Expanded queries (incorporate synonyms)
Wildcard queries, phonetic queries, phrase queries
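A rough sketch of how Boolean and expanded queries could be evaluated, assuming each term maps to a set of document ids. The `postings` data, the `synonyms` table and the helper names `docs` and `expanded` are hypothetical and serve only as an illustration.

```python
# Hypothetical postings: term -> set of ids of the documents containing it.
postings = {
    "cat": {1, 2, 4},
    "dog": {2, 3},
    "hound": {5},
}
all_docs = {1, 2, 3, 4, 5}

def docs(term):
    # Unknown terms match no document.
    return postings.get(term, set())

# Boolean operators map directly onto set operations.
cat_and_dog = docs("cat") & docs("dog")   # AND -> intersection
cat_or_dog = docs("cat") | docs("dog")    # OR  -> union
not_dog = all_docs - docs("dog")          # NOT -> complement

# Expanded query: OR the term together with its (assumed) synonyms.
synonyms = {"dog": ["hound"]}

def expanded(term):
    result = set(docs(term))
    for synonym in synonyms.get(term, []):
        result |= docs(synonym)
    return result

print(sorted(cat_and_dog))      # [2]
print(sorted(cat_or_dog))       # [1, 2, 3, 4]
print(sorted(not_dog))          # [1, 4, 5]
print(sorted(expanded("dog")))  # [2, 3, 5]
```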
Explain the functionality of a dictionary
The Dictionary<T> maps text (a term) to a value of type T
T is a posting list or, depending on the index, other data about the term
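As a minimal sketch, a dictionary whose T is a posting list can be modeled in Python as a plain `dict` from term to a sorted list of document ids. The function name `build_dictionary`, the sample collection and the whitespace tokenization are assumptions for illustration, not a prescribed implementation.

```python
from collections import defaultdict

def build_dictionary(collection):
    """Map each term to a sorted posting list of document ids."""
    index = defaultdict(set)
    for doc_id, text in collection.items():
        # Simplifying assumption: lowercase and split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical collection: document id -> text.
collection = {
    1: "new home sales",
    2: "home prices rise",
    3: "new car sales",
}

dictionary = build_dictionary(collection)

print(dictionary["home"])          # [1, 2]
print(dictionary["new"])           # [1, 3]
print(dictionary.get("boat", []))  # [] (term not in the collection)
```

A hash-based `dict` gives fast random lookup of a single term; other structures (for example sorted arrays or tries) trade lookup speed against memory use and support for prefix searches.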
What are the 3 desired properties of dictionaries?
Random lookup
Fast (creation & lookup)
Memory efficient
What are the limitations of relevance?
Relevance to the need rather than to the query
The query is shorthand for an instance of an information need: its initial verbalized presentation by the user
Relevance is assumed to be a binary attribute
A document is either relevant to a query / need or it is not
What is term frequency?
The number of occurrences of term t in document d
Inverse Document Frequency (IDF)
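The card gives no definition here. One common textbook formulation, shown only as a sketch, is idf(t) = log(N / df(t)), where N is the number of documents in the collection and df(t) is the document frequency of term t. The base-10 logarithm and the function name `idf` are assumptions; other variants (different bases, smoothing) exist.

```python
import math

def idf(num_docs, doc_freq):
    # Assumed formulation: log10(N / df). Rare terms get a high weight,
    # terms that appear in every document get weight 0.
    return math.log10(num_docs / doc_freq)

N = 1000  # hypothetical collection size

print(idf(N, 1))     # 3.0 (term in a single document)
print(idf(N, 100))   # 1.0
print(idf(N, 1000))  # 0.0 (term in every document)
```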