So far
complete search for single player (BFS, DFS)
complete search for multi-player games (alpha-beta pruning)
Problem:
what if the game is too large to search completly?
Idea:
heurisitic as evaluation function for non-terminal states
Monte-Carlo Search
random plays until terminal state is reached
heuristics for a node: average reward of random plays
Too optimistic
Too pessimistic
Pros and Cons
Pros
easy to implement
low memory
no game specific knowledge
Cons
does not terminate
wrong assumption: every one plays randomly
not correct result (minimax always computes the best move)
Monte Carlo Tree Search
Zuletzt geändertvor 2 Jahren