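What is a main problem with distances in high-dimensional space?
Pairwise distances concentrate: the nearest and the farthest neighbor of a point end up at almost the same distance, so distances are no longer meaningful for telling points apart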
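A quick way to see this effect is a small simulation. The sketch below assumes a toy setup that is not part of the card: uniformly random points in the unit cube, Euclidean distance, and a hypothetical helper `relative_contrast` that reports (max - min) / min of the distances from one point to all others. The value should shrink as the dimension grows.

```python
import numpy as np

def relative_contrast(n_points, dim, seed=0):
    """(max - min) / min of the distances from the first point to all others."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(size=(n_points, dim))        # assumed toy data: uniform in [0, 1]^dim
    d = np.linalg.norm(X[1:] - X[0], axis=1)     # Euclidean distances to point 0
    return (d.max() - d.min()) / d.min()

# Print the contrast for increasing dimension.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(500, dim), 3))
```

A contrast near 0 means the nearest and farthest points are almost equally far away.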
How does Isomap work?
Start with a distance matrix (any distance measure you like)
Build a k-nearest-neighbor graph from that distance matrix
Compute the pairwise shortest-path distances on the graph
Use these new (geodesic) distances in place of the original ones, typically as input to MDS
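The steps above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `geodesic_distances` is a hypothetical helper name, the graph is symmetrised, shortest paths use a plain Floyd-Warshall loop, and the sketch stops at the new distance matrix. The half-circle data is an assumed toy example.

```python
import numpy as np

def geodesic_distances(D, k):
    """Shortest-path distances on the symmetric k-nearest-neighbor graph of D."""
    n = D.shape[0]
    G = np.full((n, n), np.inf)          # inf = no edge
    np.fill_diagonal(G, 0.0)
    # k nearest neighbors per point; column 0 is assumed to be the point
    # itself (zero diagonal, no duplicate points).
    nn = np.argsort(D, axis=1)[:, 1:k + 1]
    for i in range(n):
        G[i, nn[i]] = D[i, nn[i]]
        G[nn[i], i] = D[i, nn[i]]        # symmetrise the graph
    # Floyd-Warshall: allow each node m in turn as an intermediate stop.
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    return G

# Assumed toy data: 20 points on a half circle of radius 1.
theta = np.linspace(0.0, np.pi, 20)
X = np.column_stack([np.cos(theta), np.sin(theta)])
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
G = geodesic_distances(D, k=2)
print(D[0, -1], G[0, -1])   # straight-line vs. along-the-arc distance
```

The straight-line distance between the two end points cuts across the half circle, while the graph distance follows the arc, which is the structure Isomap tries to preserve.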
What does t-SNE stand for?
t-distributed Stochastic Neighbor Embedding
How does t-SNE work?
Compute distances between all samples
Compute the “unscaled similarity scores” by evaluating a probability density (a Gaussian) at each distance. The distribution is centered on our point of “interest”. Its width depends on the density of the data around the point of interest
Scale these similarity scores so that they add up to 1
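The three steps listed above can be sketched as follows. Some parts are assumptions beyond the card: the hypothetical helper `conditional_similarities`, the Gaussian kernel, and choosing each point's width by binary search so that the row reaches a target perplexity (a common way to make the width follow the local density). The sketch covers only these steps, not the rest of t-SNE.

```python
import numpy as np

def conditional_similarities(D, perplexity=5.0, n_iter=50):
    """Row-normalised Gaussian similarities p(j|i) from a distance matrix D."""
    n = D.shape[0]
    P = np.zeros((n, n))
    target_entropy = np.log(perplexity)
    for i in range(n):
        d2 = np.delete(D[i], i) ** 2          # squared distances to all other points
        lo, hi = 0.0, np.inf                  # bounds on beta = 1 / (2 * sigma_i**2)
        beta = 1.0
        for _ in range(n_iter):
            # Unscaled similarity scores; subtracting the minimum avoids underflow.
            w = np.exp(-beta * (d2 - d2.min()))
            p = w / w.sum()                   # scale so the scores add up to 1
            entropy = -np.sum(p * np.log(p + 1e-12))
            if entropy > target_entropy:      # too flat: make the Gaussian narrower
                lo = beta
                beta = beta * 2 if hi == np.inf else (lo + hi) / 2
            else:                             # too peaked: make the Gaussian wider
                hi = beta
                beta = (lo + hi) / 2
        P[i, np.arange(n) != i] = p
    return P

# Assumed toy data: 30 random points in 3 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
P = conditional_similarities(D, perplexity=5.0)
print(P.sum(axis=1)[:3])
```

Points in dense regions end up with a narrow Gaussian and points in sparse regions with a wide one, so every point has roughly the same effective number of neighbors.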
What is crucial for t-SNE and UMAP to work properly?
Their initialization
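The card does not say which initialization to use. One commonly used informative choice, shown here purely as an assumption, is to start from the first principal components, rescaled to a small spread. `pca_initialization` and `target_std` are hypothetical names for this sketch.

```python
import numpy as np

def pca_initialization(X, n_components=2, target_std=1e-4):
    """Project X onto its first principal components and shrink the result."""
    Xc = X - X.mean(axis=0)                       # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    Y = U[:, :n_components] * S[:n_components]    # principal component scores
    # Rescale so the first component has a small standard deviation
    # (assumed value; embeddings usually start tightly packed).
    return Y / Y[:, 0].std() * target_std

# Assumed toy data: 100 random points in 10 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
Y0 = pca_initialization(X)
print(Y0.shape)
```

An array like `Y0` can be passed as the starting embedding where a library accepts one; scikit-learn's `TSNE`, for example, takes `init="pca"` or an array, and umap-learn's `UMAP` takes an `init` argument as well.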