Latent Space and Contrastive Learning
Latent Space
- Definition: a representation of compressed data
  - Data compression: the process of encoding information using fewer bits than the original representation
- Ekin Tiu has a Medium article about why it is called latent “space”
- Tasks where latent space is necessary:
  - Representation learning:
    - Definition: a set of techniques that allow a system to discover the representations needed for feature detection or classification from raw data
    - The latent space representation of our data must contain all the important info (features) needed to represent our original data input
  - Manifold learning (a subfield of representation learning):
    - Definition: groups or subsets of data that are “similar” in some way in the latent space, which does not quite show in the higher-dimensional space
    - Manifolds just mean groups of similar data (see the sketch below)
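To make “groups of similar data in a lower-dimensional space” concrete, here is a minimal scikit-learn sketch; the digits dataset and Isomap are assumptions chosen for illustration, not something from these notes:

```python
# Minimal manifold-learning sketch: embed 64-dimensional digit images into a
# 2-D latent space and check that similar samples land near each other.
from sklearn.datasets import load_digits
from sklearn.manifold import Isomap

X, y = load_digits(return_X_y=True)            # raw data: 64-D pixel vectors
Z = Isomap(n_components=2).fit_transform(X)    # 2-D latent coordinates

# Images of the same digit tend to form a group ("manifold") in Z,
# even though that grouping is hard to see in the original 64-D space.
print(Z.shape)         # (1797, 2)
print(Z[y == 0][:3])   # a few latent points for the digit "0"
```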
Autoencoders and Generative Models
- Autoencoders: a neural network that acts as an identity function and has both an encoder and a decoder
  - We need the model to compress the representation (encode) in a way that we can accurately reconstruct it (decode)
    - i.e. image in, image out; audio in, audio out (see the sketch below)
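A minimal PyTorch sketch of that idea; the layer sizes, latent dimension, and the random stand-in batch are assumptions for illustration:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Identity-like network: compress to a small latent vector, then reconstruct."""
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),   # compressed (latent) representation
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),    # reconstruction of the original input
        )

    def forward(self, x):
        z = self.encoder(x)               # point in latent space
        return self.decoder(z), z

# Train so the output matches the input (image in, image out).
model = Autoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)                   # stand-in batch of flattened images
recon, z = model(x)
loss = nn.functional.mse_loss(recon, x)   # reconstruction loss
loss.backward()
optimizer.step()
```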
- Generative models: interpolate in latent space to generate a “new” image
  - Interpolate: estimate the dependent variable at values of the independent variable that fall in between known points
  - Example: if chair images have 2D latent space vectors [0.4, 0.5] and [0.45, 0.45], whereas a table has [0.6, 0.75], then to generate a picture that is a morph between a chair and a table, we would sample points in latent space between the chair cluster and the table cluster (see the sketch below)
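A minimal NumPy sketch of that interpolation, using the example vectors above; the `decoder` mentioned in the comment is a hypothetical trained decoder, not defined here:

```python
import numpy as np

z_chair = np.array([0.4, 0.5])     # latent vector of a chair image
z_table = np.array([0.6, 0.75])    # latent vector of a table image

# Sample points on the straight line between the two latent vectors.
# Decoding each one would yield a gradual chair-to-table morph.
for t in np.linspace(0.0, 1.0, num=5):
    z_new = (1 - t) * z_chair + t * z_table   # linear interpolation
    print(round(t, 2), z_new)
    # image = decoder(z_new)   # hypothetical trained decoder
```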
- Difference between discriminative and generative models:
  - Generative models can generate new data instances; they capture the joint probability p(X, Y), or p(X) if Y does not exist (see the sketch below)
  - Discriminative models classify instances into different labels; they capture p(Y | X) -> given the image, how likely is it a cat?
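A minimal scikit-learn sketch of the difference; the toy 2-D data is an assumption for illustration, and the `theta_` / `var_` attribute names assume a recent scikit-learn version:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

# Toy data (illustrative assumption): 2-D features, labels 0 = "cat", 1 = "dog".
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(3.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Generative: Gaussian naive Bayes fits p(X | Y) and p(Y), i.e. the joint p(X, Y).
gen = GaussianNB().fit(X, y)
# Because it models p(X | Y), we can sample a "new" feature vector for class 0
# from the fitted per-class Gaussians (theta_ = means, var_ = variances).
x_new = rng.normal(gen.theta_[0], np.sqrt(gen.var_[0]))
print("sampled features for class 0:", x_new)

# Discriminative: logistic regression only models p(Y | X) -- it can classify,
# but it has no model of the inputs themselves, so it cannot generate new data.
disc = LogisticRegression().fit(X, y)
print("p(Y | X) for one input:", disc.predict_proba(X[:1]))
```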
Contrastive Learning with SimCLRv2
- Definition: a technique that learns general features of a dataset without labels by teaching the model which data points are similar or different
  - Happens before classification or segmentation
  - A type of self-supervised learning; the other type is non-contrastive learning
  - Can significantly improve model performance even when only a fraction of the dataset is labeled
- Process (see the sketch below):
  - Data augmentation: create two augmented views of each image using two augmentation combos (i.e. crop + resize + recolor, etc.)
  - Encoding: feed the two augmented images into a deep learning model to create vector representations
    - Goal is to train the model to output similar representations for similar images
  - Minimize loss: maximize the similarity of the two vector representations by minimizing a contrastive loss function
    - Goal is to quantify the similarity of the two vector representations, then maximize the probability that the two vector representations are similar
    - We use cosine similarity as an example way to quantify similarity: the angle between the two vectors in space; the closer they are, the bigger the similarity score
    - Next, compute the probability with a softmax over the similarity scores of all pairs in the batch
    - Last, we apply -log() to make it a loss function, so that minimizing this value corresponds to maximizing the probability that the two views of a pair are similar
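A minimal PyTorch sketch of those three loss steps for one batch. This follows the NT-Xent loss used in SimCLR; the batch size, embedding dimension, temperature, and the random tensors standing in for encoder outputs are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """Contrastive loss for two batches of embeddings, where z1[i] and z2[i]
    come from two augmented views of the same image."""
    n = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # unit vectors, so dot product = cosine similarity
    sim = z @ z.T / temperature                          # pairwise cosine similarities, temperature-scaled
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))  # a view is never its own positive

    # For row i, the positive is the other augmented view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])

    # cross_entropy = softmax over each row of similarities, then -log(prob of the positive).
    return F.cross_entropy(sim, targets)

# Illustrative usage with random tensors standing in for encoder outputs.
z1 = torch.randn(8, 128, requires_grad=True)   # encoded views from augmentation combo 1
z2 = torch.randn(8, 128, requires_grad=True)   # encoded views from augmentation combo 2
loss = nt_xent_loss(z1, z2)
loss.backward()        # in real training, gradients flow back into the encoder
print(loss.item())
```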