PyTorch Basics 3: Loss Function
To train a model, we need:
- a loss function
- an optimizer
- a model
- data
What is the relation between the loss function and the optimizer?
Recall the simplest PyTorch training loop:
```python
import torch
from torch.optim import SGD
from torch.utils.data import DataLoader

dataset = My_data()
loader = DataLoader(dataset)
model = MyNet()
criterion = torch.nn.CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.01)  # parameters() is a method call; lr chosen arbitrarily

for epoch in range(10):
    for batch, labels in loader:
        outputs = model(batch)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()  # clear gradients from the previous step
        loss.backward()        # compute gradients of the loss w.r.t. the parameters
        optimizer.step()       # update the parameters using those gradients
```
Here `criterion` is the loss function.
The loss is not directly related to the optimizer; the loss is directly related to the gradients:
- The loss computes the gradients at `loss.backward()`; each parameter's `.grad` attribute is filled in.
- The optimizer looks at the gradients of all parameters in `model.parameters()` and then updates them in `optimizer.step()` (see the sketch after this list).
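A minimal sketch of this division of labor, using a single made-up parameter `w` (the values in the comments follow from this toy setup):

```python
import torch

w = torch.tensor([1.0], requires_grad=True)  # a single trainable parameter
optimizer = torch.optim.SGD([w], lr=0.1)

loss = ((2 * w - 4) ** 2).sum()  # a toy scalar loss
loss.backward()                  # loss -> gradients: populates w.grad
print(w.grad)                    # tensor([-8.]) since d/dw (2w - 4)^2 = 4(2w - 4) = -8 at w = 1
optimizer.step()                 # the optimizer reads w.grad and updates w
print(w)                         # tensor([1.8000], ...) = 1.0 - 0.1 * (-8.0)
```

Note that `loss.backward()` never touches the optimizer, and `optimizer.step()` never touches the loss; the parameters' `.grad` attributes are the only channel between them.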
Loss Function
Generally, a loss function can be any function built from PyTorch operations that returns a scalar.
There are two categories of loss:
- Error loss
	- use the output of the model as the prediction, and compare it with the true label
- Likelihood loss
	- there is no "target" and no "logits"; there are only "positive samples" and "negative samples"
	- the goal of your optimizer is to make the overall likelihood of the observations as large as possible
Error loss focuses more on "predictive power", while likelihood loss tries to fit the observed data.
```python
# a general error loss
def error_loss(output, target):
    '''
    output is the output of a single forward pass,
    target is the true label
    '''
    ...
    ...
    return a_scalar  # bigger = worse performance
```
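As a concrete instance of this template, here is a hand-rolled mean squared error (a sketch; in practice you would use the built-in `torch.nn.MSELoss`):

```python
import torch

def mse_loss(output, target):
    # average squared difference between prediction and true label
    return ((output - target) ** 2).mean()  # bigger = worse performance

output = torch.tensor([2.5, 0.0])
target = torch.tensor([3.0, 0.0])
print(mse_loss(output, target))  # tensor(0.1250)
```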
```python
# a general likelihood loss
def likelihood_loss(positive, negative):
    '''
    positive/negative hold log-likelihood terms for the
    positive and negative samples;
    the output is the negative log likelihood
    '''
    L = torch.sum(positive) - torch.sum(negative)
    return -L  # bigger = worse fitting
```
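For a concrete instance, a negative-sampling style loss in the spirit of word2vec might look like the following sketch, assuming `positive` and `negative` hold raw scores (logits) for the positive and negative samples (the name `neg_sampling_loss` is made up for illustration):

```python
import torch
import torch.nn.functional as F

def neg_sampling_loss(positive, negative):
    # push positive scores up and negative scores down;
    # logsigmoid(s) is the log-likelihood of a sample being "real"
    L = F.logsigmoid(positive).sum() + F.logsigmoid(-negative).sum()
    return -L  # bigger = worse fitting

positive = torch.tensor([2.0, 1.5])   # scores for observed samples
negative = torch.tensor([-1.0, 0.3])  # scores for sampled negatives
print(neg_sampling_loss(positive, negative))
```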