Pytorch Basics 3: Loss Function

To train a model, we need:

  • a loss function
  • an optimizer
  • a model
  • data

What is the relation between the loss function and the optimizer?

Recall the simplest PyTorch training loop:


import torch
from torch.optim import SGD
from torch.utils.data import DataLoader

dataset = My_data()                            # your Dataset
loader = DataLoader(dataset)
model = MyNet()                                # your model
criterion = torch.nn.CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.01)   # parameters() is a method call; SGD also needs a learning rate

for epoch in range(10):
    for batch, labels in loader:
        outputs = model(batch)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()                  # clear gradients from the previous step
        loss.backward()                        # compute gradients of the loss w.r.t. every parameter
        optimizer.step()                       # update the parameters using those gradients



Here, criterion is the loss function.
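
For example, torch.nn.CrossEntropyLoss expects raw logits of shape (N, C) and integer class labels of shape (N,). A minimal sketch with made-up tensors:

import torch

criterion = torch.nn.CrossEntropyLoss()

outputs = torch.randn(4, 3)           # 4 samples, 3 classes: raw logits from the model
labels = torch.tensor([0, 2, 1, 0])   # integer class indices, one per sample

loss = criterion(outputs, labels)     # a single scalar tensor
print(loss.item())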

The loss is not directly connected to the optimizer; it is directly connected to the gradients.

  • loss.backward() computes the gradient of the loss with respect to every parameter
  • the optimizer then reads those gradients on the parameters in model.parameters() and updates them in optimizer.step() (see the sketch below)
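
A minimal sketch of that hand-off, using a tiny stand-in linear model and made-up data (both purely illustrative): gradients appear on each parameter's .grad attribute after backward(), and step() only consumes those .grad values.

import torch
from torch.optim import SGD

model = torch.nn.Linear(3, 2)                  # tiny stand-in model
criterion = torch.nn.MSELoss()
optimizer = SGD(model.parameters(), lr=0.1)

x, y = torch.randn(5, 3), torch.randn(5, 2)    # made-up data
loss = criterion(model(x), y)

print(model.weight.grad)                       # None: no gradients yet
loss.backward()                                # the loss fills in .grad for every parameter
print(model.weight.grad)                       # now a tensor of gradients

before = model.weight.clone()
optimizer.step()                               # the optimizer only reads .grad and updates the parameters
print(torch.equal(before, model.weight))       # False: the weights moved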

Loss Function

Generally, a loss function can be any function built from PyTorch operations that returns a scalar tensor (and is differentiable, so loss.backward() can compute gradients).
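
For instance, a mean-squared-error loss written by hand from basic tensor operations; the function name and data below are just illustrative:

import torch

def my_mse_loss(output, target):
    # any composition of differentiable PyTorch ops works,
    # as long as the result is a scalar tensor
    return ((output - target) ** 2).mean()

output = torch.randn(8, 1, requires_grad=True)   # stand-in for a model output
target = torch.randn(8, 1)
loss = my_mse_loss(output, target)
loss.backward()                                  # gradients flow through the custom loss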

There are two categories of loss:

  1. Error Loss
    • use the model output as the prediction, and compare it with the true label
  2. Likelihood Loss
    • the goal of the optimizer is to make the overall likelihood of the observations as large as possible
      • there are no “targets” and “logits”
      • there are only “positive samples” and “negative samples”

Error loss focuses on “predictive power”, while likelihood loss tries to fit the observed data.


# a general error loss

def error_loss(output, target):
    '''
    output: the model output from a single forward pass
    target: the true label
    '''
    ...
    ...
    return a_scalar  # bigger = worse performance

# a general likelihood loss

def likelihood_loss(positive, negative):
    '''
    positive / negative: log-likelihood terms of the positive / negative samples
    returns the negative log likelihood
    '''
    L = torch.sum(positive) - torch.sum(negative)
    return -L  # bigger = worse fitting
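
As a hedged usage sketch of the likelihood template, assume a word2vec-style negative-sampling setup with a toy embedding table and a dot-product score; every name and tensor below is illustrative, not part of the post above:

import torch
import torch.nn.functional as F

def likelihood_loss(positive, negative):         # the template from above
    return -(torch.sum(positive) - torch.sum(negative))

emb = torch.nn.Embedding(1000, 64)               # toy embedding table
opt = torch.optim.SGD(emb.parameters(), lr=0.05)

# made-up batch: observed (positive) pairs and randomly sampled (negative) pairs
pos_u, pos_v = torch.randint(0, 1000, (32,)), torch.randint(0, 1000, (32,))
neg_u, neg_v = torch.randint(0, 1000, (32,)), torch.randint(0, 1000, (32,))

# log-likelihood-style terms: training pushes positive scores up and negative scores down
positive = F.logsigmoid((emb(pos_u) * emb(pos_v)).sum(dim=1))
negative = F.logsigmoid((emb(neg_u) * emb(neg_v)).sum(dim=1))

loss = likelihood_loss(positive, negative)
opt.zero_grad()
loss.backward()
opt.step()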

