Working Notes: a commonplace notebook for recording & exploring ideas.
Softmax
- y_i = e^(x_i) / sum over j of e^(x_j)
- Easy way to turn arbitrary scores into positive numbers that sum to 1, i.e. a probability distribution
- John Bridle recommends calling it softargmax
- Used a lot for classification, usually as the last layer, because the outputs look like probabilities
- As a score (a softmax output) gets close to zero, its log blows up towards negative infinity
- Better off using a log-softmax module directly, rather than taking the log of the softmax output, to avoid numerical instability [[Why]]
- With two inputs x and 0 this is the sigmoid function: e^x / (e^x + e^0) = 1 / (1 + e^(-x))!
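
A rough sketch of the notes above in plain Python, just to make the numbers concrete. The names `softmax`, `log_softmax` and `sigmoid` are made up for illustration, and subtracting the max before exponentiating is a standard stability trick that I am assuming here; it is not something these notes spell out.

```python
import math

def softmax(xs):
    """y_i = e^(x_i) / sum over j of e^(x_j)."""
    m = max(xs)  # assumption: shift by the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def log_softmax(xs):
    """log(softmax(xs)) computed in one step via log-sum-exp."""
    m = max(xs)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_sum_exp for x in xs]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Positive outputs that sum to 1.
probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))

# One score far below the other: its softmax output underflows to 0.0,
# so taking math.log() of it afterwards would raise ValueError.
extreme = [0.0, -1000.0]
print(softmax(extreme))
# Taking the log inside the same computation stays finite.
print(log_softmax(extreme))

# Softmax over the two inputs (x, 0) matches sigmoid(x).
x = 1.5
print(softmax([x, 0.0])[0], sigmoid(x))
```

The log-softmax version never forms the tiny probability at all, which is one plausible reading of why the combined module is the safer choice.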
— Kunal