Working Notes: a commonplace notebook for recording & exploring ideas.

DL Intro

The dataset is just a small visible part of the underlying function; we try to find that function.

A hyperplane is a subspace with one dimension less than the ambient space (eg. a line in a plane).

  - Perceptrons:

f(x) = 1 if w·x + b >= 0; 0 if w·x + b < 0. w and b define the perceptron.
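The decision rule above can be sketched in Python (the weights and bias below are illustrative, not from the notes):

```python
# Perceptron decision rule: output 1 if w·x + b >= 0, else 0.
def perceptron(w, x, b):
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if activation >= 0 else 0

# Illustrative weights: with this w and b the unit behaves like AND.
w = [1.0, 1.0]
b = -1.5
perceptron(w, [1, 1], b)  # 1 + 1 - 1.5 = 0.5 >= 0, so output 1
perceptron(w, [1, 0], b)  # 1 - 1.5 = -0.5 < 0, so output 0
```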

Optimization Algo

f(x) = w·x + b

f(x) = 1 / (1 + e^-(w·x + b)); evaluate f(x) as a probability.

Take the product of the per-example probabilities (the likelihood); the product is differentiable but doesn't have a closed-form solution.

Use the log likelihood instead, which turns the product into a sum.
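A minimal sketch of the logistic model and its log likelihood, assuming the standard sigmoid form of f(x) above:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# f(x) = sigmoid(w·x + b), read as P(y = 1 | x).
def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Log likelihood of a dataset: sum log p for positive examples and
# log(1 - p) for negative ones; the log turns the product into a sum.
def log_likelihood(w, b, data):
    total = 0.0
    for x, y in data:
        p = predict(w, b, x)
        total += math.log(p) if y == 1 else math.log(1 - p)
    return total
```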

  - Gradient descent

Go through each example and calculate a gradient based on the loss function and the difference between the prediction and the label.

Can also be used even when a closed-form solution exists.

/Stochastic gradient descent/ -- update per example, instead of over the whole dataset.

  - Regularization

Apart from reducing training error, also minimize a regularization term by including the magnitude of the weight vector in the loss function.

Controlled by a lambda.
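The notes don't specify which norm of the weight vector is used; a sketch assuming the common L2 (squared-magnitude) penalty:

```python
# Regularized loss: training loss plus lambda times the squared
# magnitude of the weight vector (L2 penalty, assumed here).
def regularized_loss(training_loss, w, lam):
    return training_loss + lam * sum(wi * wi for wi in w)

# Larger lambda penalizes large weights more heavily.
regularized_loss(1.0, [1.0, 2.0], 0.1)  # 1.0 + 0.1 * (1 + 4) = 1.5
```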

  - Hyperparameters

Not directly optimized by the learning process; generally sweep over different combinations of hyperparameters.

  - Stop training after the validation error increases (early stopping)

  - Cross validation

Keep trying different partitions: expensive for large datasets.
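The partitioning scheme can be sketched as k-fold splits (k and the toy data are illustrative):

```python
# k-fold cross validation: repeatedly hold out one partition for
# validation and train on the rest; expensive because the model
# must be trained k times.
def k_fold_splits(data, k):
    folds = [data[i::k] for i in range(k)]
    for i in range(k):
        valid = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, valid

data = list(range(10))
splits = list(k_fold_splits(data, 5))  # 5 (train, valid) pairs
```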

  - Lua

http://tylerneylon.com/a/learn-lua

  - Additional Resources

https://www.facebook.com/groups/987689104683098/permalink/989801941138481/

Kunal