This week: optimization algorithms to train NNs faster on large datasets.

Mini-batch gradient descent

batch vs. mini-batch GD

Compute J on m examples with vectorization, i.e. stacking the x(i) and the y(i) horizontally (one column per example).
X = [x(1), ..., x(m)]
Y = [y(1), ..., y(m)]
→ still slow or impossible with large m ...
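A minimal sketch of how the m columns of X and Y could be cut into mini-batches. The helper name `mini_batch_slices` and the batch size are illustrative assumptions, not taken from the notes, and shuffling is left out for brevity.

```python
def mini_batch_slices(m, batch_size):
    """Return (start, end) column ranges that partition m examples.

    Hypothetical helper: each pair selects columns start..end-1 of X and Y,
    e.g. X[:, start:end] if X is stored as an (n, m) array.
    """
    if m < 0 or batch_size <= 0:
        raise ValueError("m must be >= 0 and batch_size must be > 0")
    # The last mini-batch is smaller when batch_size does not divide m.
    return [(start, min(start + batch_size, m))
            for start in range(0, m, batch_size)]


slices = mini_batch_slices(m=10, batch_size=4)
for start, end in slices:
    print(f"mini-batch covers examples {start}..{end - 1}")
```

One gradient step is then taken per mini-batch instead of one per pass over all m examples.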


Hyperparameter tuning

Tips for hyperparam-tuning.

Tuning process

Many hyperparams to tune, mark importance by colors (red > yellow > purple):

How to select the set of values to explore?

  • Do NOT use grid search (grid of n * n); this was OK in the pre-DL era.

  • try random values.

reason: difficult to know which hyperparam ...
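A small sketch contrasting the two sampling strategies for two hyperparameters. The function names, the [0, 1] ranges and the seed are assumptions made for illustration. The point it shows is one commonly given argument for random search: with the same budget of n * n trials, a grid tries only n distinct values per hyperparameter, while random sampling tries up to n * n.

```python
import random


def grid_points(values_a, values_b):
    """All combinations of the given values (classic grid search)."""
    return [(a, b) for a in values_a for b in values_b]


def random_points(n_points, range_a, range_b, seed=0):
    """n_points pairs drawn uniformly at random from the two ranges."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return [(rng.uniform(*range_a), rng.uniform(*range_b))
            for _ in range(n_points)]


n = 5
axis = [i / (n - 1) for i in range(n)]  # n evenly spaced values in [0, 1]
grid = grid_points(axis, axis)
rand = random_points(n * n, (0.0, 1.0), (0.0, 1.0))

# Same number of trials, but very different coverage of hyperparameter a.
distinct_a_grid = len({a for a, _ in grid})
distinct_a_rand = len({a for a, _ in rand})
print(len(grid), distinct_a_grid)
print(len(rand), distinct_a_rand)
```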

Setting up your Machine Learning Application

Train / Dev / Test sets

Applied ML: highly iterative process. idea-code-exp loop

splitting data
splitting data in order to speed up the idea-code-exp loop:
*training set / dev (hold-out / cross-validation) set / test set*

split ratio:

  • with 100~10000 examples: 70/30 or 60/20/20
  • with ...
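A minimal sketch of a 60/20/20 split over example indices. The function name `split_indices`, the shuffling step and the fixed seed are assumptions added for illustration; the notes only give the ratios.

```python
import random


def split_indices(m, ratios=(0.6, 0.2, 0.2), seed=0):
    """Split indices 0..m-1 into train / dev / test lists.

    Hypothetical helper: shuffles first (an assumption, not from the notes)
    so that each set is a random sample of the data.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    indices = list(range(m))
    random.Random(seed).shuffle(indices)
    n_train = round(m * ratios[0])
    n_dev = round(m * ratios[1])
    train = indices[:n_train]
    dev = indices[n_train:n_train + n_dev]
    test = indices[n_train + n_dev:]  # whatever remains goes to the test set
    return train, dev, test


train, dev, test = split_indices(1000)
print(len(train), len(dev), len(test))
```

Passing `ratios=(0.7, 0.3, 0.0)` gives the two-way 70/30 variant with an empty test set.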

Deep L-layer neural network

Layer counting:

  • the input layer is not counted as a layer; it is "layer 0"
  • last layer (layer L, output layer) is counted.

notation:

  • layer 0 = input layer
  • L = number of layers
  • n^[l] = size of layer l
  • a^[l] = activation of layer l = g^[l](z^[l]) → a ...
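The counting rule can be sketched in a few lines. The layer sizes below are made-up values, and the weight shapes assume the usual convention z^[l] = W^[l] a^[l-1] + b^[l], which is not spelled out in this excerpt.

```python
def count_layers(layer_sizes):
    """layer_sizes = [n^[0], n^[1], ..., n^[L]]; the input layer is not counted."""
    return len(layer_sizes) - 1


# Hypothetical network: 12288 inputs, three hidden layers, one output unit.
layer_sizes = [12288, 20, 7, 5, 1]
L = count_layers(layer_sizes)

# Assumed convention: W^[l] has shape (n^[l], n^[l-1]) for l = 1..L.
weight_shapes = {l: (layer_sizes[l], layer_sizes[l - 1])
                 for l in range(1, L + 1)}
print(L, weight_shapes)
```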

Neural Networks Overview

new notation:

  • superscript [i] for quantities in layer i. (compared to superscript (i) for ith training example).
  • subscript i for ith unit in a layer

Neural Network Representation

notation:

  • a^[i]: activation at layer i.
  • input layer: x, layer 0.
  • hidden layer
  • output layer: prediction (yhat)
  • don ...

This week: logistic regression.

Binary Classification & notation

ex. cat classifier from an image
image pixels: 64x64x3 ⇒ unroll (flatten) to a feature vector x
dim = 64x64x3 = 12288 := n (input dimension)
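The unrolling step can be sketched with plain nested lists. The function name `flatten_image` and the pixel-by-pixel ordering are assumptions; the notes only fix the resulting dimension.

```python
def flatten_image(image):
    """Unroll a height x width x channels nested list into one flat list.

    The ordering (row by row, pixel by pixel, channel by channel) is an
    assumption; any fixed ordering works as long as it is used consistently.
    """
    return [value for row in image for pixel in row for value in pixel]


# A dummy 64x64 RGB image filled with zeros.
image = [[[0, 0, 0] for _ in range(64)] for _ in range(64)]
x = flatten_image(image)
n = len(x)  # input dimension
print(n)
```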

notation

  • superscript (i) for ith example, e.g. x^(i)
  • superscript [l] for lth layer, e.g. w^[l]
  • m: number of ...

What is a neural network?

Example: housing price prediction.

Each neuron: ReLU function
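A minimal sketch of a single ReLU neuron for the housing example. The weight `w`, the bias `b` and the helper name `neuron_price` are made-up values for illustration, not numbers from the notes.

```python
def relu(z):
    """Rectified linear unit: 0 for negative inputs, identity otherwise."""
    return max(0.0, z)


def neuron_price(size, w=2.0, b=-100.0):
    """One neuron: a linear function of house size passed through ReLU.

    w and b are hypothetical parameters; ReLU keeps the predicted price
    from going below zero.
    """
    return relu(w * size + b)


print(neuron_price(30))   # small house: linear part is negative, clipped to 0
print(neuron_price(100))  # larger house: linear part passes through unchanged
```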

Stacking multiple layers of neurons: hidden layers represent concepts more general than the input features, and the NN finds them automatically.

Supervised Learning with Neural Networks

supervised learning: during training, every input comes with its corresponding output (label).

Different NN types are ...

Save settings and configurations.

Data Persistence

5 different ways of data persistence:

  • onSaveInstanceState(): store the state of views in k-v pairs (Bundles), used when the screen rotates / the app is killed by the system; temporary.
  • SharedPreferences: save k-v pairs to a file, can save primitive types.
  • SQLite database: complicated data types
  • Internal / External Storage: save ...

Android kills background apps !!

onCreate() → Created → onStart() → Visible (can be seen on screen) → onResume() → Active (has focus, can be interacted with)

Active → onPause() → Paused (loses focus; same thing as Visible?) → onStop() → Stopped (disappeared) → onDestroy() → Destroyed (lifecycle ends)

When the screen rotates, the callbacks are invoked in this order:

onPause --> onStop --> onDestroy --> onCreate --> onStart --> onResume

note ...