Course intro

Word Meaning and Representation

denotational semantics

WordNet (via nltk): word meanings, synonym sets, relationships between words, hierarchical structure

problems: misses nuance, misses new meanings, requires human labor, can't compute word similarity

Traditional NLP (until 2012):

  • each word is a discrete symbol — "localist representation"
  • use one-hot vectors for encoding

  • problems with one-hot vectors:
  • large ...
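A toy numpy sketch of the problem (the 3-word vocabulary is made up; real vocabularies have 10^5–10^6 words, so one-hot vectors are huge and sparse):

```python
import numpy as np

# hypothetical tiny vocabulary for illustration
vocab = ["hotel", "motel", "cat"]
word_to_idx = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Localist representation: a single 1 at the word's index."""
    v = np.zeros(len(vocab))
    v[word_to_idx[word]] = 1.0
    return v

# any two distinct one-hot vectors are orthogonal, so "hotel" and "motel"
# look exactly as unrelated as "hotel" and "cat" — no similarity signal
print(one_hot("hotel") @ one_hot("motel"))  # 0.0
print(one_hot("hotel") @ one_hot("cat"))    # 0.0
```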

This week: seq2seq.

I-Various sequence to sequence architectures

Basic Models

e.g. Machine translation
encoder network: many-to-one RNN
decoder network: one-to-many RNN

This architecture also works for image captioning: use ConvNet as encoder

Difference between seq2seq and generating new text with a language model: seq2seq doesn't randomly choose a translation ...
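The distinction can be sketched on a single decoding step (the vocabulary and probabilities below are invented): a language model used for generation *samples* the next word, while seq2seq translation *picks the most likely* word (greedy decoding, or beam search over sequences), so the same input always yields the same output.

```python
import numpy as np

# hypothetical next-word distribution from a decoder softmax
vocab = ["jane", "visits", "africa", "september", "in"]
probs = np.array([0.1, 0.15, 0.2, 0.25, 0.3])

# language model generating novel text: sample a word at random
rng = np.random.default_rng(0)
sampled = vocab[rng.choice(len(vocab), p=probs)]

# seq2seq translation: deterministically pick the most likely word
best = vocab[int(np.argmax(probs))]
print(best)  # "in"
```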





I - Introduction to Word Embeddings

Word representation
So far: representing words with one-hot encoding → relationships between words don't generalize.
⇒ want to learn a featurized representation for each word as a high-dim vector

→ visualize word embeddings in 2-dim space, e.g. via t-SNE
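Featurized vectors support analogy arithmetic (man : woman :: king : ?). A toy numpy sketch with invented 4-dim embeddings — real embeddings are learned and typically 50–300 dims:

```python
import numpy as np

# hand-made embeddings for illustration only; real ones are learned
emb = {
    "man":   np.array([-1.0,  0.01, 0.03, 0.09]),
    "woman": np.array([ 1.0,  0.02, 0.02, 0.01]),
    "king":  np.array([-0.95, 0.93, 0.70, 0.02]),
    "queen": np.array([ 0.97, 0.95, 0.69, 0.01]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# e_king - e_man + e_woman should land close to e_queen
target = emb["king"] - emb["man"] + emb["woman"]
best = max(emb, key=lambda w: cosine(target, emb[w]))
print(best)  # "queen"
```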

Using word embeddings

example: NER
transfer learning: using ...

week1

Created Friday 02 February 2018

Why sequence models

examples of seq data (either input or output):

  • speech recognition
  • music generation
  • sentiment classification
  • DNA seq analysis
  • Machine translation
  • video activity recognition
  • named entity recognition (NER)

→ in this course: learn models applicable to these different settings.

Notation

motivating example: NER (Each ...

This week: two special applications of ConvNets.

I-Face Recognition

What is face recognition

Face verification & face recognition

  • verification: input = image and ID → output = whether the image matches the ID.
  • recognition: database = K persons, input = image → output = ID of the image among the K persons, or "not recognized".

→ the ...
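Both tasks can be sketched on top of face embeddings (the 3-dim vectors and the threshold below are invented; in practice the embeddings come from a trained ConvNet): verification is one distance comparison, recognition is a nearest-neighbor search over the K persons.

```python
import numpy as np

# toy face embeddings for a K = 3 person database (values invented)
database = {
    "alice": np.array([0.9, 0.1, 0.0]),
    "bob":   np.array([0.0, 0.8, 0.2]),
    "carol": np.array([0.1, 0.1, 0.9]),
}

def verify(img_emb, claimed_id, tau=0.5):
    """Verification (1:1): does the image match the claimed ID?"""
    return np.linalg.norm(img_emb - database[claimed_id]) < tau

def recognize(img_emb, tau=0.5):
    """Recognition (1:K): nearest person, or "not recognized"."""
    name = min(database, key=lambda p: np.linalg.norm(img_emb - database[p]))
    if np.linalg.norm(img_emb - database[name]) < tau:
        return name
    return "not recognized"

query = np.array([0.85, 0.15, 0.05])   # hypothetical query embedding
```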

Object Localization

Classification VS. Localization VS. Detection

classification with localization
Apart from the softmax output (for classification), add 4 more outputs for the bounding box: b_x, b_y, b_h, b_w.

Defining target label y in localization
label format:
P_c indicating if there's any object
bounding box: b_x, b_y, b_h, b_w
class proba ...
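A sketch of the target label and its loss, assuming a hypothetical 3-class setting (e.g. c1 = pedestrian, c2 = car, c3 = motorcycle), so y = [P_c, b_x, b_y, b_h, b_w, c1, c2, c3]:

```python
import numpy as np

# image contains a car centered at (0.5, 0.7), height 0.3, width 0.4
y_object = np.array([1, 0.5, 0.7, 0.3, 0.4, 0, 1, 0])

# no object: P_c = 0 and the remaining components are "don't care"
y_empty = np.array([0] + [np.nan] * 7)

def loss(y_hat, y):
    """Sketch: penalize all components if P_c = 1, only P_c otherwise."""
    if y[0] == 1:
        return float(np.sum((y_hat - y) ** 2))
    return float((y_hat[0] - y[0]) ** 2)
```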



I-Case studies

Why look at case studies?

Good way to get intuition about the different components of CNNs: case studies & reading papers.
Outline

  • classic networks:
    • LeNet-5
    • AlexNet
    • VGG
  • ResNet (152-layer NN)
  • Inception

Classic Networks

LeNet-5 (1998)

Goal: recognize hand-written digits.
image → 2 CONV-MEANPOOL layers, all CONVs are valid (no padding) → 2 ...
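The spatial sizes through the valid CONV and POOL layers follow the standard formula ⌊(n + 2p − f)/s⌋ + 1; a minimal sketch of the LeNet-5-style size bookkeeping (filter sizes f=5 for conv, f=2, s=2 for pooling):

```python
def conv_out(n, f, p=0, s=1):
    """Output size of an n×n input through an f×f filter (padding p, stride s)."""
    return (n + 2 * p - f) // s + 1

n = 32                      # input 32×32
n = conv_out(n, 5)          # CONV1 (valid, f=5) → 28×28
n = conv_out(n, 2, s=2)     # POOL1 (f=2, s=2)   → 14×14
n = conv_out(n, 5)          # CONV2 (valid, f=5) → 10×10
n = conv_out(n, 2, s=2)     # POOL2 (f=2, s=2)   → 5×5
print(n)  # 5
```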


I-Error Analysis

Carrying out error analysis

"Error analysis": manually examine the mistakes → get insight of what's next.

"ceiling on performance"

example:
cat classification: some false positives are dog pictures. → should you try to make the ML system better on dogs or not?
→ error analysis:

  • get ~100 false positive examples
  • count ...
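The tally gives the ceiling directly (the error labels and counts below are hypothetical): the fraction of mistakes that are dogs bounds how much fixing the dog problem can help.

```python
from collections import Counter

# hypothetical labels from manually inspecting 100 false positives
mislabeled = ["dog"] * 5 + ["great cat"] * 43 + ["blurry"] * 52

counts = Counter(mislabeled)
total = len(mislabeled)

# ceiling on performance: at most 5% of current errors would go away
# even if the dog problem were completely solved
dog_ceiling = counts["dog"] / total
print(dog_ceiling)  # 0.05
```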

I-Introduction to ML Strategy

Why ML Strategy

A lot of ideas for improving ML performance → need a strategy for choosing among them.

→ how to figure out which ones to pursue and which ones to discard?

Orthogonalization

How to tune hyperparams & what to expect.

TV tuning example: each knob does only one thing ...