1. Regular Expressions

pb: pattern matching.

regular expression

Is a notation to specify a set of strings.
basic operations:

  • concatenation
  • or
  • closure: "0 or more appearances of chars"
  • parentheses


additional operations (added for convinence):

ex. [A-C]+ is equivalent to (A|B|C)(A|B|C)*.

吐槽名句:

2. REs and NFAs ...

今天简单介绍一下优先队列(priority queue, 以下简称PQ)这个数据结构的实现.

PQ又叫"堆"(heap), 但是可能优先队列这个名字更容易记忆它的用途: pq是一种队列, 不过不是先进先出(FIFO), 而是每次出队的元素永远是优先级最高的.

logical structure of PQ

个人感觉"堆"这个名字大概源于PQ的(逻辑上的)形状吧: PQ是一种树(tree), 准确的说, 是一种二叉树(binary tree), 说得再准确一点, 它是一种完全二叉树(complete binary tree): 没错, PQ是一种满足某些条件的完全二叉树.

所谓的"完全二叉树", 要满足:

  • 除了最后一层, 所有层都排满(没有非空节点)
  • 最后一层的所有非空节点都排在左边

一张图可以直观说明, 完全二叉树其实就是长得像这样:

一个完全二叉树能被成为PQ的话, 要满足的条件就是:

对于任何一个节点, 它的优先级都大于左右子节点的优先级.

比如下图 ...

1. Introduction to substring search

"most ingenious algorithm we've seen so far"
pb. having two strings, pattern and text, len(pattern)=M << len(text)=N, try to find pattern in text.

ex. indexOf method of String in java.

2. Brute-Force Substring Search

function signature:
public static int search(String ...

今天总结一下非常有用的快速排序(qsort)算法, 以及由此衍生的一些其他相关算法(Knuth shuffle, quick select, 3-way partition).

快速排序的算法可以用三句话描述:
[Algo]

  • 选择基准项(pivot element, 一般取第一个元素为pivot)
  • 把数组里所有小于pivot的移动到pivot左边, 大于pivot的移动到右边 ⇒ 此时pivot已经位于最终排序时的正确位置
  • 对pivot左右两个数组分别递归进行快速排序

由以上描述可见, qsort是一个递归算法, 我们可以把它的函数声明写成: void qsort(int[] a, int lo, int hi), 表示排序a[lo, hi]之间(闭区间)的所有元素.

quick partition

由上面描述可以见, qsort最关键的是第二步: 把数组元素以pivot分为两部分. 这个操作就是quick partition.

函数声明为: int partition(int[] a, int ...

More efficient version of symbol-table where the keys are strings.

1. R-way Tries

Two implementations of symbol tables that we've seen:

when keys are strings:
(L=string length, N=number of strings, R=radix)

for string keys ⇒ do better by avoiding examing the entire key.

goal: faster than hashtable ...

This week: string sort.

1. Strings in Java

char data type

  • char in C

8-bit integer, 256 characters, 7-bit ASCII code

  • char in Java

16-bit Unicode

String data type

String: immutable sequence of characters
operations: lengthe, ith char, substring, concatenate

implementation: using a char[], maintain a length and an offset ...

今天介绍一个论文写作的神器: TeXmacs !

0. Why TeXmacs?

一说到"论文写作神器"一般大家首先想到的就是LaTeX, 确实LaTeX写出来的数学公式和文章的排版非常漂亮. 但是作为一个几年来用过LaTeX写过几次报告的小白用户, 说句实话我从来都没有喜欢上过LaTeX. 根本的原因大概是: LaTeX的语法是一种标记语言(markup language), 本质上是给机器看而不是给人看的—就像html源代码是为了给浏览器看而不是直接给人看的.

0.0 LaTeX强迫症自测

矩阵A的转置, 你用tex会怎么写?

⇒ 如果你不能容忍直接写成$A^T$, 而一定要写成类似$\textbf{A}^\intercal$的话... 请直接忽略本文 & 继续用LaTeX, 好走不送......
(另: 强迫症可以去这里看到底怎么打转置: http://tex.stackexchange.com/questions/30619/what-is-the-best-symbol-for-vector-matrix-transpose)

如果你认为这样的细节不重要, 好好描述数学问题本身才最重要的话, 请继续阅读.

0.1 TeXmacs是什么

简言之, TeXmacs是一个所见即所得的编辑器 ...

1. Introduction to Maxflow

Min-cut pb

  • input: edge-weighted digraph G, each edge e has weight("capacity") c[e]>=0, a source vertex s, a target vertex t.
  • def. an st-cut (A,B) is a partition of vertices into 2 disjoint sets A and B, with s in set A and ...

1. Shortest Paths APIs

context: directe, weighted graphs.

shortest path variants

in terms of vertices:

  • source-sink: form one vertex to another
  • single source: from one vertex to all others (considered in this lecture)
  • all pairs

constraints on edge weights:

  • nonnegative weights
  • arbitary weights
  • eculidean

cycles:

  • no directed cycles
  • no negative ...

1. Introduction to MSTs

Given: undirected connecte graph G with positive edge weights.
def. Spanning tree T
is a subgraph of G, that is both tree (connected, acyclic) and spanning(all vertices are included).

⇒ Goal: find a spanning tree with minimum weight sum.

2. Greedy Algorithm

assumptions for simplification:

  • edge ...