Intro

ML lang family

statically strongly typed languages

  • fisrt-class functions
  • type inference
  • pattern matching

highlights of ocaml

  • safty: static typing, pattern matching
  • efficiency: high performance
  • expressiveness: functional+type inference+polymorphism

  • package manager: opam

  • debugger
  • profiler
  • REPL: "toplevel"
  • bytecode compiler: ccamlc
  • native compiler: ocamlopt

programming environment fully online: https://try.ocamlpro ...

Recap: Functions and Pattern Matching

case classes

ex: json json objects can be seq, num, str, bool,...

⇒ represented as abstract class and case classes.

pattern matching

→ question: what is the type of the {case(key, value)=>"..."} clause?

it is (JBinding => String) type, which is a shorthand for Function1[JBinding, String ...

6.1 - Other Collections

so far: only seen List. → more (immutable) collections.

vector

List: is linear -- access to head is faster than middle or end element. Vector: better rand access performance.

represented as very shallow trees(32-split at each node)

Vector support similar operations as List (head, tail,map, fold ...

5.1 - More Functions on Lists

already known methods:

xs.head
xs.tail

sublist and ele access:

  • xs.length
  • xs.last
  • xs.init: all elementh except last element
  • xs.take(n): sublist of first n elements
  • xs.drop(n): the rest of list after taking first n elements
  • xs(n ...

R里面的统计函数有很多, 这里只用线性模型lm以及(一维)非参估计最常用的三个smoother: Nadaraya-Watson kernel(NW, ksmooth), Local Polynomial(LP, loess), Smoothing Spline(SS, smooth.spline). 用这三个smoother作为例子, 介绍R里面统计回归的一些用法.

数据的形式是: 

目标是估计函数m(). 例子使用R自带的cars数据集, 它包含两列: 汽车速度speed和刹车距离dist.

> data(cars)
> summary(cars)
     speed           dist       
 Min.   : 4.0   Min.   :  2.00  
 1st Qu.:12.0   1st Qu.: 26.00  
 Median :15 ...

R关于绘图应该可以写很多, 不过这里只列举在compstat这门课里最经常用的几个函数. 关于R的绘图, 详细了解可以运行 demo(graphics)或者 example("plot").

R里面的绘图命令分为两类: 一类是"high-level"的"创建新图片"命令, 运行以后会新画一个图; 另一类则是"low-level"的命令, 不会创建新图片, 而只会在当前图片中修改添加(例如添加线条, 添加点等). 下面分别简单介绍, 最后再介绍其他一些绘图配置的命令.

configurations/parameters

介绍一下常用的参数意义, 以及画图的配置. 详细文档见?par. 这些参数可以放在绘图命令中.

  • par(mfrow=c(1,2)): 这个命令是设定画图的布局, 把放置图片的区域分为一行两列, 第一个plot的图片在左边, 第二个在右边.
  • col: 设定画图(线段/点)的颜色, 可以用数字(col=1, 2 ...