ML Expressions and Variable Bindings
An ML program is a sequence of bindings. Each binding gets type-checked and then (assuming it type-checks) evaluated. What type (if any) a binding has depends on a static environment, which is roughly the types of the preceding bindings in the file. How a binding is evaluated depends on a dynamic environment, which is roughly the values of the preceding bindings in the file. When we just say environment, we usually mean dynamic environment. Sometimes context is used as a synonym for static environment.
There are several kinds of bindings, but for now let’s consider only a variable binding, which in ML has this syntax :
Here, val is a keyword, x can be any variable, and e can be any expression. We now know a variable binding’s syntax (how to write it), but we still need to know its semantics (how it type-checks and evaluates). Mostly this depends on the expression e. To type-check a variable binding, we use the current static environment (the types of preceding bindings) to type-check e (which will depend on what kind of expression it is) and produce a new static environment
that is the current static environment except with x having type t where t is the type of e. Evaluation is analogous: To evaluate a variable binding, we use the “current dynamic environment” (the values of preceding bindings) to evaluate e (which will depend on what kind of expression it is) and produce a new dynamic environment
that is the current environment except with x having the value v where v is the result of evaluating e.
Whenever you learn a new construct in a programming language, you should ask these three questions:
- What is the syntax?
- What are the type-checking rules?
- What are the evaluation rules?
for example:
Addition:
- Syntax: e1+e2 where e1 and e2 are expressions
- Type-checking: type int but only if e1 and e2 have type int in the same static environment, else does not type-check
- Evaluation: evaluate e1 to v1 and e2 to v2 in the same dynamic environment and then produce the sum of v1 and v2
Function Bindings
1 | fun pow (x:int, y:int) = (* correct only for y >= 0 *) |
Syntax:
Type-checking:
To type-check a function binding, we type-check the body e in a static environment that (in addition to alltheearlierbindings)maps x1 to t1,..., xn to tn and x0 to t1 * ... * tn -> t. Because x0 is in the environment, we can make recursive function calls, i.e., a function definition can use itself. The syntax of a function type is argument types -> result type
where the argument types are separated by * (which just happens to be the same character used in expressions for multiplication). For the function binding to type-check, the body e must have the type t, i.e., the result type of x0. That makes sense given the evaluation rules below because the result of a function call is the result of evaluating e.
But what, exactly, is t - we never wrote it down? It can be any type, and it is up to the type-checker (part of the language implementation) to figure out what t should be such that using it for the result type of x0 makes, everything work out.
Evaluation:
The evaluation rule for a function binding is trivial: A function is a value — we simply add x0 to the environment as a function that can be called later. As expected for recursion, x0 is in the dynamic environment in the function body and for subsequent bindings (but not, unlike in say Java, for preceding bindings, so the order you define functions is very important).
Pairs and Other Tuples
Tuples: fixed number of pieces that may have different types
Build:
- Syntax:
(e1,e2)
- Evaluation: Evaluate e1 to v1 and e2 to v2; result is (v1,v2)
- A pair of values is a value
- Type-checking: If e1 has type ta and e2 has type tb, then the pair expression has type ta * tb
- A new kind of type
Access:
- Syntax:
#1 e
and#2 e
- Evaluation: Evaluate e to a pair of values and return first or second piece
- Example: If e is a variable x, then look up x in environment
- Type-checking: If e has type ta * tb, then #1 e has type ta and #2 e has type tb
Lists
Lists can have any number of elements,but all list elements have the same type
Build
- The empty list is a value: []
- In general, a list of values is a value; elements separated by commas:
[v1,v2,...,vn] - If e1 evaluates to v and e2 evaluates to a list [v1,...,vn], then e1::e2 evaluates to [v,...,vn]
e1::e2
Access
Until we learn pattern-matching, we will use three standard-library functions - null e evaluates to true if and only if e evaluates to [] - If e evaluates to [v1,v2,...,vn] then hd
e evaluates to v1 (raise exception if e evaluates to []) - If e evaluates to [v1,v2,...,vn] then tl
e evaluates to [v2,...,vn] (raise exception if e evaluates to [])
Let-expressions
Let-expressions are an absolutely crucial feature that allows for local variables in a very simple, general, and flexible way. Let-expressions are crucial for style and for efficiency. A let-expression lets us have local variables. In fact, it lets us have local bindings of any sort, including function bindings. Because it is a kind of expression, it can appear anywhere an expression can.
- Syntax: let b1 b2... bn in e end
- Each bi is any binding and e is any expression
- Type-checking: Type-check each bi and e in a static environment that includes the previous bindings. Type of whole let-expression is the type of e.
- Evaluation: Evaluate each bi and e in a dynamic environment that includes the previous bindings. Result of whole let-expression is result of evaluating e.
1 | fun good_max (xs : int list) = if null xs |
Options
The previous example does not properly handle the empty list — it returns 0. This is bad style because 0 is really not the maximum value of 0 numbers. There is no good answer, but we should deal with this case reasonably. One possibility is to raise an exception; you can learn about ML exceptions on your own if you are interested before we discuss them later in the course. Instead, let’s change the return type to either return the maximum number or indicate the input list was empty so there is no maximum. Given the constructs we have, we could code this up by return an int list, using [] if the input was the empty list and a list with one integer (the maximum) if the input list was not empty.
The ML library has options
which are a precise description: an option value has either 0 or 1 thing: NONE
is an option value carrying nothing whereas SOME e evaluates e to a value v and becomes the option carrying the one value v. The type of NONE is 'a option and the type of SOME e is t option if e has type t.
Building:
NONE
has type 'a option (much like [] has type 'a list)SOME
e has type t option if e has type t (much like e::[])
Access:
isSome
has type 'a option -> boolvalOf
has type 'a option -> 'a (exception if given NONE)
1 | fun better_max (xs : int list) = if null xs |
Boolean operations
- e1
andalso
e2 - e1
orelse
e2 not
e1
Comparisons
= <> > < >= <=
< >= <= can be used with real, but not 1 int and 1 real
- = <> can be used with any equality type but not with real
Lack of Mutation and Benefits Thereof
In ML, there is no way to change the contents of a binding, a tuple, or a list. If x maps to some value like the list of pairs [(3,4),(7,9)] in some environment, then x will forever map to that list in that environment. There is no assignment statement that changes x to map to a different list. (You can introduce a new binding that shadows x, but that will not affect any code that looks up the original
x in an environment.)
For a final example, the following Java is the key idea behind an actual security hole in an important (and subsequently fixed) Java library. Suppose we are maintaining permissions for who is allowed to access something like a file on the disk. It is fine to let everyone see who has permission, but clearly only those that do have permission can actually use the resource. Consider this wrong code (some parts omitted if not relevant):
1 | class ProtectedResource { |
Can you find the problem? Here it is:
getAllowedUsers
returns an alias to the allowedUsers array, so any user can gain access by doing getAllowedUsers()[0] = currentUser().
The Pieces of a Programming Language
Now that we have learned enough ML to write some simple functions and programs with it, we can list the essential “pieces” necessary for defining and learning any programming language:
- Syntax: How do you write the various parts of the language?
- Semantics: What do the various language features mean? For example, how are expressions evaluated?
- Idioms: What are the common approaches to using the language features to express computations?
- Libraries: What has already been written for you? How do you do things you could not do without library support (like access files)?
- Tools: What is available for manipulating programs in the language (compilers, read-eval-print loops, debuggers, ...)
Reference
- Course note from Programming Language