09/06 Notes¶
Dropbox link to in-class examples (guarenteed to work on arctic)
Down-to Operator¶
int x = 10 while (x –> 10){ do }
What is Lexing?¶
A token is a category of meaningful substrings in the source language
- In english this would be types of words or punctuation i.e noun, verb, adj, end-mark.
- In a program, this could be identifiers, floats, math symbol, keyword, etc...
A Lexeme is a substring that represents an instance of a token.
- EX
- Token: Static int
- Pattern: 0 - 9
- Lexical analyzer
- Correctly identifies all tokens within a string
- Discard usless tokens (whitespace and comments)
- Returns each remaining token with its lexeme and the line number it was on
- Patterns
- Condition: if
- Type: int or float or char
- Command: print or return or define
- Static int: 0 - 9 +
- id: [a-zA-Z][a-zA-Z0-9]
- Math: +-/ etc
- assign =
- Compare
- Whitespace
- Endline
- Unknown: Weird things like the copyright symbol
Lookaheads¶
A lookahead will be important for lexical analysis
Tokens are read left to right, not obvious at first what a token is
Using Regular Expressions¶
We need to divide ip an input stream into lexemes
- Design a set of reqular expressions
- Plug them into Lex (Harder without this)
Project 1¶
Correctness is job 1 and 2 and 3
Implement them nice so it’s easier to build on them
- Keep it simple
- Design things you can test
- Don’t optimize prematurely
- It’s easier to modify a working system than to get a system working