A deterministic finite state automaton, $M$, consists of:
A finite set, $Q$, of states
A finite set $\Sigma$ called the alphabet
A total function $\delta: Q \times \Sigma \to Q$ called the transition function
A distinguished state $q_0 \in Q$ called the initial state
A subset $F \subseteq Q$ called the final or accepting states.
Given a word $w = w_0 w_1 \dots w_{n-1} \in \Sigma^*$ the computation carried out by $M$ on input $w$ is a sequence of states $q_0, q_1, q_2, \dots, q_n$ defined as follows:
\[
q_1 = \delta(q_0, w_0), \: q_2 = \delta(q_1, w_1), \: \dots ,\: q_n = \delta(q_{n-1}, w_{n-1})
\]
We say that $M$ accepts or recognises $w$ if $q_n \in F$ and otherwise it rejects $w$.
The language of $M$, $L(M)$ is just the set of strings in $\Sigma^*$ that $M$ accepts.
Lecture 5
Non-deterministic automata
NFA
Relax the definition of the transition function to become a transition relation.
\[
\delta \subseteq Q \times \Sigma \cup \{\lambda\} \times Q
\]
$\{ab\}$ in English alphabet
Examples
Design an automaton that recognises the following languages in alphabet $\{a,b\}$
$L_1 = \{w | w = xaaybbbz, \mbox{ where } x,y,z \in\{a,b\}^*\}$
$L_2 = \{w | w \mbox{ contains $aa$ or $bbb$}\}$
$L_3 = \{w | w \mbox{ contains $aa$ and $bbb$}\}$
Lecture 6
NFA = DFA, pumping lemma
Theorem 3
The classes of languages recognised by DFA and NFA coincide.
$L = \{w \in\{a, b\}^* \mid w \mbox{ contains $abba$ or $aa$}\}$
Theorem 3
The classes of languages recognised by DFA and NFA coincide.
Proof: Go to the lecture (or read in the notes).
Lecture 7
Pumping lemma
$L = \{a^nb^n \mid n \in \mathbb{N}\}$
Pumping lemma
Let $L$ be an automatic language.
Then there exists a positive integer $k$ such that if $z \in L$, $|z| \geq k$ then for some $u$, $v$ and $w$:
$
\begin{eqnarray*}
z &=& u \cdot v \cdot w \\
|u| + |v| &\leq& k \\
|v| &>& 0 \\
uv^iw &\in& L \quad \mbox{for all $i \geq 0$}.
\end{eqnarray*}
$
Proof: Haven't we just proven it?
$L = \{w \in \{a, b\} \mid \mbox{$w$ contains an odd number of $a$'s}$
$\mbox{and an even number of $b$'s}\}$
Lecture 8
Pushdown automata and context free grammars
A pushdown automaton, $M$, consists of:
A finite set $Q$ of states
A finite set $\Sigma$ called the input alphabet (lower case letters)
A finite set $\Gamma$ called the stack alphabet (upper case letters)
A relation $\delta \subseteq Q \times \Sigma \cup \{\lambda\} \times \color{blue}{\Gamma \cup \{\lambda\}} \times Q \times \color{blue}{\Gamma \cup \{\lambda\}}$
called the transition relation
A distinguished state $q_0 \in Q$ called the initial state
A subset $F \subseteq Q$ called the final or accepting states.
The stack should be empty to accept a string!
A context-free grammar, $G$, consists of:
A finite set $V$ of nonterminals (or variables)
(upper case letters)
A finite set $\Sigma$ called the alphabet (or terminals)
(lower case letters), $\Sigma \cap V = \varnothing$
A finite set $P$ of production rules,
which are function from $V$ to $(V \cup \Sigma)^*$
A distinguished $S \in V$ called the start symbol
Examples
$S \to aS, S \to bS, S \to \lambda$
$S \to aS, S \to bT, T \to bT, T \to \lambda$
$\{a^nb^n \mid n \in \mathbb{N}\}$
Danger zone!
We recommend that you don't scroll past this slide.
Genome phasing
Schwarz, Roland F., Anne Trinh, Botond Sipos, James D. Brenton, Nick Goldman, and Florian Markowetz. 2014. “Phylogenetic Quantification of Intra-Tumour Heterogeneity.” PLoS Computational Biology 10 (4): e1003535.
[Corrected]
Formalising biology
Let $G$ be the set of binary (quaternary in reality) strings of length $n$.
Elements of $G$ are called genotypes.
A function $w: G \to \mathbb R^+$ is called a fitness landscape.
For $g \in G$, $w(g)$ is called fitness of the genotype $g$.
Scalability: synthetic lethal pairs
How many pairs of genes are there in the human genome?
Gene-gene interactions
Epistasis
is defined as the deviation from the additive expectation of allelic effects:
$$u_{11} = w_{00} + w_{11} - (w_{01} + w_{10})$$
Theorem. A partial order (e.g. fitness graph) implies epistasis if and only if all linear extensions compatible with the partial order do.
Mutation graph
Mutation graph
Mutation graph
Mutation graph
Mutation graph
Lienkaemper, Lamberti, Drain, Beerenwinkel, and Gavryushkin. The geometry of partial fitness orders and an efficient method for detecting genetic interactions. Journal of Mathematical Biology, 2018.
4. Online algorithms to improve computational performance
The traditional way is to make the algorithm more efficient
When the same algorithm has to be re-run routinely, we can economise by making the algorithm slower and doing more!
This approach is known as online
Online algorithms
Online algorithms
Online algorithms
Online algorithms
Online algorithms
Online algorithms
Online algorithms
Online algorithms
Online algorithms
Homework:
Find an efficient online algorithm to detect genetic interactions from