Fitness landscapes and incomplete data
Alex Gavryushkin
12 February 2018
Throughout, we consider $n$ biallelic loci, for different $n$.
That is, the set of genotypes is $\mathcal G = \{0,1\}^{n}$.
A fitness landscape is a function $w:\mathcal G \to \mathbb R^+$.
For $g \in \mathcal G$, $w(g)$ is called the fitness of genotype $g$ and denoted $w_g$.
Epistasis
is defined as the deviation from the additive expectation of allelic effects:
$$u_{11} = w_{00} + w_{11}  (w_{01} + w_{10})$$
Problem: What if no (credible) fitness measurements are available?
Image: Wikipedia
Mutation fitness graph
Ogbunugafor et al. Malar. J. 2016
Rank orders. The simplest case.
$\small u_{11} = w_{00} + w_{11}  (w_{01} + w_{10})$
Characterization of epistatic rank orders
Theorem 1. Consider a biallelic $n$locus system.
The number of rank orders which imply $n$way epistasis is:
\[
\frac{(2^n)! \times 2}{2^{n1}+1}
\]
Corollary. The fraction of rank orders that imply $n$way epistasis
among all rank orders is:
\[
\frac{2}{2^{n1}+1}
\]
Connection between rank orders and mutation graphs
Observation: A partial fitness order implies epistasis if and only if all its total extensions do.
Let $f$ be a linear form with integer coefficients.
Assume that the sum of the coefficients of $f$ is zero.
A partial fitness order of genotypes $\prec$ implies positive $f$interaction if $f(w) > 0$ whenever $w$ satisfy $\prec$.
Theorem: A partial order $P = (G, \prec)$ implies positive $f$interaction if and only if there exists a partition of the set of all genotypes $G$ into pairs $(p_i, n_j)$ such that $p_i \succ n_j$ for all $i, j$, where $p_i$ have positive coefficients in $f$ and $n_j$ negative.
Proof: Exercise.
Corollary: It is polynomial in the number of genotypes ($n^{5/2}$) to check whether a partial order implies $f$interaction.
Applications

HIV1

Antibiotic resistance

Gut microbiome

Synthetic lethality

Knockdown cell lines
Methodologically, this allows us to advise further measurements (experiments) for incomplete data sets, thus reducing the number of potential experiments significantly.
Example: antibiotic resistance
Mira et al. PLOS ONE, 2015
Example: antibiotic resistance
Example: antibiotic resistance
Example: antibiotic resistance
Mira et al. PLOS ONE, 2015
$u_{111} = w_{000} + w_{011} + w_{101} + w_{110}  (w_{001} + w_{010} + w_{100} + w_{111})$
Understanding threeway interactions
Total threeway interaction?
$\small u_{111} = w_{000} + w_{011} + w_{101} + w_{110}  (w_{001} + w_{010} + w_{100} + w_{111})$
Marginal epistasis?
$\small u_{\color{blue}{0}11} = w_{\color{blue}{0}00} + w_{\color{blue}{1}00} + w_{\color{blue}{0}11} + w_{\color{blue}{1}11} − (w_{\color{blue}{0}01} + w_{\color{blue}{1}01}) − (w_{\color{blue}{0}10} + w_{\color{blue}{1}10})$
Conditional epistasis?
$\small e = w_{\color{blue}{0}00} − w_{\color{blue}{0}01} − w_{\color{blue}{0}10} + w_{\color{blue}{0}11}$
Total mess!
Algebraic Geometry sorts out the mess!
$e = \frac12(u_{011} + u_{111})$
In general, the four interaction coordinates
$$u_{011}, u_{101}, u_{110}, u_{111}$$
allow to describe all possible kinds of interaction!
There are 20 types of threeway interaction and they are the circuits of the threecube.
Yep, we've got the list!
$$
\scriptsize
\begin{align*}
a&= w_{000}w_{010}w_{100}+w_{110} & m&=w_{001}+w_{010}+w_{100}w_{111}2w_{000}\\
b&=w_{001}w_{011}w_{101}+w_{111} & n&=w_{011}+w_{101}+w_{110}w_{000}2w_{111}\\
c&=w_{000}w_{001}w_{100}+w_{101} & o&=w_{010}+w_{100}+w_{111}w_{001}2w_{110}\\
d&=w_{010}w_{011}w_{110}+w_{111} & p&=w_{000}+w_{011}+w_{101}w_{110}2w_{001}\\
e&=w_{000}w_{001}w_{010}+w_{011} & q&=w_{001}+w_{100}+ w_{111}w_{010}2w_{101}\\
f&=w_{100}w_{101}w_{110}+w_{111} & r&=w_{000}+w_{011}+ w_{110}w_{101}2w_{010}\\
g&=w_{000}w_{011}w_{100}+w_{111} & s&=w_{000}+w_{101}+ w_{110}w_{011}2w_{100}\\
h&=w_{001}w_{010}w_{101}+w_{110} & t&=w_{001}+w_{010}+w_{111}w_{100}2w_{011}\\
i&=w_{000}w_{010}w_{101}+w_{111}\\
j&=w_{001}w_{011}w_{100}+w_{110}\\
k&=w_{000}w_{001}w_{110}+w_{111}\\
l&=w_{010}w_{011}w_{100}+w_{101}\\
\end{align*}
$$
$$
\scriptsize
\begin{align*}
a&= w_{000}w_{010}w_{100}+w_{110} & m&=w_{001}+w_{010}+w_{100}w_{111}2w_{000}\\
b&=w_{001}w_{011}w_{101}+w_{111} & n&=w_{011}+w_{101}+w_{110}w_{000}2w_{111}\\
c&=w_{000}w_{001}w_{100}+w_{101} & o&=w_{010}+w_{100}+w_{111}w_{001}2w_{110}\\
d&=w_{010}w_{011}w_{110}+w_{111} & p&=w_{000}+w_{011}+w_{101}w_{110}2w_{001}\\
e&=w_{000}w_{001}w_{010}+w_{011} & q&=w_{001}+w_{100}+ w_{111}w_{010}2w_{101}\\
f&=w_{100}w_{101}w_{110}+w_{111} & r&=w_{000}+w_{011}+ w_{110}w_{101}2w_{010}\\
\color{blue}{g}&\hskip{2pt}\color{blue}{=w_{000}w_{011}w_{100}+w_{111}} & s&=w_{000}+w_{101}+ w_{110}w_{011}2w_{100}\\
h&=w_{001}w_{010}w_{101}+w_{110} & t&=w_{001}+w_{010}+w_{111}w_{100}2w_{011}\\
i&=w_{000}w_{010}w_{101}+w_{111}\\
j&=w_{001}w_{011}w_{100}+w_{110}\\
k&=w_{000}w_{001}w_{110}+w_{111}\\
l&=w_{010}w_{011}w_{100}+w_{101}\\
\end{align*}
$$
Acknowledgements
 You
 Niko Beerenwinkel, ETH Zürich
 Bernd Sturmfels, Max Planck Institute Leipzig
 Kristina Crona, American University
 Devin Greene, American University
 Lisa Lamberti, ETH Zürich
 Caitlin Lienkaemper, Penn State
and stay tuned