A simplified model of chromatin dynamics drives differentiation process in Boolean models of GRN

Cellular types of multicellular organisms are the stable results of complex intertwined processes that occur in biological cells. Among the many others, chromatin dynamics significantly contributes—by modulating access to genes—to differential gene expression, and ultimately to determine cell types. Here, we propose a dynamical model of differentiation based on a simplified bio-inspired methylation mechanism in Boolean models of GRNs. Preliminary results show that, as the number of methylated nodes increases, there is a decrease in attractor number and networks tend to assume dynamical behaviours typical of ordered ensembles. At the same time, results show that this mechanism does not affect the possibility of generating path dependent differentiation: cell types determined by the specific sequence of methylated genes.


Introduction
Eukaryotic cells are characterised by the organisation of DNA in a condensed structure, called chromatin. Chromatin is composed of nucleosomes, structures of DNA wrapped around octamers of histone proteins. Histone methylation and histone acetylation change-by adding methyl and acetyl groups to histones-the degree of compactness of the chromatin, in this way facilitating or obstructing gene expression. These processes are defined as epigenetic mechanisms 1 .
Although methylation effects depend on the particular positions on histones on which it acts, it most often leads to tightly packed regions of chromatin called heterochromatin (Gilbert and Barresi, 2016;Perino and Veenstra, 2016;Schuettengruber and Cavalli, 2009). These regions are not accessible neither by transcription factors nor by RNA polymerases and so the expression of genes belonging to these DNA areas is inhibited. Biological cells exploit differential methylation to modulate their gene expres-sion during development and differentiation. It is important to note that, along lineages, the attained configurations of DNA methylation are inherited and progressively extended as cells become more specialised (Kim and Costello, 2017). Therefore, methylation contributes to maintain and stabilise the attained gene expressions that ultimately characterise the identities of the various cell states.
It is worth mentioning that methylation is tightly regulated by complex interactions, and that epigenetic dysregulation is very common in a lot of disorders, from cognitive, neurological and chronic diseases to cancer. Given the complexity of these mechanisms, the adoption of models can support the analysis of the role of epigenetics in pathophysiological processes. Several mathematical approaches have been proposed with the aim of disentangling the effects of epigenetics in development, differentiation and also in the establishment of aberrant cellular states-like cancer.
Noteworthy is the work (Miyamoto et al., 2015) in which the authors investigate the mechanisms of differentiation and cellular reprogramming introducing a continuous model of a minimal gene regulatory network (GRN) able to give rise to both pluripotent and differentiated states. In their modelling approach, an epigenetic process-introduced as a gene expression fixation-turns out to be important to increase the stability of the attained differentiated states and to reproduce with more accuracy the phenomenology of the reprogramming process.
In the works (Turner et al., 2017(Turner et al., , 2013, the authors have ascertained that the addition of an epigenetic layer-in the form of Boolean switches that dynamically change the actual network topology-within recurrent neural networks lead to better performance in the achievement of certain target tasks, as compared to models without it. To the best of our knowledge, the specific role of epigenetics in the dynamics of discrete models of GRN has been addressed only by (Bull, 2014). The author does not focus on the differentiation process as such, but instead, he evaluates the potential of Random Boolean networks (RBNs) with epigenetic control-which is interpreted as additional nodes that change the regular transcription dynamics-in NK land-scapes (Kauffman and Levin, 1987).
Of fundamental relevance for this discussion are the works of Kauffman and Huang (Kauffman, 1969;Huang et al., 2009Huang et al., , 2005Huang and Ingber, 2000) that have laid the foundations for a mathematical description of GRN dynamics in terms of dynamical systems, with attractors that model cell types.
Recently, an abstract mathematical model of differentiation based on Noisy RBNs has been proposed Villani and Serra, 2013;Serra et al., 2010). This model has proven to be able to describe the most relevant properties of the differentiation process, such as different degrees of differentiation, stochastic and deterministic differentiation, and cell reprogramming. The model focuses on the dynamics of a single cell represented as an autonomous system 2 subject to intracellular noise. Cell types are defined as the portions of the space of states in which the dynamics remains trapped, under a specific noise level. Changes in the intracellular level of noise drive the differentiation process: high noise levels correspond to pluripotent cells while the low levels to fully differentiated ones. Experimental analysis on RBNs subject to stochastic dynamics (Braccini et al., 2018) and the successful evolution of networks able to attain not trivial differentiation dynamics Benedettini et al., 2014) proved the expressiveness and plausibility of this model.
Differentiation represents a major challenge for every model of gene regulatory networks that, like RBNs, is based on deterministic dynamical systems which asymptotically reach stable attractor states, to be identified with the different cell types. Indeed, under the action of the deterministic dynamics, a stable attractor does not change any longer so it must represent a fully differentiated cell type. Therefore cells which are found at intermediate differentiation levels (e.g. pluripotent cells) should be associated to transientsan unsatisfactory proposal, since it is known that there exist long-lived pluripotent cells, which should rather be represented by metastable states.
The way out of this conundrum requires a mechanism to escape from the deterministic attractors. While this mechanism is provided in our previously described model by means of intracellular noise, in this work we want to explore an alternative-complementary-possibility, i.e. that it is due to an external signal. In this way, the system is no longer autonomous, and escaping from the attractors of the corresponding deterministic system becomes possible. External signals are indeed known to affect embryo evolution, and the simplest way to describe their effect in a GRN model is that of clamping the values of some network nodes to fixed values. This paper is organised as follows. The next section describes the proposed model with its theoretical implications in BN models and its contribution to the understanding of the biological process of methylation. The subsequent section details the experimental setting of our in silico experiments and illustrates results that show the properties of the model. Finally, we conclude with an outlook to future work.

Model
Boolean networks are discrete-time and discrete-state dynamical models of GRN introduced by Kauffman (Kauffman, 1969, 1993. In their original formulation BNs can be represented by a directed graph with n nodes each having associated a Boolean variable x i , i = 1, . . . , n and a Boolean function f i = (x i1 , . . . , x i k ) which depends on k other nodes, avoiding self loops. Despite their simplifications they proved to be suitable systems to represent the dynamics of biological GRNs to many level of abstractions (Graudenzi et al., 2011;Serra et al., 2006Serra et al., , 2007Shmulevich et al., 2005).
As previously discussed, methylation-even if it is not the only phenomenon in place-has a non negligible impact on cell fate determination and maintenance. Here we are especially interested in its abstract role in simplified models of GRNs, namely in Boolean networks. Indeed, borrowing the idea of a progressive methylation state of the chromatin along the development and differentiation of biological cells, we propose an analogous mechanism in BN models. Similarly to what happens in the heterochromatin condition, the expression of some BN nodes is blocked to value 0; these nodes will be referred to as frozen in the following.
Theoretically, the formulation of this peculiar methylation mechanism implies a sort of simplification of the network, as it reduces the nodes that are actually subject to a dynamic update, and so restricting the number of combinations that the system itself can assume. Therefore, it is not a priori clear whether this mechanism can accommodate path dependent differentiation: cell types determined by the specific sequences of methylated genes.
This model relies on the hypothesis-to be verified in RBNs-that the progression of frozen nodes imposes the arrow of time of the differentiation process and, at the same time, different patterns of methylated nodes give rise to distinct lineages, and so cell types. Indeed, biological differentiation is characterised by the presence of different stages of differentiation and by progressively specialisation of cells.
A schematic representation of this Boolean methylationinspired mechanism is depicted in Figure 1. In this work we undertake an experimental analysis of the main dynamical properties of RBNs subject to this process of progressive methylation. For this mechanism to be useful in a plausible BN differentiation model, it should (i) progressively stabilise the network and (ii) give origin to different lineages depending on the nodes chosen to be frozen. If these prop- Figure 1: Schematic representation of the methylation mechanism introduced. Grey nodes represent frozen nodes (nodes constrained to assume the value 0, regardless of the actual values of its inputs). The specific patterns of frozen nodes in the ennuples that represent the state of the BN over time have no other meaning than to exemplify the methylation process introduced. erties are attained in RBN ensembles, then we could suppose that evolution may act to tune the dynamics of the network so as to achieve a specific differentiation lineage tree. The choice of setting to 0 the nodes to be frozen is motivated by the inhibition effect of most methylation mechanisms and introduces an asymmetry in the RBNs model, as it progressively bias the Boolean functions to 0. However, this is not a limitation of the model, which can be extended to take into account also actions in which nodes are clamped to 1 and so provide even more variability in the lineages.

Results
The random Boolean networks used in these experiments are subject to a synchronous and deterministic dynamics, therefore fixed points and cycles are possible asymptotic states. For all the experiments, statistics are taken across 100 RBN with n = 500 and k = 2. We focused only on networks with k = 2 because the size of the network, combined with the other chosen parameters, would have made the experimental analysis computationally prohibitive. The Boolean functions are defined on the basis of the bias parameter p, which defines the probability to assign value 1 in a row of a node truth table. The variation of the parameter p makes it possible to determine the dynamical regime of the system (ordered, critical or chaotic) (Bastolla and Parisi, 1997): so, the limitation due to the choice of a specific connectivity is thus eliminated. Since we want to analyse the emerging generic properties induced only by the proposed methylation mechanism in ensembles of RBNs, we used an exact bias. Exact bias is computed by generating each time a random permutation of a vector of Boolean values with a length equal to the number of nodes in the network and a fraction p of 1's, and by using partitions of this vector to define the output values of Boolean functions. In this way, we remove from the statistics any possible contribution produced by a variance in network dynamic regime. We generated RBNs with p = 0.1, i.e. in the ordered regime, and p = 0.5, corresponding to the critical regime. As results with ordered RBNs are rather uninformative, we only show results for critical RBNs. Attractor number distribution To providing the trend of the number of attractors as the fraction of frozen nodes increases we generated 100 RBNs and for each number of frozen nodes we performed a search of the attractors starting from 10 4 random initial states. The range of frozen nodes considered varies from 0 to 200 with a step of 5 nodes. Boxplots showing the the distribution of the number of attractors as a function of the number of frozen nodes are depicted in Figure 2, along with the mean of these distributions. As expected, the number of attractors decreases with the number of frozen nodes, even though it remains non negligible up to one fifth of frozen nodes. A question may arise as to how many attractors are fixed points, as one expects an increasing number of fixed points as the RBNs become more ordered. This expectation is indeed confirmed, as shown in Figure 3.
Derrida parameter With the aim of assessing the intuition suggesting a progressive shift towards an ordered regime of the ensemble of RBNs subject to the methylation mechanism, we computed the distribution of the Derrida parameter (Bastolla and Parisi, 1997) λ, computed after one step. This parameter is evaluated by taking, for each state considered (10 3 in total), the means of the Hamming distances after one update between the state and the perturbed one (a logic negation of a single node value) in all not frozen nodes, taken one at a time. In particular, statistics report the distributions of the 100 means of the means, one parameter value for each RBN which summarises the overall behaviour observed along the 10 3 random states. For this investigation, we consider a number of frozen nodes represented by a percentage of {0, 10, 20, 50} of all nodes. Figure 4 depicts the boxplots summarising the distribution of λ for the ensembles sampled; the trend towards an increasing order is confirmed (the results for p = 0.1, on the left in Figure 4, are provided as a comparison).
The results shown so far support the conjecture that a progressive freezing pushes RBNs towards order. One may argue that a result not in agreement with this expectation might indeed sound surprising, nevertheless it is important to assess it experimentally in particular because this trend is not trivial at all in finite-size RBNs. While in infinite-size RBNs just a tiny fraction of frozen nodes leads to a complete stasis of the network 3 , in finite-size RBNs we observe that the number of attractors and the Derrida parameter are kept at significant values even in the presence of a non-negligible fraction of frozen nodes. This result suggests that in finitesize RBNs, while a progressive freezing tends to increase order in network dynamics, it may still be open to variabil-3 a formal model of this behaviour is subject of ongoing work ity. This last characteristic is relevant especially with respect to the possible paths across attractors that are feasible as the consequence of different choices in the nodes to be frozen.

Diversity estimation
In previous sections we have summarised with path dependent differentiation the property of generating different cell types as a result of different sequences of methylated genes. We can characterise the tendency of this mechanism to give rise to this property by inspecting the diversity caused by different combinations of methylated genes at any attained differentiation stage. For this purpose, we generate for each state of the methylation process (state represented by the already frozen nodes and the attractor reached) 10 2 couples of triplets of nodes, 4 among the non-already methylated nodes. This triplet is frozen while the network is in an asymptotic state, therefore after this perturbation the BN dynamics is subject to a transient and subsequently the network can either return to an attractor equal to the current one-except for the frozen triplet-or reach a different one. The freezing step may be taken at any state-i.e. phase-of the current attractor; as the phase of the attractor may be a source of variability and here we want to assess the contribution of the choice of frozen nodes only, once the attractor is reached after freezing a triplet of nodes, its minimum state according to the lexicographic order is chosen. As networks are random, this choice does not introduce any bias and in this way we rule out any possible contribution of attractor phase in the diversity of paths originated by freezing steps. 5 The diversity is then measured depending on the characteristics of the new asymptotic states on which the dynamics settles after the triplet is frozen. As we aim at providing general results, not bound to a specific definition of phenotype 6 , which should be supported by motivations on a concrete biological case, we analyse the arising diversities in various condition. Particularly, we count: • the number of equal reached asymptotic states considered in pairs and after removing the part of the already frozen nodes and the set of nodes that constitute the triplets; • the differences among all the reached attractors caused by the generated triplets, by considering subset of genes (patterns in the following) of different sizes (10, 50, 100) randomly chosen; • the differences among all the reached attractors caused by the generated triplets, by considering the states vectors in their entirety. By doing so we will have an overall picture of how this mechanism behaves in ensembles of RBNs, without limiting ourselves to particular points of view. As for the attractors distribution analysis, the range of frozen nodes considered varies from 0 to 200 with a step of 5 nodes. We stress that in this model the various degrees of differentiation are characterised by a distinct number of frozen nodes: the higher the number of frozen nodes the more differentiated the cell types. The triplets to be frozen are chosen at random among all the non already frozen nodes; we also made experiments with conditioning this choice to nodes that assume value 1 in the attractor state chosen for the perturbation. In this way, we can assess the highest level of variability that can be attained, as all the three nodes are actually perturbed by freezing.
The distribution of the frequency of equal pairs of attractors is shown in Figure 5; we observe that the median frequency of equal pairs increases from about 7/100 to 20/100 with the number of frozen nodes, while it is limited to low percentages when frozen nodes are chosen among the active ones (value 1). This result shows that the probability of choosing two different triplets 7 leading to the same asymptotic state after being frozen is rather low; therefore, at least for RBNs with at most 2/5 of frozen nodes, the different paths generated by freezing are a significant fraction of all the possible ones, despite the tendency towards a more ordered regime. This observation is confirmed also by the statistics involving the total number of different patterns. With the term pattern we refer to a projection of the network dynamics in subset of nodes. So, patterns in this context define the observable phenotypes in a way strongly related to the concept of macrostate introduced in (Borriello et al., 2018;Moris et al., 2016). These latter results are shown in Figures 6 and 7. It is worth observing that, even when differences are estimated on the basis of 10 nodes, the fraction of overall different patterns is still non-negligible up to 100 frozen nodes out of 500.
These results support the hypothesis that different freezing patterns in RBNs are very likely to produce different trajectories along attractors, and therefore variability in differentiation paths can be attained also by means of this mechanism.

Conclusion
In this work we have explored the possibility of incorporating epigenetic mechanisms-methylation in particularinto BN models of GRNs. We focused on those processes responsible for high chromatin compaction, that influences gene transcription by controlling the accessibility of DNA to transcription factors and RNA polymerases. Accordingly, in our model we progressively freeze-i.e. clamp to 0-a subset of nodes and analyse the impact of this modification on network dynamical features, namely on attractor numberin analogy with the number of cell types-, on the Derrida parameter-to assess the extent to which RBNs with frozen nodes tend to an ordered regime-and on attractor diversity as attained by different combinations of frozen nodes.
We observed that the number of attractors in RBNs decreases with the number of frozen nodes and the same does the Derrida parameter, suggesting that, from an ensemble point of view, the larger the fraction of frozen nodes the more ordered the RBNs. These results are in agreement with the intuition that, by clamping to 0 a fraction of RBN nodes, not only the state space is reduced with respect to the original network, but frozen nodes absorb perturbations and so they favour network stability. These properties are to some extent the abstract counterpart of progressive reduced alternatives and stability along differentiation stages. Moreover, results show a very interesting property of RBNs: they maintain diversity in terms of possible asymptotic states originating from different combinations of frozen nodes, both during the process of progressive freezing itself and in the final reached states. We assessed this diversity by means of three metrics, so as to attain general results. We found that different choices in nodes to be frozen are very likely to lead to different asymptotic states, implying that diverse differentiation paths can be generated. As expected, this diversity tends to decrease with the fraction of frozen nodes in the network.
As future work, we plan to add in our model mechanisms to reproducing open chromatin structure, where genes are made more accessible and their transcriptions eased. The combined effects of both closing and opening chromatin structure on attractors and other relevant features of BNs will be consequently analysed. Moreover, since epigenetic is expected to have an impact on cell type stability, we are devising a set of experiments to measure how attractor robustness changes along the path of differentiation, for example by measuring the impact of external signals-possibly modulated-during different stages of differentiation. To conclude, epigenetic is only one of the factors that are responsible for cell type transitions and definitions. Signalling cues, typically generated by other cells, are another crucial actor in the process of differentiation. In this perspective, we are planning to study models involving networks of BNs, so as to explore the possibility of modelling differentiation in a multi-cellular setting. sions on the model and its biological significance. Andrea Roli is a member of the IndAM research group GNCS.