Social coordination with locally observable types

In this paper we study the classic dilemma of social coordination between a risk-dominant convention and a payoff-dominant convention. In particular, we consider a model in which a population of agents plays a coordination game over time, choosing both the action and the network of agents with whom to interact. The main modeling novelties with respect to the existing literature are that (1) agents come in two distinct types, (2) interacting with a different type is costly, and (3) an agent's type is unobservable prior to interaction. We show that when the cost of interacting with a different type is small relative to the payoff from coordination, the payoff-dominant convention is the only stochastically stable one; when instead that cost is large, the only stochastically stable conventions are those in which all agents of one type play the payoff-dominant action and all agents of the other type play the risk-dominant action.


Motivation
Social and economic interactions often require participants to coordinate their actions.
Conventions specifying on what side of the road to drive or how to share the product of joint work, and standards such as software or hardware platforms, are examples, among many, of successful coordination. In these situations coordination is an inherently strategic issue, giving rise to coordination games with multiple Nash equilibria, which can be seen as distinct conventions. A social dilemma that often arises in this setting is the tension between a payoff-dominant action, which yields a higher payoff if the associated equilibrium is actually played, and a risk-dominant action, which performs better under out-of-equilibrium play (Harsanyi and Selten 1988). Which of these equilibria is more likely to emerge in the long run has been a central question in evolutionary game theory.
Most of the related research so far has focused on identifying conditions that lead to the emergence of either the payoff-dominant or the risk-dominant convention. However, in many real situations both conventions persist for a long time (as discussed, for instance, in Sugden 1995; Goyal and Janssen 1997). Moreover, agents are rarely fully homogeneous: they are often characterized by a variety of traits that can affect the result of the social interaction. Sometimes such traits are not easily observable before interaction takes place, as may be the case for preferences, religion or cultural heritage. This feature can help to explain the coexistence of conventions.
Indeed, what we typically observe in real-life interactions is the formation of clusters of agents that are homogeneous with respect to these traits, despite their unobservability prior to interaction. Agents often adopt distinctive actions that end up being conventions in their cluster, e.g., dress, appearance, jargon, diet, rituals, meeting places. Some of these conventions can appear to be inferior choices but are adopted nevertheless. The fact that these distinctive actions are endogenously chosen by the agents (in contrast to types, which are exogenously given) and, in many cases, are also easily observable prior to interaction, suggests a novel explanation of why different conventions can coexist: They may act as signals allowing coordination across types. In other words, if it is known that a specific way of speaking or dressing is adopted mainly by agents with a certain trait, an individual might prefer to interact with the people showing that way of speaking or dressing, even if this forces him to coordinate on an action that yields low payoffs, because it allows him to interact with people who possess his own trait. Importantly, this kind of intuition points to a joint explanation of the coexistence of different conventions and the emergence of homogeneous clusters, at least in those situations where agents can choose with whom to interact.

Our contribution
We explore social coordination in a population composed of two types of agents who have a preference for their own type, can observe others' types only after a first interaction, and interact on a network that they shape themselves, subject to a constraint on the maximum number of neighbors.
More precisely, we consider a variant of the model in Staudigl and Weidenholzer (2014). With respect to the latter, we introduce the novelty that agents are of two different types, x and y, and that types are payoff-relevant, in the sense that a penalty d > 0 is suffered when interacting with a different type. Importantly, the actions taken by agents are globally observable, while types are not observable prior to interaction: Each agent knows the type of the agents with whom he is connected, but not the type of other agents, about whom he can only form expectations on the basis of the distribution of choices at the current population state.
Our main results concern the long-run prediction obtained by applying stochastic stability (see Young 1993; Kandori et al. 1993) and are twofold. When d is low, the payoff-dominant convention is the only stochastically stable outcome: In the long run, all agents of both types end up choosing the payoff-dominant action. This result can be interpreted as a robustness check of Staudigl and Weidenholzer (2014). When instead d is sufficiently high, the stochastically stable states are those where all agents of one type choose one action and all agents of the other type choose the other action. To our knowledge, we are the first in the literature on social coordination and stochastic stability to introduce agents' heterogeneity with local observability of types. As a result, the risk-dominant and the payoff-dominant actions coexist in the long run for a reason that is substantially different from restrictions on agents' mobility (see the paragraph on related literature for a more articulated discussion). The intuition is the following. A single mutation can be enough to escape from a state whenever an agent who has deleted all of his links by mistake would face the risk of interacting with a different type when reconnecting to agents choosing his same action; for a single mutation to suffice, the penalty for interacting with a different type must be sufficiently large. In our model the former condition holds in all states where actions and types are not perfectly correlated; it never holds in the states where all agents of one type choose one action and all agents of the other type choose the other. Therefore, when d is large, the states where actions and types are perfectly correlated are harder to leave in terms of mutations, which makes them stochastically stable.
We observe that a single mutation would not be sufficient to leave an absorbing state if the mutant is able to remember the agents with whom he was previously connected and to re-establish a link with them. Indeed, in our model agents are prevented from tracing back their previous mates. This can be interpreted as a lack of memory regarding past interactions. However, what is actually needed for our results is that agents have decaying memory, so that they will eventually forget the identity of previous mates. Coupled with positive inertia, this would imply that, with positive probability, the mutant has forgotten the identities of his previous interaction partners when he gets the opportunity to update his strategy.
The rest of the paper is organized as follows. The next paragraph surveys the relevant literature, contrasting our contribution with existing ones. Section 2 introduces the basic elements of the model. Section 3 discusses the induced Markov chain and provides some simple results for the unperturbed dynamics. Section 4 considers the perturbed dynamics and gives the main results concerning stochastic stability. Section 5 concludes by discussing the assumptions and providing directions for future research. The Appendix collects the proofs of the propositions on the unperturbed dynamics and of all lemmas, while the proofs of the propositions on the perturbed dynamics are given in the main text.

Related literature
Most papers on social coordination in the long run consider agents who follow myopic best reply rules and occasionally make mistakes. 1 The main message in this literature is that when the interaction structure is exogenous, inefficient risk-dominant conventions emerge in the long run. 2,3 However, when the interaction structure is endogenous, this result does not necessarily hold and the payoff-dominant action can be selected in the long run. The endogeneity of the interaction has been modeled mainly in two ways: (1) The agents can choose with whom to form an interaction network, and (2) the agents can select a location among a number of locations available and then interact with agents in the same location.
In approach (1), to which our model belongs, network formation is typically associated with a cost to maintain the existing links. In a non-cooperative setup, Goyal and Vega-Redondo (2005) show that when interaction is unconstrained (i.e., there is no bound on the number of agents one can form a link with) and the cost to maintain a link is low, the risk-dominant convention still emerges in the long run, while for relatively high costs to maintain a link the payoff-dominant convention does emerge. 4,5 Our model follows the version of Goyal and Vega-Redondo (2005) where both the cost to maintain a link and the payoff flow are asymmetric (i.e., the agent who pays the cost is the only one to receive the payoff of interaction). 6 With respect to it, our model differs in three distinctive features: (1) agents have a maximum number of interactions that they can maintain at the same time, as in Staudigl and Weidenholzer (2014); (2) agents come in two distinct types with a preference for interacting with agents of their own type; and (3) an agent's type can be observed only if one is already connected to him. Indeed, the paper most closely related to ours is Staudigl and Weidenholzer (2014). They show that when interaction is constrained, in the sense that agents can only support a small number of links relative to the population size, the payoff-dominant convention emerges in the long run. Our paper shows that this result is robust to the introduction of features (2) and (3), provided that the preference for interacting with one's own type is not too strong; however, if the cost of interacting with agents of a different type is large enough, then efficiency is lost and both conventions coexist in the long run.

Footnotes

1 See, e.g., Eshel et al. (1998), Alós-Ferrer and Weidenholzer (2008), Alós-Ferrer and Shi (2012) and Cui (2014), and references therein, for models of local interaction where agents follow imitative behavior.

2 For global interaction models, see, e.g., Kandori et al. (1993), Kandori and Rob (1995) and Young (1993). For local interaction models, see Blume (1993, 1995), Ellison (1993, 2000), Alós-Ferrer and Weidenholzer (2007) and Jiang and Weidenholzer (2016); for a general framework for local interaction models with an exogenous interaction structure see Peski (2010); finally, see Weidenholzer (2010) for a recent survey on local interaction models focusing on social coordination.

3 Neary (2012) studies a model of social coordination where the interaction structure is exogenous and global, but agents are heterogeneous in their preferences about the action upon which to coordinate. In this setup only payoff-efficient conventions are selected.

4 Hojman and Szeidl (2006) develop a related model with unidirectional payoff flows that accrue from all path-connected agents.

5 Jackson and Watts (2002) study a cooperative (pairwise) network formation model and show that for low costs to maintain a link the risk-dominant convention is selected, while for high costs both the payoff-dominant and the risk-dominant conventions can be selected. The difference with Goyal and Vega-Redondo (2005) is mainly due to the fact that the transition from one convention to the other is stepwise, while in the non-cooperative setup it happens all at once, when a sufficient number of agents become mutants.

6 In the main model of Goyal and Vega-Redondo (2005), instead, the payoff is earned by players on both sides of the link, independently of who pays to maintain it.
In approach (2), the interaction structure is constrained by the fact that an agent can interact only with the agents choosing the same location. Oechssler (1997), Ely (2002) and Bhaskar and Vega-Redondo (2004) are examples of models in which agents play a coordination game and have to choose one location among the locations where the coordination game is played. In these models, the payoff-dominant convention typically emerges in the long run. In light of our results, a case of particular interest is when locations are subject to a capacity constraint, thus impeding or forcing the movement of agents across them. The important fact here is that under this constraint the payoff-dominant convention is no longer the only one selected in the long run and, in addition, non-monomorphic states can be stochastically stable. More precisely, the coexistence of the payoff-dominant and risk-dominant conventions can obtain in the long run. In this regard, Anwar (2002) studies a model where there are constraints on locations and each location has a certain number of patriots (i.e., agents who never want to leave their current location). The author shows that when the constraints on locations are tight (i.e., few agents can or want to move), the risk-dominant convention emerges in the long run, while if sufficient movement across locations is possible (capacity is large and/or patriots are few), different conventions emerge at different locations. Further, when the locations have asymmetric sizes, the smaller location will have agents coordinating on the payoff-dominant convention. 7 A model similar to that of Anwar (2002) is studied by Blume and Temzelides (2003), who also draw the conclusion that restricted mobility may lead to the coexistence of different conventions.
In addition, the analysis of Blume and Temzelides (2003) makes it possible to understand more clearly the long-run effects of increased mobility on the adopted convention: Both a higher number of locations and a higher fraction of mobile agents have a positive impact on the likelihood of observing the emergence of the payoff-dominant convention at one location (with all mobile agents at that location), as opposed to the risk-dominant convention at every location. Furthermore, Blume and Temzelides (2003) study the long-run payoffs of mobile and immobile agents as a function of the exogenous restrictions on mobility: They find that mobile agents enjoy a higher payoff and always benefit from increased mobility, while immobile agents benefit from increased mobility only at low levels of mobility.
We stress that we obtain the coexistence of conventions without such constraints on mobility. Instead, we rely on the cost of type mismatch (which requires some degree of agent heterogeneity) and on the risk of mismatch (which requires some degree of imperfect observability of types). We also stress that in our model, differently from Anwar (2002), when the total population is large, the relative size of the two type populations does not affect which population plays which convention, provided that the cost of mismatch is large enough. 8 Finally, Carvalho (2016) shows that the coexistence of conventions can emerge also under global random interaction, provided that the population is divided into cultural groups that have distinct taboos (some actions are never played voluntarily) and different cultural preferences (over the non-taboo actions). In our model, too, preferences are type-dependent, although in a substantially different way: Heterogeneity concerns not the actions but the cultural identity of the interaction partner.

Network structure
We consider a set N = {1, 2, . . . , n} of agents. Each i ∈ N can choose the subset of other agents with whom to play a fixed bilateral social game. Formally, let g_i = (g_i1, . . . , g_in) be the n-dimensional vector collecting i's connections; in particular, g_ij ∈ {0, 1}, and we say that agent i maintains a link with agent j if g_ij = 1. We assume that g_ii = 0 for every i ∈ N. Connections are directed, so that g_ij = 1 does not necessarily imply g_ji = 1. The maximum number of links that an agent can maintain at any given time is k ≥ 1, and the cost of maintaining any single link is c. An agent i is said to be isolated if g_ij = 0 for every j ∈ N. A profile of link formation choices, one for each agent in N, is denoted by g = (g_1, g_2, . . . , g_n). We will refer to g as the network of interactions.
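To fix ideas, the network structure above can be sketched in a few lines of Python. This is a minimal illustration with made-up values for n, k and c; the function names are ours, not the paper's.

```python
# Sketch of the directed interaction network: g[i][j] = 1 means agent i
# maintains a link to agent j. Example values, not taken from the paper.

n = 6          # population size (illustrative)
k = 2          # maximum number of links per agent
c = 0.5        # cost of maintaining a single link (illustrative)

# start from an empty network; g_ii = 0 holds by construction
g = [[0] * n for _ in range(n)]

def maintain_link(g, i, j):
    """Agent i forms a directed link to j, respecting g_ii = 0 and the cap k."""
    if i == j:
        raise ValueError("no self-links: g_ii = 0")
    if sum(g[i]) >= k:
        raise ValueError("agent i already maintains k links")
    g[i][j] = 1

def is_isolated(g, i):
    """Agent i is isolated if he maintains no links (g_ij = 0 for all j)."""
    return sum(g[i]) == 0

maintain_link(g, 0, 1)
maintain_link(g, 0, 2)
# links are directed: g[0][1] = 1 does not imply g[1][0] = 1
```

Note that the capacity constraint is enforced on outgoing links only, mirroring the asymmetric setup in which the agent who maintains a link is the one who pays for it.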

Social game
Agents play a 2 × 2 symmetric game in strategic form with common action set. Each agent plays only with the agents with whom he is directly connected.
The table below describes the payoffs associated with the bilateral social game. Payoffs are given only for the row player, since the game is symmetric.

8 Another relevant contribution is Dieckmann (1999), although it is in a sense less related because agents are supposed to follow imitation rules instead of myopic best reply rules. Dieckmann (1999) presents a location model where, besides capacity constraints, the movement across locations is subject to frictions (in the form of the possibility that only the action or only the location is revised as desired) and the play outside the current location is imperfectly observable. The main finding, substantially in line with Anwar (2002) and Blume and Temzelides (2003), is that imperfect observability and frictions alone cannot block the emergence of the payoff-dominant convention, while restricted mobility does. In our model, imperfect observability (of types) can prevent the emergence of the payoff-dominant convention.
        A          B
A    π(A, A)    π(A, B)
B    π(B, A)    π(B, B)

where the following inequalities hold:

1. π(A, A) > π(B, A) and π(B, B) > π(A, B);
2. π(B, B) > π(A, A);
3. π(A, A) + π(A, B) > π(B, A) + π(B, B);

which imply that two coordination equilibria exist (by 1.), B is the payoff-dominant action (by 2.), and A is the risk-dominant action (by 3.). We note that π(B, A) < π(A, B). We further assume that all payoffs are positive, i.e., π(B, A) > 0. Each agent i ∈ N has to choose an action a_i ∈ {A, B}, which is played against the choice of each j for which g_ij = 1. A profile of action choices, one for each agent in N, is denoted by a = (a_1, a_2, . . . , a_n).
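The assumptions on the payoff matrix can be checked mechanically. Below is a small Python sketch with illustrative payoff values (the paper does not fix numbers) verifying the coordination, payoff-dominance and risk-dominance conditions:

```python
# Illustrative payoff matrix for the 2x2 coordination game. The numbers
# are example values chosen to satisfy the paper's inequalities, not
# values from the paper: pi[(row, col)] is the row player's payoff.

pi = {("A", "A"): 4, ("A", "B"): 3,
      ("B", "A"): 1, ("B", "B"): 5}

# 1. both (A, A) and (B, B) are strict Nash equilibria (coordination game)
coordination = pi[("A", "A")] > pi[("B", "A")] and pi[("B", "B")] > pi[("A", "B")]

# 2. B is the payoff-dominant action
payoff_dominant_B = pi[("B", "B")] > pi[("A", "A")]

# 3. A is risk-dominant: A is the best reply against a (1/2, 1/2) mixture
risk_dominant_A = (pi[("A", "A")] + pi[("A", "B")]
                   > pi[("B", "A")] + pi[("B", "B")])

# the additional assumptions in the text
assert pi[("B", "A")] < pi[("A", "B")]   # pi(B, A) < pi(A, B)
assert pi[("B", "A")] > 0                # all payoffs positive
```

With these example values the tension of the dilemma is visible: coordinating on B pays 5 rather than 4, but choosing B against a miscoordinating partner pays only 1.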

Strategies
A strategy for agent i is s_i = (a_i, g_i), where a_i ∈ {A, B} denotes the action chosen by i, and g_i ∈ {0, 1}^n, with g_ii = 0 and Σ_j g_ij ≤ k, denotes the agents with whom agent i is connected (see network structure and social game). A profile of strategies for the whole population (to which we also refer as a state) is s = (s_1, s_2, . . . , s_n). We will write s = (a, g), where a is the action profile of the entire population and g is the interaction network. 9

Agents' types
There are two types of agents in the population, x types and y types. The type of agent i ∈ N is denoted by w_i ∈ {x, y}. With some abuse of notation, we use ¬w_i to denote the type other than w_i. We also define the indicator function of type dissimilarity δ: N² → {0, 1} such that δ(i, j) = 1 if and only if w_i ≠ w_j, and δ(i, j) = 0 otherwise. The number of agents of type x is denoted by n_x, while the number of agents of type y is denoted by n_y = n − n_x. Without loss of generality, we assume that n_x ≥ n_y. We also assume that the number of agents of each type is large relative to the maximum number of connections per agent; in particular, n_y ≥ 2k + 1.
If agent i is of type w_i and interacts with an agent j whose type is w_j ≠ w_i, then i incurs a cost d > 0; if instead w_j = w_i, then no cost is incurred.

Time
Agents repeatedly interact over time, which is assumed to be discrete and indexed by the natural numbers, i.e., t = 0, 1, 2, . . .. The state of the system at time t is denoted by s^t = (s^t_1, s^t_2, . . . , s^t_n), where s^t_i is the strategy adopted by agent i at time t.

Revision protocol
In each round, every agent has the opportunity to revise his strategy with probability γ ∈ (0, 1). When an agent receives a revision opportunity, he selects with positive probability any strategy that maximizes the interim utility, as formally provided in a subsequent paragraph. 10

Information
If agent i receives a revision opportunity at time t + 1, then, for any agent j such that g^t_ij = 1, agent i knows both a^t_j and w_j, while for any agent h such that g^t_ih = 0, agent i knows only a^t_h. Also, agent i is informed of the summary statistics of the population state at time t; in particular, he knows the number of x types and y types playing action A, as well as those playing action B, at state s^t. We stress, hence, that information about others' types accrues to an agent at two levels: Locally, the agent knows the type of each of his neighbors, while, globally, he only knows the aggregate numbers of x-type and y-type agents choosing A and B.
On the whole, a revising agent i at time t + 1 knows the pieces of information listed in the following table, where we also introduce specific notation to make the following exposition easier.

Notation
Number of agents…
n(w, a|s^t): of type w who are choosing action a
n_i1(s^t): with whom agent i maintains a link
n_i0(s^t): with whom agent i does not maintain a link
n_i1(a|s^t): playing action a with whom agent i maintains a link
n_i0(a|s^t): playing action a with whom agent i does not maintain a link
n_i1(w|s^t): of type w with whom agent i maintains a link
n_i0(w|s^t): of type w with whom agent i does not maintain a link
n_i1(w, a|s^t): of type w playing action a with whom agent i maintains a link
n_i0(w, a|s^t): of type w playing action a with whom agent i does not maintain a link
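The quantities above can be computed directly from a population state. The following Python sketch implements them; the helper names and the toy four-agent state are ours, introduced purely for illustration:

```python
# Summary statistics available to a revising agent. A state is a triple
# (actions, g, types): action profile, network, and type profile.
# Illustrative implementation; the paper only defines the quantities.

def n_pop(w, a, state):
    """n(w, a | s): number of agents of type w choosing action a."""
    actions, _, types = state
    return sum(1 for j in range(len(actions))
               if types[j] == w and actions[j] == a)

def n_i(i, linked, state, a=None, w=None):
    """n_i1(... | s) if linked == 1, n_i0(... | s) if linked == 0, with
    optional filters on action a and type w; agent i himself is excluded."""
    actions, g, types = state
    return sum(1 for j in range(len(actions))
               if j != i and g[i][j] == linked
               and (a is None or actions[j] == a)
               and (w is None or types[j] == w))

# toy example with four agents: agent 0 links to agents 1 and 2
actions = ["A", "B", "A", "B"]
types   = ["x", "x", "y", "y"]
g = [[0, 1, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
state = (actions, g, types)
```

Note that the type filter w is legitimate only for linked agents (linked = 1) from the revising agent's point of view; for unlinked agents the model grants only the aggregate counts n(w, a|s^t), from which the expected mismatch fractions are derived.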

Utilities
The interim utility of agent i who chooses strategy s_i at time t + 1, when the previous state was s^t, can be formally written as follows:

u_i(s^{t+1}_i, s^t) = Σ_{j : g^{t+1}_ij = 1} [π(a^{t+1}_i, a^t_j) − c] − d · Σ_{j : g^{t+1}_ij = 1, g^t_ij = 1} δ(i, j) − d · Σ_{j : g^{t+1}_ij = 1, g^t_ij = 0} n_i0(¬w_i, a^t_j|s^t) / n_i0(a^t_j|s^t)

The first term of u_i(s^{t+1}_i, s^t) is the payoff from the social game net of the link maintenance cost, and it is made up of the sum of payoffs accruing to i from the links he chooses to maintain at time t + 1. The second term is the cost of maintaining links at time t + 1 with agents of a type different from w_i with whom i was already maintaining a link at time t; we note that such a cost of type mismatch is known to i, since those agents are already his neighbors and hence he knows their type. The third term is the expected cost due to the formation of new links; differently from the previous term, these agents are new neighbors, and as such their type is unknown to i, who can only compute the expected cost of type mismatch on the basis of the fraction of agents of type different from w_i (i.e., ¬w_i) among the agents with whom i is not connected and who choose the observed action. 11 Once the decision is taken and the types of the agents with whom new links are formed become known, agent i obtains the following ex post utility at time t + 1:

v_i(s^{t+1}_i) = Σ_{j : g^{t+1}_ij = 1} [π(a^{t+1}_i, a^t_j) − c] − d · Σ_{j : g^{t+1}_ij = 1} δ(i, j)

where the last term is the total cost of type mismatch.
The first term of v_i(s^{t+1}_i) is, again, the payoff from the social game net of the link maintenance cost, while the second term is the cost of maintaining links with agents of a type different from w_i, independently of whether i already had links toward them or not.
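The interim and ex post utilities described above can be sketched in Python. The implementation and the example numbers below are ours, intended only as an illustration of the three-term structure (game payoff net of link costs, certain mismatch cost for old neighbors, expected mismatch cost for new ones):

```python
# Illustrative utilities for a revising agent; payoff values and
# parameters are made up, not taken from the paper.

pi = {("A", "A"): 4, ("A", "B"): 3, ("B", "A"): 1, ("B", "B"): 5}
c, d = 0.5, 2.0

def interim_utility(i, new_action, new_links, state):
    """Expected utility of agent i choosing (new_action, new_links) at t+1,
    given the previous state (actions, g, types)."""
    actions, g, types = state
    u = 0.0
    for j, lij in enumerate(new_links):
        if lij != 1:
            continue
        # payoff from the social game net of the maintenance cost
        u += pi[(new_action, actions[j])] - c
        if g[i][j] == 1:
            # old neighbor: type is known, mismatch cost is certain
            if types[j] != types[i]:
                u -= d
        else:
            # new neighbor: expected mismatch cost, based on the type mix
            # among unlinked agents choosing j's action
            pool = [h for h in range(len(actions))
                    if h != i and g[i][h] == 0 and actions[h] == actions[j]]
            frac = sum(1 for h in pool if types[h] != types[i]) / len(pool)
            u -= d * frac
    return u

def ex_post_utility(i, new_action, new_links, actions, types):
    """Realized utility once the types of the new neighbors are revealed."""
    u = 0.0
    for j, lij in enumerate(new_links):
        if lij == 1:
            u += pi[(new_action, actions[j])] - c
            if types[j] != types[i]:
                u -= d
    return u

# example: agent 0 (type x), currently isolated, links to agent 1
actions = ["B", "B", "B"]
types   = ["x", "x", "y"]
g = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
u = interim_utility(0, "B", [0, 1, 0], (actions, g, types))
v = ex_post_utility(0, "B", [0, 1, 0], actions, types)
```

In the example, the pool of unlinked B-players contains one x type and one y type, so the interim utility discounts half the penalty d; ex post, agent 1 turns out to be of agent 0's own type and no penalty is paid.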

Markov chain
The process described in Sect. 2 formally defines a Markov chain (S, P), where S is the state space (i.e., the set containing all possible states) and P is the transition matrix, with P_ss′ denoting the probability of moving directly from state s to state s′, with s, s′ ∈ S.

Neighborhood heterogeneity
We start by providing two results on the short-run behavior of the system.

Proposition 1
If agent i is given a revision opportunity at time t + 1, and there exists j ∈ N such that g^t_ij = 1, δ(i, j) = 1, and n(w_i, a^t_j|s^t) ≥ k + 1, then g^{t+1}_ij = 0 with probability 1.
Proposition 1 states that a revising agent, say i, who maintains a link toward an agent of a different type, say j, will surely choose to replace such a link with a new link toward an agent h choosing the same action as j, provided that at least one agent exists who chooses the same action as j and has the same type as i: Indeed, h cannot grant a lower payoff than j, and grants a higher payoff with positive probability.
Proposition 2 states that a link between agents of the same type who play different actions can be stable, at least in the short run. Indeed, an agent, say i, will keep maintaining a link with another agent, say j, who plays a different action if replacing j with some other agent h who plays i's action implies a severe risk of type mismatch, i.e., if the penalty d is large enough and the pool of potential new neighbors playing i's action contains a high enough fraction of agents who are not of i's type.
The above two results can be understood intuitively as follows. A revising agent can condition his decision to form a link on the action of the prospective neighbor, but not on his type. As a consequence, a link with an agent choosing the same action has no value: Such a link can always be replaced by another link with an agent choosing that same action. Instead, a link with an agent of the same type has value: If such a link is replaced by a new link, the risk of a type mismatch is incurred.

Absorbing sets
An absorbing set (otherwise called a recurrent class) is a minimal set of states with respect to the property that the system has probability zero of moving from any state in the set to any state out of the set.
We introduce a number of definitions that will be useful in the subsequent analysis, where we discuss the variety of possible absorbing sets. Consider a state s = (a, g). State s is k-regular if every agent maintains exactly k links, i.e., n_i1(s) = k for every i ∈ N. State s is type-segregated if g_ij = 1 implies that δ(i, j) = 0. We say that state s is monomorphic if a_i = a_j for all i, j ∈ N. In particular, we distinguish between states that are A-monomorphic and states that are B-monomorphic, i.e., states where all agents choose A and states where all agents choose B, respectively. States that are not monomorphic are called polymorphic. Among polymorphic states, an important role in our analysis is played by type-monomorphic states; a state is called type-monomorphic if it is not monomorphic and a_i = a_j for all i, j such that w_i = w_j = w ∈ {x, y}. We sometimes refer to states that are type-monomorphic with x on A and y on B (meaning that all agents of type x play A and all agents of type y play B) and states that are type-monomorphic with x on B and y on A (with an analogous definition). All remaining polymorphic states are called type-polymorphic. Finally, we denote with S^A_A the union of all absorbing sets that contain only A-monomorphic states. We define S^B_B analogously. We use S^A_B to denote the union of absorbing sets that contain only type-monomorphic states with x on A and y on B; the set S^B_A is defined analogously, with the general rule that the superscript refers to the choice of agents of type x and the subscript refers to the choice of agents of type y.

Proposition 3

(a) If c < π(A, A), then (1) there exist absorbing sets containing A-monomorphic states, and (2) there exist absorbing sets containing B-monomorphic states.
(b) If c < π(A, A) and d > π(B, B) − π(A, A), then (1) there exist absorbing sets containing type-monomorphic states with x on A and y on B, and (2) there exist absorbing sets containing type-monomorphic states with x on B and y on A.
(c) If c < π(A, A) and d > 2[π(B, B) − π(A, A)], then there exist absorbing sets containing type-polymorphic states.
(d) If c < π(A, A) and d > π(B, B)(n − 1)/n_x, then there exist absorbing sets containing states where some agent is isolated.
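The parameter conditions of Proposition 3 are simple threshold checks. The following sketch evaluates them for illustrative parameter values of our own choosing (the paper fixes no numbers):

```python
# Checking the conditions of Proposition 3 for example parameters.
# pi, c, d, n and n_x are illustrative, not values from the paper.

pi = {("A", "A"): 4, ("A", "B"): 3, ("B", "A"): 1, ("B", "B"): 5}
n, n_x = 19, 10
c, d = 0.5, 2.5

# (a) absorbing sets with monomorphic states exist
cond_a = c < pi[("A", "A")]
# (b) ...additionally, type-monomorphic absorbing sets exist
cond_b = cond_a and d > pi[("B", "B")] - pi[("A", "A")]
# (c) ...additionally, type-polymorphic absorbing sets exist
cond_c = cond_a and d > 2 * (pi[("B", "B")] - pi[("A", "A")])
# (d) ...additionally, absorbing sets with an isolated agent exist
cond_d = cond_a and d > pi[("B", "B")] * (n - 1) / n_x
```

With these numbers, conditions (a)-(c) hold while (d) fails: the mismatch penalty d = 2.5 is enough to sustain polymorphic states but not enough to keep an agent isolated, since the isolation threshold π(B, B)(n − 1)/n_x = 9 is much larger.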
Proposition 3 allows us to remark on the effect that introducing payoff-relevant heterogeneity in types has on the possibility of observing the coexistence of both actions in the long run. 12 If all agents had the same type, or if in our model we set d = 0 so that types are payoff-irrelevant, we would essentially obtain the model of Staudigl and Weidenholzer (2014), where the only absorbing sets (provided that c is not too large) are those in which all agents choose the same action. This can be easily understood by noting that, in such a case, if it is optimal for an agent to maintain his current action, then it is optimal for agents using a different action to switch to it. By introducing agents of different types and unobservability of types prior to linking, we obtain that agents at different locations in the network are different because they have different local information about the types of their neighbors. Such information is valuable if types are payoff-relevant, i.e., when d > 0. This leads to a substantially richer variety of states belonging to absorbing sets. Monomorphic states still belong to absorbing sets [point (a)]. In addition, when d is sufficiently large, we find absorbing sets containing polymorphic states. In particular, both type-monomorphic states and type-polymorphic states can belong to absorbing sets [points (b) and (c), respectively]. This leads to the following observation: Even if interactions between agents of different types are unlikely to last for a long time, both actions can coexist in states belonging to absorbing sets. More importantly, as will become clearer in the following examples, there is the possibility that some agents are isolated and others are not. We stress that this heterogeneity in the network structure cannot be observed if isolation is obtained by raising c.

Graphical illustrations
In Figs. 1, 2, 3 and 4 we provide some graphical representations with the aim of illustrating the variety of possible absorbing sets. In all the examples, we have n_x = 10, n_y = 9 and k = 3. Circles identify x types; squares identify y types. Arrows represent interactions. Agents choosing action A are colored in light green; agents choosing B are colored in dark blue.

Figure 1 depicts monomorphic states; more precisely, in Fig. 1a we have an A-monomorphic state, and in Fig. 1b a B-monomorphic state. Both subfigures represent states that are k-regular, since every agent has exactly 3 connections (in particular, the same connections are in place in the two states). Consistent with point (a) of Proposition 3, such states belong to absorbing sets for any value of d. In particular, each state belongs to a singleton absorbing set: As long as d is positive, every agent prefers not to reshuffle his links, in order to avoid the risk of a type mismatch.

Figure 2 depicts type-monomorphic states. Both states are k-regular; this must necessarily be the case, since there is no risk of mismatch thanks to the perfect correlation between actions and types. We observe that each of these states belongs to an absorbing set comprising many states; this is so because agents are indifferent between keeping their current mates and substituting them with other agents choosing the same action. In order for type-monomorphic states to belong to an absorbing set, agents choosing A must not find it profitable to switch to B and cast links to agents choosing B; this happens when d > π(B, B) − π(A, A), which is the inequality given in point (b) of Proposition 3.

Figure 3 represents a state that is type-polymorphic. In particular, there are agents of each type choosing A and agents of each type choosing B. It is easy to see that, given the current network of interactions, no agent wants to change action.
Also, agents who choose B will never reshuffle links, because doing so carries a risk of type mismatch with no benefit. Agents who choose A face an expected penalty due to type mismatch equal to d/2; if such a cost is larger than π(B, B) − π(A, A), i.e., if d > 2[π(B, B) − π(A, A)], which is the inequality given in point (c) of Proposition 3, then these agents will never change strategy either, so that the state in Fig. 3 actually belongs to a singleton absorbing set.
We stress that even if the state under consideration is k-regular, there are type-polymorphic states belonging to absorbing sets where this is not the case (the same can occur for monomorphic states, while type-monomorphic states are necessarily k-regular). Imagine that, starting from the state represented in Fig. 3, a link is removed. Intuitively, if d is large enough, then the expected penalty of a type mismatch is sufficiently large to discourage any attempt to form a new link. Finally, a link between two agents of the same type who choose different actions might also be in place, provided such an interaction brings a positive payoff (which happens if c is not too large) and d is again sufficiently large (see Proposition 2). Figure 4 depicts a state where one agent is isolated. In particular, agent i has no link outgoing from him. Furthermore, agent i plays action A, while all other agents (including those having the same type as i) play B, so the state represented is type-polymorphic. If the expected penalty for i of a new link toward an agent playing B (which is equal to 10d/18) is larger than the largest benefit coming from the new interaction [which is equal to π(B, B)], then casting a new link is unprofitable. We observe that the arising inequality is the same as the inequality in point (d) of Proposition 3, once we consider that n_x = 10 and n_y = 9. We also observe that agent i will keep on switching from action B to A and vice versa, since both actions grant him the same (null) utility. Moreover, all other agents strictly prefer not to change their strategies, since they currently earn the maximum attainable utility and reshuffling links comes with the risk of a type mismatch. Therefore, the state in Fig. 4 actually belongs to an absorbing set. We finally notice that agent i has no incoming links.
We remark that this is something specific to this example; indeed, there exist states belonging to absorbing sets where an agent exists who has some incoming links and no outgoing links (so that such an agent is still isolated).
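To fix ideas on the expected-penalty computations in the examples above, the following sketch (our own illustration; the function name and the value of d are invented) reproduces the two thresholds: with a uniformly random new link, the expected mismatch cost equals d times the share of different-type candidates.

```python
from fractions import Fraction

def expected_mismatch_penalty(n_other_type, n_candidates, d):
    """Expected type-mismatch cost of casting one link uniformly at random
    among n_candidates agents, n_other_type of whom are of a different type."""
    return Fraction(n_other_type, n_candidates) * d

d = Fraction(9)  # illustrative value of the mismatch cost d
# Fig. 3: among agents playing B, half are of each type, so the penalty is d/2
print(expected_mismatch_penalty(1, 2, d))
# Fig. 4: agent i (type y) faces 18 agents playing B, 10 of them of type x: 10d/18
print(expected_mismatch_penalty(10, 18, d))
```

Exact rational arithmetic via `Fraction` keeps the comparison with π(B, B) − π(A, A) free of rounding issues.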

Perturbed dynamics

Regular perturbed Markov chain
We are ready to introduce perturbations in the unperturbed dynamics considered in Sect. 3 and to apply concepts and tools developed by Foster and Young (1990), Young (1993), Kandori et al. (1993) and Ellison (2000).
We adopt the so-called uniform error model for mistakes.13 In particular, when an agent is given a revision opportunity, with probability 1 − ε he will update his strategy by using the myopic best reply rule described in the previous section, while with probability ε the agent is hit by a perturbation (or mutation, mistake, etc.) and chooses at random one strategy in his strategy set. The arising transition matrix is denoted by P_ε, and we refer to (S, P_ε) as the perturbed Markov chain resulting from (S, P). For any positive level of ε, the system can move with positive probability from any state to any other state, i.e., it is ergodic. This implies that the perturbed Markov chain is irreducible and aperiodic, and hence, by known results, there exists a unique invariant distribution μ_ε over states in S that describes the long-run behavior of the system. As ε tends to zero, P_ε tends to P; in particular, P_ε(s, s′) ∼ ε^r(s,s′) as ε → 0, where r(s, s′) is the so-called resistance of the transition from s to s′, which basically counts how many perturbations (or mutations, mistakes, etc.) are required to complete such a transition in one period of time. A family of perturbed Markov chains for ε going to zero which satisfies the above properties is called a regular perturbed Markov chain. For a regular perturbed Markov chain, the limit of the invariant distribution μ_ε for ε going to zero is known to exist, and the states having positive probability in that limiting distribution are called stochastically stable. The following characterization of stochastically stable states will be useful for our subsequent analysis (see Young 2001, for a more detailed exposition).
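As a numerical illustration of these definitions (not part of the model; the two-state chain and its resistances are our own toy example), one can verify that the invariant distribution μ_ε concentrates, as ε → 0, on the state that is harder to leave in terms of resistance:

```python
def invariant_two_state(p_ab, p_ba):
    """Stationary distribution (mu_A, mu_B) of a two-state chain with
    transition probabilities P(A -> B) = p_ab and P(B -> A) = p_ba."""
    return (p_ba / (p_ab + p_ba), p_ab / (p_ab + p_ba))

# toy resistances: r(A, B) = 2 and r(B, A) = 1, so P_eps(A, B) ~ eps**2
for eps in (0.1, 0.01, 0.001):
    mu_a, mu_b = invariant_two_state(eps**2, eps)
    print(f"eps={eps}: mu_A={mu_a:.4f}")
# mu_A -> 1: state A, which needs two mutations to leave, is stochastically stable
```

Here μ_A = 1/(1 + ε), so the limiting distribution puts all mass on the state with the larger exit resistance, exactly as the general theory predicts.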
The notion of resistance can be extended by relaxing the constraint that the transition must occur in one period, and can be usefully applied to absorbing sets instead of states. Given two absorbing sets S′ and S″, the resistance from S′ to S″ is given by the minimum sum of resistances between states over paths that start in a state belonging to S′ and end in a state belonging to S″. Now, for any conceivable tree (i.e., a graph such that any two vertices are connected by exactly one path) having the absorbing set S′ as root and all absorbing sets as nodes, consider the sum of the resistances assigned to the edges of the tree, and take the minimum of such a sum over trees. This number represents the stochastic potential of S′. Intuitively, the stochastic potential tells us how difficult it is to reach an absorbing set starting from the other absorbing sets. A fundamental result in this literature asserts that a state is stochastically stable if and only if it belongs to an absorbing set with minimum stochastic potential: Stochastically stable states are those that are relatively easiest to reach in terms of the minimum number of mutations required to reach such states starting from other states.
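For small examples the stochastic potential can be computed literally from this definition by brute force over rooted trees. The sketch below is our own illustration: the three absorbing sets and the resistances between them are invented numbers, not derived from the model.

```python
from itertools import product

def stochastic_potential(root, nodes, resistance):
    """Minimum total resistance over all trees rooted at `root`: each non-root
    node gets one outgoing edge, and following edges must reach the root."""
    others = [v for v in nodes if v != root]
    best = float("inf")
    for parents in product(nodes, repeat=len(others)):
        par = dict(zip(others, parents))
        if any(v == p for v, p in par.items()):
            continue  # no self-loops
        ok = True
        for v in others:
            seen, cur = set(), v
            while cur != root:  # every parent chain must end at the root
                if cur in seen:
                    ok = False  # cycle detected: not a tree
                    break
                seen.add(cur)
                cur = par[cur]
            if not ok:
                break
        if ok:
            best = min(best, sum(resistance[(v, par[v])] for v in others))
    return best

nodes = ["S_AA", "S_BB", "S_AB"]
r = {("S_AA", "S_BB"): 1, ("S_AA", "S_AB"): 1,
     ("S_BB", "S_AA"): 3, ("S_BB", "S_AB"): 3,
     ("S_AB", "S_AA"): 2, ("S_AB", "S_BB"): 2}
for q in nodes:
    print(q, stochastic_potential(q, nodes, r))
# the minimum is attained at S_BB, so its states would be stochastically stable
```

Enumeration over all parent assignments is exponential, so this is feasible only for a handful of absorbing sets; it is meant to make the tree-minimization in the definition concrete.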
Two other notions are useful in the following analysis: the radius and the coradius (Ellison 2000). If Q is a union of absorbing sets, consider all possible paths (i.e., sequences of states) starting from a state in Q and ending in a state belonging to an absorbing set that is not part of Q. The radius of Q, denoted by R(Q), is defined as the minimum sum of resistances between states over all such paths. Now consider all possible paths starting from a state belonging to an absorbing set Q′ outside Q and ending in a state in Q. For each Q′, consider the minimum sum of resistances between states over all such paths. The coradius of Q, denoted by CR(Q), is the maximum over Q′ of such minimum numbers. Intuitively, R(Q) and CR(Q) provide measures of how difficult it is, respectively, to leave Q and to reach Q.
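Under the simplifying assumption that resistances between absorbing sets are available as edge weights (a coarsening of the state-level definition above; the sets and numbers are our own toy example), radius and coradius reduce to shortest-path computations:

```python
import heapq

def min_resistance(src, dst, nodes, r):
    """Cheapest total resistance from src to dst (Dijkstra over absorbing sets)."""
    dist = {v: float("inf") for v in nodes}
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for w in nodes:
            if w != v and d + r[(v, w)] < dist[w]:
                dist[w] = d + r[(v, w)]
                heapq.heappush(heap, (dist[w], w))
    return dist[dst]

def radius(q, nodes, r):
    # minimum resistance of leaving q toward any other absorbing set
    return min(r[(q, w)] for w in nodes if w != q)

def coradius(q, nodes, r):
    # worst case over starting sets of the cheapest way of reaching q
    return max(min_resistance(w, q, nodes, r) for w in nodes if w != q)

nodes = ["S_AA", "S_BB", "S_AB"]
r = {("S_AA", "S_BB"): 1, ("S_AA", "S_AB"): 1,
     ("S_BB", "S_AA"): 3, ("S_BB", "S_AB"): 3,
     ("S_AB", "S_AA"): 2, ("S_AB", "S_BB"): 2}
print(radius("S_BB", nodes, r), coradius("S_BB", nodes, r))
# radius 3 > coradius 2: Ellison's criterion would place the long run in S_BB
```

The comparison R(Q) > CR(Q) is exactly the sufficient condition from Ellison (2000) used in the proof of Proposition 4 below.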

Stochastic stability: low cost of mismatch in types
Our first main result on stochastic stability is Proposition 4, which addresses the case where the cost of interacting with an agent of a different type is low relative to the gain of coordinating on the payoff-dominant action instead of the risk-dominant one. Preliminarily, Lemma 1 provides a characterization of the set S_BB that is then exploited in the proof of the proposition. Lemma 1 states that S_BB is made of the union of all and only the singleton absorbing sets where each agent plays B and has k links toward agents of his own type.

Proposition 4 If d < π(B, B) − π(A, A), then a state s is stochastically stable if and only if s ∈ S_BB.
Proof We recall that S_BB is defined as a union of absorbing sets. We then compute its radius and coradius, denoted by R(S_BB) and CR(S_BB), respectively. We first observe that in every state where at least k + 1 agents choose B, all agents find it optimal to choose B and to have k connections to agents choosing B: Indeed, such a strategy grants, in the worst case where all matches are type mismatches, a payoff of k(π(B, B) − c − d), while the highest payoff attainable with any other strategy is k(π(A, A) − c), which is lower by the assumption d < π(B, B) − π(A, A). So, starting from a state in S_BB, at least n − k mutations must occur to switch n − k agents from B to A, hence leaving fewer than k + 1 agents choosing B. Our assumptions guarantee that n ≥ 4k + 2. This implies that R(S_BB) ≥ 3k + 2. Now suppose we start from a state outside S_BB. With k mutations, we are sure to reach a state where at least k agents choose B. In such a state, all other agents find it optimal to choose B and to connect to agents choosing B (for the same reasons discussed above). This means that a state in S_BB can be reached with positive probability in the unperturbed dynamics, thus implying that CR(S_BB) ≤ k. Since R(S_BB) > CR(S_BB), we can conclude by Theorem 1 of Ellison (2000) that all stochastically stable states belong to S_BB. The last step is to show that, for any two states s = (a, g), s′ = (a′, g′) ∈ S_BB, there exists a sequence of states belonging to absorbing sets s_1, . . . , s_ℓ such that s_1 = s, s_ℓ = s′, and a single mutation is sufficient to move from s_i to s_{i+1} for i = 1, . . . , ℓ − 1. If agent i's strategy differs between s and s′, then a single mutation can change it from s_i to s′_i. The state reached in this way belongs to an absorbing set by Lemma 1, since all agents play B and the state is k-regular and type-segregated. With at most n such steps, we are sure to have reached state s′.
Therefore, S_BB is a mutation-connected component (in the words of Samuelson 1994), and we can apply Theorem 2 in that paper to conclude that all states in S_BB are stochastically stable.
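The key payoff comparison in the proof, worst-case all-mismatch play of B against the best alternative, can be checked numerically. The payoff values below are our own illustrative choices, constrained only by π(B, B) > π(A, A) > c:

```python
def worst_case_b(k, pi_bb, c, d):
    # k links, all of them to agents of the other type: k * (pi(B,B) - c - d)
    return k * (pi_bb - c - d)

def best_alternative(k, pi_aa, c):
    # upper bound on any other strategy: k same-type links coordinating on A
    return k * (pi_aa - c)

k, pi_aa, pi_bb, c = 3, 3.0, 4.0, 1.0
print(worst_case_b(k, pi_bb, c, d=0.5) > best_alternative(k, pi_aa, c))  # True: d < pi_bb - pi_aa
print(worst_case_b(k, pi_bb, c, d=1.5) > best_alternative(k, pi_aa, c))  # False: d > pi_bb - pi_aa
```

The comparison flips exactly at d = π(B, B) − π(A, A), which is why that threshold separates Propositions 4 and 5.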
Let us make a remark to better contrast our results with those in Staudigl and Weidenholzer (2014). In our model, the assumption that n_x ≥ n_y ≥ 2k + 1 implies that k < (n − 1)/2. In the model of Staudigl and Weidenholzer (2014) (see Theorem 1) this condition guarantees that the payoff-dominant convention is the stochastically stable outcome. In this respect, our Proposition 4 represents a robustness check of their result: The introduction of types that are not globally observable and that determine a penalty in case of a mismatch does not affect the long-run prediction in favor of the payoff-dominant convention, provided that the penalty for a mismatch in types is sufficiently low. The only difference between our prediction and theirs concerns the shape of the interaction network: We obtain that stochastically stable states are type-segregated, while this feature is clearly absent in Staudigl and Weidenholzer (2014). This is to be expected, especially in monomorphic states, as the interaction between agents of different types bears a cost.

Stochastic stability: high cost of mismatch in types
The prediction obtained by stochastic stability drastically changes when d is high. Surprisingly, neither the payoff-dominant convention nor the risk-dominant convention is selected in this case. Rather, we obtain that both actions will coexist in the long run.
The proof of the result relies on a tree surgery argument, exploiting the techniques by Young (1993).14

14 We are indebted to an anonymous referee who has provided a simple example where k = 1 and a single mutation is sufficient to leave S_AB and S_BA: Think of all agents of one type as located around a circle, with each agent linked only to the agent on his right; they are all playing A, and a mutation hits an agent so that his action changes from A to B, while his links remain the same; now the agent on his left, if given the possibility, would switch from A to B; after that, the agent on the left of the latter agent would also like to switch from A to B, and so on; this clockwise path of adjustments, which occurs with positive probability, ends in a state belonging to S_BB.

(a) there exists a sequence of absorbing sets Q_1, . . . , Q_ℓ, where Q_1 = Q and Q_ℓ = Q′, such that R(S_AB) mutations are required to move from Q_1 to Q_2, and a single mutation is sufficient to move from Q_i to Q_{i+1} for i = 2, . . . , ℓ − 1; (b) there also exists a sequence of absorbing sets Q′_1, . . . , Q′_ℓ, where Q′_1 = Q′ and Q′_ℓ = Q, such that R(S_BA) mutations are required to move from Q′_1 to Q′_2, and a single mutation is sufficient to move from Q′_i to Q′_{i+1} for i = 2, . . . , ℓ − 1.
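The cascade in the referee's example can be simulated directly. The sketch below assumes, as in that example, that an agent whose only neighbor plays B best-replies with B; the function and variable names are ours.

```python
def circle_cascade(n):
    """k = 1 best-reply cascade on a directed circle: agent i is linked only to
    the agent on his right, agent (i + 1) % n. All agents start at A; a single
    mutation flips agent 0 to B; revision opportunities then arrive clockwise
    (agents n-1, n-2, ...), each agent copying his neighbor's B."""
    actions = ["A"] * n
    actions[0] = "B"  # the single mutation
    for i in range(n - 1, 0, -1):
        # best reply to a neighbor playing B is assumed to be B here
        if actions[(i + 1) % n] == "B":
            actions[i] = "B"
    return actions

print(circle_cascade(8))  # the whole circle ends up playing B
```

One mutation plus zero-resistance adjustments thus carries the system into S_BB, which is why k ≥ 2 is needed in Proposition 5.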
Finally, the following lemma provides the last result to be used in the proof of Proposition 5: For any absorbing set belonging to S_AA ∪ S_BB, we can find a sequence of absorbing sets, each step of which has resistance 1, that originates from the absorbing set under consideration and reaches S_AB (or S_BA). We are now ready to state and prove our main result.

Proposition 5 If c < π(A, A), d > π(B, B)(n − 1)/n_y, and k ≥ 2, then every stochastically stable state is contained in S_BA ∪ S_AB. If, additionally, n_y > 2k + kd/(π(B, B) − max{π(A, A), π(A, B)}), then a state s is stochastically stable if and only if s ∈ S_BA ∪ S_AB.
Proof By Lemma 2 we know that S_AB and S_BA are two absorbing sets, and by nesting Lemma 4 with Lemma 5 and with Lemma 6, we can find, starting from any absorbing set, a path between absorbing sets that leads to S_AB and such that every step in the path involves only 1 mutation, except if the path goes through S_BA, which instead requires R(S_BA) mutations to leave. For every absorbing set, take one such path with the absorbing set as the starting point, and consider the arising directed graph; for every absorbing set that has more than one outgoing link, all such links but one are deleted. By doing so, we are able to construct an S_AB-tree over absorbing sets where every link between any two nodes involves only 1 mutation, except for the link outgoing from S_BA, which involves R(S_BA) mutations. We observe that 1 is the minimum number of mutations required to move between absorbing sets, and R(S_BA) is the minimum number of mutations required to exit S_BA, by Lemma 3. Therefore, the stochastic potential of S_AB is equal to ξ − 2 + R(S_BA), where ξ denotes the total number of absorbing sets. With an analogous reasoning we obtain that the stochastic potential of S_BA is equal to ξ − 2 + R(S_AB). Now we consider an absorbing set Q not contained in S_AB ∪ S_BA. We observe again that 1 is the minimum number of mutations required to move between absorbing sets, R(S_BA) is the minimum number of mutations required to exit S_BA, and R(S_AB) is the minimum number of mutations required to exit S_AB. Therefore, the stochastic potential of Q cannot be lower than ξ − 3 + R(S_AB) + R(S_BA), which is higher than the stochastic potentials of S_AB and S_BA, because R(S_AB) ≥ 2 and R(S_BA) ≥ 2 by Lemma 3. This allows us to conclude, by Theorem 2 in Young (1993), that every stochastically stable state is contained in S_BA ∪ S_AB. In case the additional assumption n_y > kd/(π(B, B) − π(B, A)) is satisfied, the stochastic potentials of S_AB and S_BA are equal, and hence we can conclude, again by Theorem 2 in Young (1993), that a state s is stochastically stable if and only if s ∈ S_BA ∪ S_AB.
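The comparison of stochastic potentials at the end of the proof is a simple arithmetic check; the numbers below are illustrative, constrained only by the bound R ≥ 2 from Lemma 3:

```python
def pot_s_ab(xi, r_ab, r_ba):
    return xi - 2 + r_ba   # S_AB-tree: xi - 2 unit edges plus the edge out of S_BA

def pot_s_ba(xi, r_ab, r_ba):
    return xi - 2 + r_ab   # symmetric count for the S_BA-tree

def pot_other_lb(xi, r_ab, r_ba):
    return xi - 3 + r_ab + r_ba  # any other root must pay both exit radii

xi, r_ab, r_ba = 20, 2, 3  # illustrative: xi absorbing sets, radii at least 2
print(pot_s_ab(xi, r_ab, r_ba), pot_s_ba(xi, r_ab, r_ba), pot_other_lb(xi, r_ab, r_ba))
# 21 20 22: the lower bound for any other absorbing set strictly exceeds both
```

The strict inequality holds whenever R(S_AB) ≥ 2 and R(S_BA) ≥ 2, since ξ − 3 + R(S_AB) + R(S_BA) > ξ − 2 + R(S_BA) reduces to R(S_AB) > 1.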
The results stated in Proposition 5 can be better understood if contrasted with the results provided in Anwar (2002) and Blume and Temzelides (2003). As mentioned in the Introduction, Anwar (2002) and Blume and Temzelides (2003) study models where agents interact at specific locations in the presence of restrictions to mobility; the main finding is that the risk-dominant and the payoff-dominant convention can coexist in the long run, provided that there is enough freedom of mobility across locations. However, mobility must not be too free (i.e., some agents are not allowed to go to the location that they prefer), otherwise only the payoff-dominant convention emerges, as shown by, e.g., Bhaskar and Vega-Redondo (2004). So, coexistence can be understood as the result of imperfections, or frictions, that make this case lie in between the absence of mobility (when the risk-dominant convention emerges in the long run) and full mobility (when the payoff-dominant convention emerges). On the contrary, in the present model the coexistence of distinct conventions is the consequence of a new effect that directly favors type-monomorphic states, rather than mediating between two extreme cases where the two conventions are globally adopted. Type-monomorphic states are indeed the only states where the information on types can be perfectly derived from the observation of actions, allowing a mutant to come back with certainty to interacting with agents of his own type; this makes type-monomorphic states relatively more resilient to mutations.

Network formation model
The model of network formation that we have considered is non-cooperative and asymmetric in both cost bearing and payoff flows (fundamentally, the one-way flow model in Bala and Goyal 2000). The assumption that connections are formed unilaterally can be realistic for some cases, while for other cases a cooperative model of network formation might be more appropriate, like that in Jackson and Watts (2002), where the consent of both agents involved is needed to form a link. Further, the assumption that payoffs flow unilaterally to the agent who has established the link, and who bears the maintenance cost, is another feature of our model that may fit some cases but not others. Both assumptions are also found in Staudigl and Weidenholzer (2014), which is the natural benchmark against which to compare our results. In their analysis, as in ours, the two assumptions described above allow a tractable treatment of a model that would otherwise have been far more complicated, and this, admittedly, is the main reason why we adopt them.
However, we do want to stress that the main intuition underlying our results does not depend crucially on the details of the network formation model. Indeed, it is a general observation that, when types are not observable outside one's own neighborhood and interactions with different types are costly, there emerges an implicit cost of adding a new neighbor. This cost is due to the risk of linking to an agent of a type different from one's own, and it can be large enough to prevent an agent who has been hit by a mutation from going back to his status quo. This allows paths made of single mutations to move the system from one absorbing set to another, making it rather easy to exit most absorbing sets, including the ones with the payoff-dominant convention and the risk-dominant convention. Similarly, we observe that there is only one situation where such a risk is totally absent: when there is perfect correlation between actions and types, i.e., in type-monomorphic states. Indeed, if we start from a type-monomorphic state, then going back to the status quo after a single mutation comes at no implicit cost and, hence, it is considerably more difficult to leave a type-monomorphic state. This is the core reason why the absorbing sets collecting type-monomorphic states are stochastically stable, and in this argument there is no substantial role played by the specific network formation model (even though the details can substantially affect transition periods and patterns).

Type unobservability and homophily
The ingredients which are crucial in the above reasoning relate to two distinct aspects: the unobservability of the types of agents with whom no link is currently maintained, and the magnitude of the cost that has to be incurred to interact with an agent of a different type. The fact that types are unobservable prior to interaction seems a plausible assumption in many situations, at least if we think of the type as a private piece of information which is learned only after interaction, possibly by inferring the type from the payoff earned (which seems natural if the type is a payoff-relevant characteristic).15 The cost to be incurred for a type mismatch can be seen as a form of homophily, i.e., as the result of preferences for interacting with one's own type. In this regard, we observe that our model exhibits homophily in the long run, and this is less obvious than it might appear at first because agents cannot directly choose to interact with agents of similar type, exactly because types are not observable prior to interaction. In this sense, the present paper can be seen as a marginal contribution to the recent literature on homophily in social interactions (see Currarini et al. 2009; Bramoullé et al. 2012): Social coordination plus weak homophily (i.e., d small enough) leads to the global adoption of the payoff-dominant convention (as in Staudigl and Weidenholzer 2014) with the additional feature that interactions take place only between agents of the same type (see Proposition 4 and comments thereafter), while if homophily is strong enough (i.e., d large enough), in the long run we observe segregation in both types and actions (see Proposition 5). As a final remark on the cost of type mismatch, we observe that such a cost can also be interpreted as part of the link maintenance cost.
With this interpretation, while a connection with an agent of the same type has a cost of c, a connection with an agent of a different type has a cost of c + d, where d measures some additional type-related cost of interaction (e.g., extra communication costs due to the use of a different language).
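Under this cost interpretation, an agent's interim utility can be written with the mismatch penalty folded into the maintenance cost of each cross-type link. A minimal sketch, with an invented payoff table:

```python
def utility(links, pi, c, d):
    """Sum over outgoing links of coordination payoff minus maintenance cost;
    a cross-type link costs c + d instead of c.
    links: list of (own_action, neighbor_action, same_type) triples."""
    return sum(pi[(a, b)] - (c if same else c + d) for a, b, same in links)

pi = {("A", "A"): 3, ("B", "B"): 4, ("A", "B"): 2, ("B", "A"): 1}  # illustrative
links = [("B", "B", True), ("B", "B", False)]  # one same-type, one cross-type mate
print(utility(links, pi, c=1.0, d=2.0))  # (4 - 1) + (4 - 1 - 2) = 4.0
```

The same-type link is worth d more than the cross-type one, which is the homophily wedge driving segregation in the long run.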

Lack of memory
There is another assumption of our model that is crucial for our results, as anticipated in the Introduction: the agents' lack of memory. The important consequence of the lack of memory is that agents who are hit by a mutation, and hence change their network of interactions, cannot choose to go back and connect with the same agents with whom they were previously connected. In fact, if this were possible, there would be no risk of a type mismatch when an agent is hit by a mutation and considers going back to his previous status quo, i.e., undoing what the mutation has done. So, without this assumption our results would not be warranted. However, we think that assuming agents' lack of memory does fit some relevant cases. For instance, it fits the case where the effects of a mutation persist over several periods and the mutants receive the opportunity to readjust their strategy only after some time, by which point old neighbors might have become untraceable or unrecognizable. In general, we can think of the lack of memory as due not only to agents' cognitive bounds, but also to the actual technology of interactions, which might make it impossible to retain with certainty access to old neighbors (for instance, this is quite widespread in social interactions over the internet). Most importantly, as stressed in the Introduction, what is really required for our main results is a weak version of the lack of memory, i.e., a positive probability that agents are not able to trace back their old neighbors once they disconnect from them.16

Extensions
We end this discussion with a couple of remarks concerning the coordination game. First, we observe that the mechanism behind our results does not rest crucially on the fact that one action is payoff-dominant and the other action is risk-dominant. What is important in order to establish that absorbing sets made of type-monomorphic states are more resilient to mutations, i.e., that at least two mutations are required to leave these absorbing sets (see Lemma 3), is that, once a mutation has turned a single agent from A to B, the agents who are connected to him must not find it profitable to switch from A to B. In our model this condition is implied by the fact that A is risk-dominant (we recall also that in S_AB and S_BA every agent maintains k links with agents of his own type). However, if action B were both payoff-dominant and risk-dominant, and provided that at least 2 out of k neighbors must play B for playing B to be better than playing A, then stochastic stability would still select, under the assumptions of Proposition 5, states where agents of one type coordinate on B and agents of the other type coordinate on A.
Moreover, the inspection of the mechanism driving our results makes us confident that similar conclusions hold in more general coordination games and with more than 2 types of agents. Our belief is that, in any coordination game with a number of actions greater than or equal to the number of types, if the cost of interacting with any type different from one's own is large enough, then different types will coordinate on different actions. Indeed, states where at least one action is played by agents of two different types admit paths leading to new absorbing sets that are made of steps involving a single mutation, on the grounds of the same intuition used in this paper: Going back to the status quo after a mutation involves the risk of a type mismatch for such agents. But this does not hold in states where types and actions are perfectly correlated, and hence such states are more resilient to mutations; of course, the existence of states where types and actions are perfectly correlated requires the existence of at least as many actions as types.
Another dimension along which our results could be extended concerns the error model, even if a more careful analysis is required because not all mutations receive the same weight once we abandon the uniform error model.17 We think that these observations reinforce the main message that can be drawn from our contribution: In a setting where a population of agents has to form interactions and coordinate on some action, if agents differ in some unobservable characteristic and interactions between agents with dissimilar characteristics are costly enough, then actions will end up playing the role of signals, allowing the formation of clusters of agents who are type-homogeneous, each cluster coordinating on a different action.

Proof of Proposition 1
Proof Suppose that agent i is given a revision opportunity at time t + 1, and suppose ad absurdum that there exists a strategy s′_i = (a′_i, g′_i) that maximizes the interim utility of i and tells him to maintain the link with agent j, i.e., g′_ij = 1. We construct another strategy s″_i = (a″_i, g″_i) such that a″_i = a′_i and g″_i is equal to g′_i with the only difference that in g″_i the link with agent j is removed and a new link is formed with an agent ℓ such that a^t_ℓ = a^t_j. We observe that the assumption that n(w_i, a^t_j | s^t) ≥ k + 1 implies that there exists at least one agent having the same type as i and choosing the same action as j. This in turn implies (1) that a new link can actually be formed with an agent ℓ choosing a^t_ℓ = a^t_j, and (2) that the overall change in utility for i from playing s″_i instead of s′_i is strictly positive: This is so because the change of utility due to the play of the social game with all neighbors is trivially equal to zero (since a^t_ℓ = a^t_j and all other neighbors have remained the same), while the change in the expected number of different-type neighbors is negative (since there is a positive probability that ℓ has the same type as i, while agent j is known for sure to be of a different type).
We have proven that u_i(s″_i, s^t) > u_i(s′_i, s^t), and this suffices to show that agent i cannot choose a strategy such that g^{t+1}_ij = 1.

Proof of Proposition 2
Proof Let us suppose, ad absurdum, that a strategy s′_i = (a′_i, g′_i) such that g′_ij = 0 maximizes i's interim utility. We first observe that g′_i cannot tell i to maintain fewer than k links, because otherwise i could increase his utility by simply adding the link with j, obtaining an additional utility of π(a′_i, a^t_j), which is surely positive because of the assumption that c < π(B, A), which implies that every payoff in the social game is positive even after subtracting the maintenance cost. Therefore, g′_i tells i to maintain k links, which in turn implies that a new link with some agent ℓ has been formed, since the link with j has been removed. We construct another strategy s″_i = (a″_i, g″_i) such that a″_i = a′_i and g″_i is equal to g′_i with the only difference that in g″_i the link with agent j is maintained and the link with agent ℓ is not formed. We now argue that u_i(s″_i, s^t) > u_i(s′_i, s^t). We first note that the expected number of mismatches in types is lower with s″_i than with s′_i, because w(i) = w(j) for sure, while ℓ's type is different from i's type with positive probability: Indeed, if a^t_ℓ = a^t_j, this follows from the assumption that n_i0(¬w(i), a^t_j | s^t) ≥ 1, while if a^t_ℓ ≠ a^t_j (and hence a^t_ℓ = a^t_i), it follows from the assumption that n_i0(¬w(i), a^t_i | s^t) ≥ 1. Therefore, if i obtains the same or a larger utility in the social game by interacting with j than with ℓ, then we have obtained that u_i(s″_i, s^t) > u_i(s′_i, s^t). The only case where i obtains a larger utility by interacting with ℓ than with j is if a^t_j ≠ a′_i and a^t_ℓ = a′_i. Even in such a case, the assumption on d guarantees that the expected cost of the type mismatch outweighs the gain in the social game. We can conclude that no strategy s′_i such that g′_ij = 0 can maximize i's interim utility, and this means that i will surely maintain the link with j if he has a revision opportunity at time t + 1 (and no change of course happens if i is not given a revision opportunity).

Proof of Proposition 3
Proof We first prove point (a). Consider a state s that is A-monomorphic, k-regular, and type-segregated. We check that an agent who receives a revision opportunity at s would see his utility reduced if he changed strategy. Indeed, having fewer than k links is suboptimal, since π(A, A) > c. Moreover, removing a link and casting a new one brings a neighbor who still plays A (since the state is A-monomorphic) but is possibly of a different type, hence generating an expected loss. Finally, switching from A to B is clearly detrimental, due to π(A, A) > π(B, A). Therefore we can conclude that no agent will ever change strategy, and hence state s belongs to a singleton absorbing set. An analogous reasoning can be made for a state that is B-monomorphic, k-regular, and type-segregated, where π(B, B) > c holds because π(B, B) > π(A, A).
We now prove point (b). Consider a state s that is type-monomorphic with x on A and y on B, k-regular and type-segregated. Take an agent of type x who receives a revision opportunity. Maintaining fewer than k links is suboptimal for him, since π(A, A) > c and, in addition, all agents playing A are of type x, so that there is no risk of a type mismatch. Replacing an existing link with a new one has no effect on utility if the new link is cast toward an agent who currently plays A, since all agents playing A are of type x and hence there is no risk of a type mismatch. Replacing an existing link with a new one has a negative effect on expected utility if the new link is cast toward an agent playing B, since all agents currently playing B are of type y and d > π(B, B) − π(A, A), which means that the penalty for the type mismatch is larger than the maximum attainable gain. Finally, switching from A to B without changing neighbors is detrimental because of π(A, A) > π(B, A). A similar argument holds a fortiori if we consider an agent of type y who receives a revision opportunity. We can conclude that any revising agent will at most reshuffle his links among agents playing his same action, who are surely of his same type (due to the perfect correlation between actions and types). Therefore, starting from s we can only reach other states that are type-monomorphic with x on A and y on B, k-regular and type-segregated. This shows that an absorbing set exists containing type-monomorphic states with x on A and y on B. Clearly, if we swap x and y, the same reasoning applies to type-monomorphic states with x on B and y on A.
Then, we prove point (c). Consider a state s where k + 1 agents of type x play B, k + 1 agents of type y play B, and all other agents play A; also, s is k-regular and type-segregated (which is possible, since n_x ≥ n_y ≥ 2k + 2). Any agent playing B will never change strategy, because he is attaining the maximum possible payoff [i.e., k(π(B, B) − c)], which is not reachable if he switches to A, and changing neighbors comes with the risk of a type mismatch, since some agents currently playing B are of type x and some are of type y. Any agent playing A will never change his strategy either. Indeed, the maximum gain which can be obtained by removing an existing link and connecting to someone playing B is π(B, B) − π(A, A), which is lower than the expected cost of a type mismatch (equal to d/2, since n(x, B|s) = k + 1 = n(y, B|s)) because of the assumption that d > 2(π(B, B) − π(A, A)). Moreover, removing an existing link and connecting to someone playing A brings no benefit and an expected cost due to a possible type mismatch; deleting any of the k links is suboptimal, since π(A, A) > c; and switching from A to B without changing neighbors is detrimental because of π(A, A) > π(B, A). Therefore, state s belongs to a singleton absorbing set.
Finally, we prove point (d). Consider a state s where agent i of type y plays A and maintains no link, while all other agents play B and maintain k links toward agents of their same type, other than i. Any agent other than i will never change strategy, because he is attaining the maximum possible payoff [i.e., k(π(B, B) − c)], which is not reachable if he switches to A, and changing neighbors comes with the risk of a type mismatch, because agents of both types currently play B. If agent i chooses to connect toward an agent playing B, he will earn at most π(B, B), but he has to suffer an expected cost of type mismatch equal to dn x /(n − 1), which is larger than π(B, B) due to the assumption that d > π(B, B)(n − 1)/n x . If agent i is isolated, then he is indifferent between playing A and B. Therefore, we have found an absorbing set made of the states s and s′, where s′ is identical to s except that agent i plays B instead of A.
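A quick numerical sketch of the last comparison (the parameter values are ours, purely for illustration): agent i faces n − 1 potential partners, all playing B, of whom n x are of type x, so his expected payoff from a single link is π(B, B) − dn x /(n − 1), which is negative exactly when d exceeds π(B, B)(n − 1)/n x .

```python
def expected_link_payoff(pi_BB, d, n_x, n):
    # Expected payoff for the isolated type-y agent i from linking to a
    # uniformly random other agent: every candidate plays B, and n_x of the
    # n - 1 candidates are of type x, i.e., a mismatch for agent i.
    return pi_BB - d * n_x / (n - 1)

# Illustrative parameters: the sign of the payoff flips at the threshold.
pi_BB, n_x, n_y = 1.0, 6, 5
n = n_x + n_y
threshold = pi_BB * (n - 1) / n_x
assert expected_link_payoff(pi_BB, 1.01 * threshold, n_x, n) < 0
assert expected_link_payoff(pi_BB, 0.99 * threshold, n_x, n) > 0
print("agent i strictly prefers isolation when d > pi(B,B)(n-1)/n_x")
```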

Proof of Lemma 1
Proof The requirement that s is B-monomorphic is trivially a necessary condition for s ∈ S B B . We first show that, given (1), if (2) and (3) do not also hold, then with positive probability we reach a state where (1), (2) and (3) all hold. Suppose that a i = B for all i ∈ N , but s is not k-regular and/or not type-segregated. We observe that, for a revising agent, choosing action A would clearly be suboptimal. Moreover, the expected payoff of forming a link with a new neighbor who plays B is both higher than not forming that link at all [because π(B, B) − π(A, A) > d and π(A, A) > c imply π(B, B) − c > d] and higher than maintaining an existing connection with a type different from one's own (because the resulting match cannot be worse, and may be better). Therefore, with positive probability any agent who has less than k links and/or links with agents of a type different from his own will form new links with agents who play B, and with positive probability these new agents are of his own type.
We now show that a state satisfying (1), (2) and (3) forms a singleton absorbing set. To do so, it is enough to observe that any agent who receives a revision opportunity would see his payoff decreased, in expectation, by changing strategy. Indeed, by choosing action A, the agent would obtain a utility that is surely lower than his current utility k(π(B, B) − c), and the same is true if he chooses to maintain less than k links; also, substituting an existing link with a new one comes with the risk of linking to an agent of a type different from one's own, which leads to a lower expected payoff.
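The implication used in the proof of Lemma 1, namely that π(B, B) − π(A, A) > d and π(A, A) > c jointly imply π(B, B) − c > d, can be confirmed by a brute-force sampling check (an illustrative sketch of ours, not part of the paper):

```python
import random

def worst_case_link_value(pi_BB, c, d):
    # Expected payoff of a new link to a B-player under the worst case of a
    # certain type mismatch: pi(B,B) - c - d.
    return pi_BB - c - d

random.seed(1)
for _ in range(10_000):
    c = random.uniform(0.0, 1.0)
    pi_AA = c + random.uniform(1e-6, 1.0)           # pi(A,A) > c
    d = random.uniform(0.0, 1.0)
    pi_BB = pi_AA + d + random.uniform(1e-6, 1.0)   # pi(B,B) - pi(A,A) > d
    # Even under a certain mismatch the link has positive value, so
    # pi(B,B) - c > d on every draw satisfying the two premises.
    assert worst_case_link_value(pi_BB, c, d) > 0
print("pi(B,B) - c > d holds on all sampled draws")
```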

Proof of Lemma 2
Proof We provide the proof for point (a) only, as the proof for point (b) is almost identical to the proof of point (a).
By definition, if s ∈ S A B , then s is type-monomorphic with x on A and y on B, so (1) is trivially necessary. We first show that, given (1), if (2) and (3) do not also hold, then with positive probability we reach a state where (1), (2) and (3) all hold. Suppose that s is type-monomorphic with x on A and y on B, but not k-regular and/or not type-segregated. Consider a revising agent currently playing B. We observe that choosing A is clearly suboptimal since, at most, he can obtain k(π(A, A) − c − d) < 0 because, by (1), all agents playing A are of a type different from his own and, by assumption, d > π(B, B)(n − 1)/n y > π(A, A). Moreover, the expected payoff of forming a link with a new neighbor who plays B is the same as that of keeping an existing link with a neighbor who also plays B [because, again by (1), B is played only by agents of his same type], it is strictly greater than the expected payoff of forming a link with an agent who plays A [because, by (1), A is played only by agents of a different type], and it is strictly greater than not forming that link at all [because π(B, B) > π(A, A) and π(A, A) > c imply π(B, B) − c > 0]. Consider now a revising agent currently playing A. We observe that choosing B is suboptimal since, at most, he can obtain k(π(B, B) − c − d) < 0 because, by (1), all agents playing B are of a type different from his own and, by assumption, d > π(B, B)(n − 1)/n y > π(B, B).
Moreover, the expected payoff of forming a link with a new neighbor who plays A is the same as that of keeping an existing link with a neighbor who also plays A [because, by (1), A is played only by agents of his same type], it is strictly greater than the expected payoff of forming a link with an agent who plays B [because, by (1), B is played only by agents of a different type and, by assumption, d > π(B, B)(n − 1)/n y > π(A, B), so that π(A, B) − d < 0 < π(A, A)], and it is strictly greater than not forming that link at all (because π(A, A) − c > 0). Therefore, with positive probability any agent who plays B (respectively, A) and has less than k links and/or links with agents of a type different from his own will form new links with agents who play B (respectively, A) up to k connections in total, and with certainty these new agents are of his own type.
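The bound k(π(B, B) − c − d) < 0 invoked for both deviating agents follows from the assumption on d in two steps, using only n − 1 = n x + n y − 1 ≥ n y and c > 0 (the same chain gives k(π(A, A) − c − d) < 0, since π(A, A) < π(B, B)); spelling it out:

```latex
d > \pi(B,B)\,\frac{n-1}{n_y} \ge \pi(B,B)
  \quad (\text{since } n-1 = n_x + n_y - 1 \ge n_y)
\qquad\Longrightarrow\qquad
k\bigl(\pi(B,B) - c - d\bigr) < k\bigl(\pi(B,B) - c - \pi(B,B)\bigr) = -kc < 0 .
```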
We now show that the set of states satisfying (1), (2) and (3) forms an absorbing set. To do so, we first observe that, by the same arguments described above, any agent who receives a revision opportunity would see his payoff certainly decreased by changing action, and/or by choosing to maintain less than k links, and/or by linking to new agents who play a different action [since, by (1), they must be of a different type]. Therefore, if we start from a state where conditions (1), (2) and (3) are satisfied, we will always remain in states where those conditions are satisfied. We finally show that, taken any two distinct states s and s′ satisfying (1), (2) and (3), we can move from one to the other with positive probability. Indeed, s = (a, g) and s′ = (a′, g′) can only differ because g i ≠ g′ i for some agent i; every such agent can receive with positive probability a revision opportunity, and he can choose with positive probability to reshuffle all his links as long as they are cast toward agents choosing his own action, since by (1) there is no risk of forming a link with an agent of a type different from one's own.

Proof of Lemma 3
Proof We first show that 1 mutation is not sufficient to move from S A B to another absorbing set. Consider a state s ∈ S A B , and suppose that a single mutation hits an agent, possibly changing both his action and his network of interactions. Suppose that an agent different from the mutant is given a revision opportunity. We claim that such an agent will not change action and will not form new links with agents choosing an action different from his own. To see why this is so, we observe five facts. First, forming a new link with an agent who is currently playing a different action is suboptimal, as the expected payoff is negative due to the high penalty for a mismatch in type [the expected payoff from such a link is at most π(B, B) − d n y /(n y + 1), which is negative due to the assumption that d > π(B, B)(n − 1)/n y ]. Second, if the mutant switched from A to B, then any revising agent who is maintaining a connection with the mutant will not switch to B, since he has k − 1 other neighbors playing A: by keeping A he gets (k − 1)π(A, A) + π(A, B), which exceeds the payoff π(B, B) + (k − 1)π(B, A) from switching (the inequality holding because k ≥ 2 and A is the risk-dominant action). Third, if instead the mutant switched from B to A, then any neighboring revising agent will not switch to A, since he can keep playing B, remove the link with the mutant and form a new link with another agent playing B (who exists because n y ≥ 2k + 1), which gives him kπ(B, B) − kc > π(A, A) + (k − 1)π(A, B) − kc. Fourth, changing action is clearly suboptimal for an agent who is not maintaining a connection with the mutant.
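The second fact is where the risk dominance of A enters; writing risk dominance as π(A, A) − π(B, A) > π(B, B) − π(A, B), the inequality (k − 1)π(A, A) + π(A, B) > π(B, B) + (k − 1)π(B, A) for k ≥ 2 can be verified by random sampling (the function and the parameter draws are illustrative constructions of ours, not the paper's):

```python
import random

def stay_with_A_margin(pi_AA, pi_AB, pi_BA, pi_BB, k):
    # Payoff from keeping A against k-1 A-neighbors and one B-neighbor (the
    # mutant), minus the payoff from switching to B against the same neighbors.
    keep_A = (k - 1) * pi_AA + pi_AB
    switch_to_B = (k - 1) * pi_BA + pi_BB
    return keep_A - switch_to_B

random.seed(2)
for _ in range(10_000):
    pi_BA = random.uniform(0.0, 1.0)
    pi_AA = pi_BA + random.uniform(0.001, 1.0)   # pi(A,A) > pi(B,A)
    pi_AB = random.uniform(0.0, 1.0)
    # A risk-dominant: pi(A,A) - pi(B,A) > pi(B,B) - pi(A,B).
    pi_BB = pi_AB + (pi_AA - pi_BA) * random.uniform(0.0, 0.999)
    k = random.randint(2, 10)
    assert stay_with_A_margin(pi_AA, pi_AB, pi_BA, pi_BB, k) > 0
print("keeping A is strictly better on all sampled draws with k >= 2")
```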
Finally, we observe that when the mutant is given a revision opportunity, he will certainly choose to have k links toward agents playing the same action he was playing before the mutation, because only by doing so can he avoid paying the cost d > π(B, B), given that all other agents' actions are perfectly correlated with their types; it follows that the mutant will also choose to play the action he was playing before the mutation, since this allows him to coordinate. From these five observations it follows that, after a single mutation has occurred, the system will surely go back to a state where conditions (1), (2) and (3) hold. We now show that R(S A B ) and R(S B A ) can be computed considering only mutations that hit agents of a single type, either x or y.
We start by providing a sufficient condition for m mutations hitting agents of type x to be enough for the system to leave S A B with positive probability. By Lemma 2, we know that S A B is a single absorbing set. So we can choose a specific state in S A B to start from, and in particular we can choose a state where there exists a cluster made of k + 1 agents of type x (i.e., g i j = 1 for any distinct i and j in the cluster). If m mutations hit m distinct agents in the cluster, inducing them to switch action from A to B (with no change to their interaction networks), and if inequality (2) is satisfied, then all non-mutants in the cluster who receive a revision opportunity will find it optimal to choose action B while not changing their interaction networks. So, with positive probability the system reaches a new state that belongs to an absorbing set different from S A B (indeed, at least the k + 1 agents of type x in the cluster will never go back to action A). We also observe that other agents of type x might then find it profitable to switch to action B, and that all agents of type y will keep playing action B. Finally, we note that (2) is surely satisfied when m = k.
Suppose now that, starting from a state s ∈ S A B , m x mutations hit agents of type x and m y mutations hit agents of type y, so that a state s′ is reached from which another absorbing set can be reached with positive probability. Since we know that (2) is satisfied when m = k, in the following we focus on the case where m x + m y < k.
We first observe that at least one of the two inequalities (3) and (4) must be satisfied in s′. To see why, suppose that neither (3) nor (4) is satisfied. Then, no agent of type x who is given a revision opportunity finds it profitable to play action B [due to the failure of (3)], and no agent of type y who is given a revision opportunity finds it profitable to play action A [due to the failure of (4)]. For the mutants this is true a fortiori, since each of them can interact with a smaller number of mutants, being himself one of the mutants. Hence, sooner or later all agents of type x will go back to playing A and all agents of type y will go back to playing B. Once we have perfect correlation between types and actions, it is clear [because of the assumption that d > π(B, B)(n − 1)/n y ] that revising agents will choose to maintain k links with agents choosing their same action (and hence having their same type). Therefore, if neither (3) nor (4) is satisfied, then from s′ no other absorbing set can be reached. We now show that, under the assumption that n y > 2k + kd/(π(B, B) − max{π(A, A), π(A, B)}), inequality (4) is false. To see this, it is enough to rewrite (4) with < instead of ≥ and to obtain an explicit bound on n y (getting rid of m x by making the new inequality harder to satisfy). Therefore, inequality (3) must hold.
We then observe that, if inequality (3) holds, then inequality (2) is implied by the assumption that n y > 2k + kd/(π(B, B) − max{π(A, A), π(A, B)}). To see this, we fix m = m x + m y , we take the difference between the left-hand side of (2) and the left-hand side of (3), and we require it to be larger than the difference between the right-hand side of (2) and the right-hand side of (3). Working out this inequality, we obtain a bound on n x that is implied by n y > 2k + kd/(π(B, B) − max{π(A, A), π(A, B)}) (once it is noted that n y < n x ). This means that, if m x mutations hitting agents of type x and m y mutations hitting agents of type y allow the system to leave S A B with positive probability, then m x + m y mutations hitting agents of type x only are also sufficient for the system to leave S A B with positive probability. We can repeat all the previous arguments with S B A in the place of S A B ; the only difference is that m x and m y have inverted roles, and the same occurs for n x and n y . Summing up, if n y > 2k + kd/(π(B, B) − max{π(A, A), π(A, B)}), then we are allowed to focus on mutations hitting agents of one type only when determining R(S A B ) and R(S B A ). Finally, we focus on mutations hitting only one type of agents, and we determine R(S A B ) and R(S B A ). We consider a state s ∈ S A B , and we recall that m mutations hitting agents of type x and inducing them to switch from A to B are enough to leave S A B with positive probability if inequality (2) is satisfied. We remark that this inequality is also necessary for such an exit from S A B . Indeed, it is immediate to observe that, if (2) is not satisfied, then no agent of type x who is given a revision opportunity will find it profitable to play action B, and agents of type y will clearly keep playing action B.
Furthermore, once a mutant goes back to action A, the gain (potentially negative) from choosing B over A for agents of type x is further reduced, while agents of type y never find it profitable to play A over B. Sooner or later, perfect correlation between types and actions will be restored, and a k-regular and type-segregated state will be reached, belonging to S A B . We denote by m the minimum number of mutations such that inequality (2) is satisfied, and we note that m ≤ k.
We now consider mutations hitting agents of type y who switch from B to A. As long as at least k + 1 agents of type y keep choosing B, the mutants will sooner or later go back to B (due to the perfect correlation between B and y, and the fact that B is the payoff-dominant action). Since n y ≥ 2k + 1, this means that at least k + 1 mutations hitting agents of type y are required to leave S A B . Since m ≤ k, we can conclude that R(S A B ) = m. We can repeat exactly the same arguments with S B A in the place of S A B , thus obtaining the corresponding value of R(S B A ).

Proof of Lemma 4
Proof The proof begins by showing that, starting from a generic state s ∈ Q, another state ŝ can be reached with positive probability that satisfies properties that help in the subsequent construction of a path from ŝ to an absorbing set. Preliminarily, we define β x (s) ⊆ N x as the set of agents of type x who, at state s, are playing A and have at least one best reply strategy where either action B is played, or a new link toward an agent currently playing B is cast, or both. Similarly, we define α y (s) as the set of agents of type y who, at state s, are playing B and have at least one best reply strategy where either action A is played, or a new link toward an agent currently playing A is cast, or both. We now show how the system can move with positive probability from state s to a state ŝ where β x (ŝ) = ∅ and α y (ŝ) = ∅.
Starting from state s, with positive probability all and only the agents in β x (s) ∪ α y (s) will receive a revision opportunity and will choose a best reply strategy where either action B (respectively, action A) is played, or a new link toward an agent currently playing B (respectively, A) is cast, or both. Call s′ the state reached after these updates.
If β x (s′) = ∅ and α y (s′) = ∅, then we are done; otherwise, we iterate the updating process, giving revision opportunities to all and only the agents in β x (s′) ∪ α y (s′). We observe that this iteration will yield a state ŝ with β x (ŝ) = ∅ and α y (ŝ) = ∅ in a finite number of repetitions. This is so because agents who switch from A to B (respectively, from B to A) definitively exit the set β x [respectively, the set α y ], and agents who cast a new link toward an agent currently playing B (respectively, A) can be part of the set β x [respectively, the set α y ] at most k times (since k is the maximum number of links that each agent can maintain).
Let us denote by N x A (ŝ) the set of agents of type x who are playing action A at state ŝ, and by N y B (ŝ) the set of agents of type y who are playing action B at state ŝ. The proof now proceeds by considering four possible cases concerning the emptiness/non-emptiness of the sets N x A (ŝ) and N y B (ŝ). For each case, we construct the needed sequence of states from ŝ ∈ Q to a state in Q that is either monomorphic or type-monomorphic.
Case 1 Suppose first that N x A (ŝ) ≠ ∅ and N y B (ŝ) ≠ ∅. We apply the following path-building procedure.
Consider a single mutation that hits an agent j of type x who is playing action B at ŝ. If no such agent exists, we are done. Otherwise, suppose that after the mutation agent j copies the strategy ŝ i = (â i , ĝ i ) of an agent i ∈ N x A (ŝ); in particular, j will adopt strategy s′ j = (a′ j , g′ j ) such that a′ j = â i , g′ jh = ĝ ih for every h ≠ i, j, and g′ ji = ĝ i j . We now observe that three properties hold for every agent h ∈ N x A (s′): agent h has no best reply where (1) action B is chosen, or (2) a new link toward an agent playing B is cast, or (3) an existing link toward an agent of type x playing A is removed, unless he is certain to find an agent of type x when casting a new link toward an agent playing A. Properties (1) and (2) come from the fact that agents belonging to N x A (ŝ) have, by construction, no best reply where action B is chosen or a new link toward an agent choosing B is cast; the same holds a fortiori for the same agents at state s′ (where agent j has switched from B to A), and it also holds for agent j, who is copying agent i after the mutation. Property (3) comes from the simple observation that, given the optimality of choosing A, it cannot be optimal to remove a link toward an agent of type x playing A unless one is certain to find an agent of type x when casting a new link toward an agent playing A.
For similar reasons, analogous properties hold for the agents belonging to N y B (s′) (= N y B (ŝ)): no agent of type y who is playing B has a best reply where action A is chosen, or a new link toward an agent playing A is cast, or an existing link toward an agent of type y playing B is removed, unless he is certain to find an agent of type y when casting a new link toward an agent playing B.
The above three properties imply that any state s″ that is reachable with positive probability in the next period of the unperturbed dynamics is such that, for every agent h ∈ N x A (s″): (1.) the number of h's neighbors of type x choosing action B has not increased, i.e., n hx B (s″) ≤ n hx B (ŝ); (2.) the number of h's neighbors of type x choosing action A has not decreased, i.e., n hx A (s″) ≥ n hx A (ŝ); (3.) the probability of a mismatch for a new link toward an agent choosing action B has not decreased. Analogous inequalities, of course appropriately adjusted, hold for the agents belonging to N y B (s″). Altogether, these inequalities imply that, for the agents in N x A (s″) ∪ N y B (s″), the three properties holding at state s′ also hold at state s″. By induction, we can conclude that the same properties will hold forever, and hence an absorbing set must be reached where the number of agents of type x playing A never falls below n x A (s′) = n x A (ŝ) + 1, and the number of agents of type y playing B never falls below n y B (s′) = n y B (ŝ).
Starting from any state s in this absorbing set, and following the reasoning developed at the beginning of the proof for state s, a state ŝ′ where β x (ŝ′) = ∅ and α y (ŝ′) = ∅ can be reached with positive probability. At state ŝ′, there exist at least n x A (ŝ) + 1 agents of type x playing A, and at least n y B (ŝ) agents of type y playing B. Then, following the above argument, a single mutation allows the system to reach another absorbing set where the number of agents of type x playing A is at least n x A (ŝ) + 2, and the number of agents of type y playing B is at least n y B (ŝ). Given the finiteness of the set N x , an absorbing set must eventually be reached where all agents of type x play A and the number of agents of type y playing B is at least n y B (ŝ). This completes the procedure.
The same path-building procedure can now be repeated, constructing a sequence of absorbing sets, with each step requiring a single mutation, where the minimum number of agents of type y playing B increases by at least 1 at each step, while the minimum number of agents of type x playing A always remains n x . Given the finiteness of the set N y , at the end of this procedure a state is reached where all agents of type x play A and all agents of type y play B; at such a state, all agents find it optimal to have exactly k links with agents choosing the same action, and hence will end up having k connections with agents of the same type; by Lemma 2, we know that the absorbing set S A B has been reached. Case 2 We now suppose that N x A (ŝ) = ∅ and N y B (ŝ) ≠ ∅. We apply the procedure described above to agents in N y B (ŝ), thus obtaining that a single mutation per step is sufficient to move along a sequence of absorbing sets, where the number of agents of type y playing B increases by at least 1 at each step, until all agents of type y play B. Starting from any state s in the absorbing set that has been reached, and following the reasoning developed at the beginning of the proof for state s, a state ŝ′ where β x (ŝ′) = ∅ and α y (ŝ′) = ∅ can be reached with positive probability. We know for sure that N y B (ŝ′) = N y . If N x A (ŝ′) ≠ ∅, then we can apply the path-building procedure to agents in N x A (ŝ′) and reason analogously to what was done for Case 1, thus reaching the absorbing set S A B . If instead N x A (ŝ′) = ∅, then all agents of type x are playing B at state ŝ′. The only way some agents of type x can be indifferent between playing A and playing B is that they are isolated (an agent i is isolated if g i j = 0 for all j ∈ N ). If they want to cast new links with agents choosing B, with positive probability they will do so and will be lucky enough to link to agents of type x. These agents then strictly prefer B over A.
If more than one agent of type x remains isolated, then all such agents can jointly switch from B to A with positive probability; in the subsequent period, these agents will find it optimal to connect among themselves, since playing A now implies being of type x and hence there is no risk of a type mismatch; this leads with positive probability to a state ŝ″ where β x (ŝ″) = ∅, α y (ŝ″) = ∅, and N x A (ŝ″) ≠ ∅. Then, the path-building procedure described in Case 1 can be applied starting from ŝ″, and the absorbing set S A B is eventually reached. Finally, if at most one agent of type x is isolated and indifferent between A and B, then a single mutation can hit such an agent and let him connect with other agents of type x playing B, so that an absorbing set belonging to S B B is reached.
Case 3 Suppose that N x A (ŝ) ≠ ∅ and N y B (ŝ) = ∅. This case runs as Case 2, with the roles of x and y reversed and, when only one agent of type y is isolated, leading to an absorbing set belonging to S A A . Case 4 Finally, we consider the case in which N x A (ŝ) = ∅ and N y B (ŝ) = ∅. All agents of type x find it optimal to choose B and to have k links toward agents playing B (who are surely of type x), while all agents of type y find it optimal to choose A and to have k links toward agents playing A (who are surely of type y). The absorbing set S B A is thus necessarily reached.

Proof of Lemma 5
Proof Suppose we start from S A B . As shown in the proof of Lemma 3, after the formation of a cluster of k + 1 agents of type x (which happens with positive probability starting from any s ∈ S A B ), it is enough to have R(S A B ) mutations hitting the agents in such a cluster (in particular, making them switch from action A to action B while keeping their interaction networks fixed) to move the system to a state from which, with positive probability, a new absorbing set Q is reached where at least those k + 1 agents of type x choose B, and all agents of type y keep choosing B.
If Q ⊆ S B B , we are done. Otherwise, consider a single mutation hitting an agent of type x who currently plays A, making him choose action B and cast all his connections toward agents choosing action B. With positive probability, the mutant casts all his links toward agents of type x (who, by construction, are at least k + 1). This leads the system either to S B B or to another absorbing set where the number of agents of type x playing B has increased by at least 1, while all agents of type y keep playing B. By repeating this argument, S B B is surely reached within a finite number of steps, each of which requires one mutation only.
The same reasoning can be applied to S B A in the place of S A B , completing the proof.

Proof of Lemma 6
Proof We show in the following that, starting from a state s ∈ S A A ∪ S B B , we can reach S A B following a path of absorbing sets such that a single mutation is sufficient to move from one absorbing set to its successor in the path. The same arguments can be repeated for S B A instead of S A B , completing the proof. Suppose that we are in a state s ∈ S B B . Suppose also that a single mutation hits an agent, say i, of type x, making him switch to action A and sever all his links. Call this new state s′. Since d > π(B, B)(n − 1)/n y , at s′ agent i does not want to cast new links toward agents playing B, so all states that are reachable with positive probability from s′ with one round of revision opportunities are such that i maintains no links. Moreover, if i is the only isolated agent of type x and no other agent wants to switch to action A, then s′ must belong to an absorbing set; otherwise, we reach a new state s″ belonging to a new absorbing set, where either i forms links with other isolated agents of type x who (with positive probability) switch to play A, or some agents currently maintaining a link toward i switch to action A. With a further single mutation, another agent of type x who is currently playing B can be made to switch to A, sever all his current links, and connect only to agents of type x who are playing A; this leads to a new state, belonging to a new absorbing set, where the number of agents of type x playing A has increased. We can iterate this last step until we reach some state in S A B .
Suppose now that s ∈ S A A . We can apply an argument similar to the one just described (with the only difference that mutations affect agents of type y) and draw an analogous conclusion.