Concurrency Theory: Lecture 12, 08 March 2018
---------------------------------------------

Recall:

  Let t = (E, <=, lab) be a trace, lab : E --> Act

  max_i(t)         = maximum i-event in t
  delta_i(t)       = i-view of t = {e | e <= max_i(t)}
  latest_{i->j}(t) = max_j(delta_i(t))

----------

Problem:

  Suppose each process i incrementally maintains local information
  about latest_{i->k}(t) for every process k.

  When i and j meet, they need to decide, for every process k, which
  of latest_{i->k}(t) and latest_{j->k}(t) is later.

  Constraints: Finite-state, maintain only a bounded amount of
  information.

----------

Difficulty:

  - Comparing local timestamps relies on the actual values of the
    timestamps. Counter values are unbounded, so the number of bits
    needed to maintain counters and timestamps is also unbounded.

  - Finite state => bounded memory => bounded set of labels, which
    must be reused. We need to fix the order between labels
    dynamically, according to context. (For unbounded numbers, the
    order between labels is fixed statically, based on the value of
    the number.)

----------

Comparing labels without relying on values:

  Suppose i and j synchronize on an action after t.

  Events in delta_i(t) union delta_j(t) divide into three sets:

  1. E_common = delta_i(t) intersect delta_j(t) : both see these events
  2. E_i      = delta_i(t) \ delta_j(t)         : only i sees these events
  3. E_j      = delta_j(t) \ delta_i(t)         : only j sees these events

  Note that each pair of events e in E_i and f in E_j is independent.

  We want to compare e_ik = latest_{i->k}(t) and e_jk = latest_{j->k}(t).

  Three cases are possible:

  A. e_ik in E_i, e_jk in E_common     : e_ik is later than e_jk
  B. e_jk in E_j, e_ik in E_common     : e_jk is later than e_ik
  C. e_ik and e_jk both in E_common    : e_jk = e_ik

  We cannot have e_ik in E_i and e_jk in E_j, because all k-events
  are linearly ordered, but E_i is independent of E_j.

  If we can compute whether e_ik and e_jk are inside or outside
  E_common, we are done.
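The case analysis A/B/C above can be illustrated with a minimal Python sketch. It assumes the two views are available as explicit sets of events, which the processes themselves cannot afford to store; the function name compare_latest and the event names in the example are hypothetical and serve only to make the three cases concrete.

```python
def compare_latest(view_i, view_j, e_ik, e_jk):
    """Return "i", "j" or "equal": which process holds the later k-event.

    view_i, view_j stand in for delta_i(t) and delta_j(t) (sets of events);
    e_ik, e_jk stand in for latest_{i->k}(t) and latest_{j->k}(t).
    """
    e_common = view_i & view_j   # events seen by both i and j
    e_i = view_i - view_j        # events seen only by i
    e_j = view_j - view_i        # events seen only by j

    if e_ik in e_i and e_jk in e_common:
        return "i"               # case A: e_ik is later
    if e_jk in e_j and e_ik in e_common:
        return "j"               # case B: e_jk is later
    if e_ik in e_common and e_jk in e_common:
        if e_ik != e_jk:
            raise ValueError("two distinct latest k-events inside E_common")
        return "equal"           # case C: same event
    # e_ik in E_i together with e_jk in E_j cannot arise in a real trace.
    raise ValueError("input does not come from a consistent trace")


# Hypothetical example: i has seen one more k-event (k2) than j has.
view_i = {"c", "k1", "k2"}
view_j = {"c", "k1", "j1"}
result = compare_latest(view_i, view_j, "k2", "k1")
```

Only membership in E_common is tested here; no timestamp values are compared, which is the point of the case analysis.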
Lemma: Every maximal element in E_common is a primary event for both
       i and j.

  Write latest_i for the set of primary events of i, that is,
  {latest_{i->k}(t) | k a process}.

  Therefore, the set of events that belong to both latest_i and
  latest_j includes all the maximal events in E_common.

  - Any event in latest_i below such a maximal event lies in E_common
  - Any event in latest_i above such a maximal event lies in E_i

  Given this, each process i now maintains latest_i as a partial
  order, called the primary graph, rather than just an indexed list
  (array) of primary events.

Informal algorithm

  1. i and j scan primarygraph_i and primarygraph_j and mark all
     events whose labels appear in both graphs.

  2. For any other k, check the positions of e_ik = latest_{i->k}(t)
     and e_jk = latest_{j->k}(t) with respect to the marked events.
     This tells us which of the following holds:

     A. e_ik in E_i, e_jk in E_common     : e_ik > e_jk
     B. e_jk in E_j, e_ik in E_common     : e_jk > e_ik
     C. e_ik and e_jk both in E_common    : e_jk = e_ik

  3. We collect the later of e_ik and e_jk for each k. We also have
     to put these together as a new primary graph:

     a. If (e,f) both come from i, inherit the edge from primarygraph_i
     b. If (e,f) both come from j, inherit the edge from primarygraph_j
     c. If e comes from i and f comes from j, then e is in E_i, f is
        in E_j and they must be unordered, so there is no edge

  Notice that we need only equality of labels to perform this
  comparison; the actual values are unimportant. So we can use labels
  without assuming any static ordering relation.

  - To reuse labels, we should ensure that labels are used
    consistently across all primary graphs, that is, equal labels
    always denote the same event.

  - Each primary graph has up to N events. There are N processes, so
    at most N^2 labels are ever "in use" at any time.

  - In principle, if we have N^2 + 1 labels, we always have one that
    is free to use.

  Question: how can the processes that synchronize on an action a
  accurately check which labels are being used by other processes
  not involved in a?
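Steps 1 to 3 of the informal algorithm above can be sketched in Python as follows. This is only an illustration under stated assumptions: a primary graph is represented by a dictionary latest (process -> label) plus a set below of label pairs (x, y) meaning x < y, assumed transitively closed; every process has an entry in both graphs; and the new synchronization event itself is not added. The name merge_primary and this representation are not from the notes.

```python
def merge_primary(latest_i, below_i, latest_j, below_j):
    """Merge two primary graphs using only equality of labels."""
    # Step 1: mark labels that appear in both graphs.
    marked = set(latest_i.values()) & set(latest_j.values())

    def in_common(e, below):
        # e lies in E_common iff it is marked or below some marked event.
        return e in marked or any((e, m) in below for m in marked)

    latest, source = {}, {}
    # Step 2: classify each pair e_ik, e_jk as case A, B or C.
    for k in latest_i:
        e_ik, e_jk = latest_i[k], latest_j[k]
        i_common = in_common(e_ik, below_i)
        j_common = in_common(e_jk, below_j)
        if i_common and j_common:        # case C: same event
            latest[k], source[e_ik] = e_ik, {"i", "j"}
        elif j_common:                   # case A: e_ik is later
            latest[k], source[e_ik] = e_ik, {"i"}
        elif i_common:                   # case B: e_jk is later
            latest[k], source[e_jk] = e_jk, {"j"}
        else:
            raise ValueError("e_ik in E_i and e_jk in E_j is impossible")

    # Step 3: inherit an edge only when both endpoints come from one graph.
    below = set()
    for who, edges in (("i", below_i), ("j", below_j)):
        for (x, y) in edges:
            if x in source and y in source and who in source[x] and who in source[y]:
                below.add((x, y))
    return latest, below


# Hypothetical trace: event "0" on all of i, j, k; then "x" on j, k; then "y" on i, k.
latest_i, below_i = {"i": "y", "j": "x", "k": "y"}, {("x", "y")}
latest_j, below_j = {"i": "0", "j": "x", "k": "x"}, {("0", "x")}
merged_latest, merged_below = merge_primary(latest_i, below_i, latest_j, below_j)
```

Events that fall under case C are treated as coming from both graphs, so edges between them and events of E_i (or E_j) are still inherited; edges between an event only from i and an event only from j are never created, matching rule c.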
----------------------------------------------------------------------

Recycling labels

  - Event labels are used across primary graphs to determine the
    maximal events in the intersection.

  - If two labels are the same, they must refer to the same event.

  - If {i,j} meet to execute an action a, the current event needs a
    label (a,l) that is not in use in any other agent's primary graph.

----------------------------------------------------------------------

Secondary information

  - i's best information about j's primary information:
    latest_{j->k} computed with respect to latest_{i->j}(t)

  - Write this as latest_{i->j->k}(t)

  Observe that secondary information is "inherited" from the
  comparison of primary information.

  Suppose latest_{i->k} is older than latest_{j->k}. Then, every
  event of the form latest_{i->k->k'} must be older than the
  corresponding event latest_{j->k->k'}.

  Hence, when i updates its primary information for k from j, it
  also copies all of j's secondary information for k.

  Note that this does not involve comparing labels at the secondary
  level. The original comparison of labels in the primary graph
  suffices to update both primary and secondary information.

----------------------------------------------------------------------

Lemma: Suppose e is an i-event that is in the primary graph of some
       other process j. Then, e is also a secondary event for i.

Proof: There are paths from e to max_i(t) and max_j(t). Consider the
       path from e to max_j(t). This path leaves the intersection of
       delta_i(t) and delta_j(t) at some point. Let e',e" be
       consecutive events on this path such that e' is in the
       intersection and e" is outside the intersection.

       e' and e" are labelled by dependent letters, so there is some
       process k that takes part in both. Clearly,
       e' = latest_{i->k}(t), because e' is in the intersection and
       the next k-event, e", is not.

       Since e <= e' and e is the latest i-event in delta_j(t), which
       contains e', e is the latest i-event below e'. Hence,
       e = latest_{i->k->i}(t).
Corollary: If e is an i-event and e does not appear in the secondary
           information of i, then e does not appear in the primary
           graph of any other process j.

----------------------------------------------------------------------

This gives us an algorithm with bounded labels for maintaining
primary information.

Since each process has at most n^2 secondary events and there are n
processes overall, we need at most n^3 + 1 labels to ensure that each
new event can be labelled consistently.

  - When i and j synchronize, they choose a label for the new event
    that is not in their secondary information.

  - They then compare their primary graphs and update their primary
    and secondary information.

======================================================================
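The two steps of the synchronization can be sketched in Python as below. This is a rough illustration, not the lecture's definitive construction: labels are taken to be the integers 0..n^3 (the notes use pairs (a,l); only the l component is modelled), secondary information is a nested dictionary secondary[k][k'] standing for latest_{i->k->k'}(t), and the argument later records, for each k, the outcome of the primary-graph comparison. The names fresh_label and inherit_secondary are hypothetical.

```python
def fresh_label(n, secondary_i, secondary_j):
    """Pick a label from a pool of n^3 + 1 that neither i nor j has in use."""
    in_use = set()
    for secondary in (secondary_i, secondary_j):
        for row in secondary.values():
            in_use.update(row.values())
    for label in range(n ** 3 + 1):      # assumed pool of n^3 + 1 integer labels
        if label not in in_use:
            return label
    raise RuntimeError("no free label in the pool")


def inherit_secondary(primary_i, secondary_i, primary_j, secondary_j, later):
    """Update i's information from j wherever j's k-event is the later one.

    later[k] is "i", "j" or "equal", as decided on the primary graphs;
    no labels are compared at the secondary level.
    """
    for k, who in later.items():
        if who == "j":
            primary_i[k] = primary_j[k]
            secondary_i[k] = dict(secondary_j[k])   # copy j's whole row for k


# Hypothetical data for two processes p and q (n = 2).
secondary_i = {"p": {"p": 0, "q": 1}, "q": {"p": 0, "q": 1}}
secondary_j = {"p": {"p": 0, "q": 2}, "q": {"p": 0, "q": 2}}
primary_i = {"p": 0, "q": 1}
primary_j = {"p": 0, "q": 2}

new_label = fresh_label(2, secondary_i, secondary_j)
inherit_secondary(primary_i, secondary_i, primary_j, secondary_j,
                  {"p": "equal", "q": "j"})
```

The label is chosen before the update, using only the secondary information of the two synchronizing processes, which is what the corollary above justifies.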