1 Supplementary Material

1.1 Proof of Proposition 1
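
As a concrete companion to the construction used in the proof in this section, the following is a minimal Python sketch of the two directions of the transformation. It is an illustration under stated assumptions, not the paper's implementation: the class `CNode`, the functions `dtree_to_ctree`, `ctree_to_dtree` and `yield_of`, the dictionary representation of the modifier sets, and the toy four-word example are all hypothetical. It assumes that modifiers are attached bottom-up along the spine in the order given for each head, that the bottom of each spine is the word itself, and that the input d-tree is projective, so that children can be ordered by word position.

```python
from dataclasses import dataclass, field


@dataclass
class CNode:
    label: str   # constituent label Z_k, or a word placeholder at the bottom of a spine
    head: int    # lexical head h (1-based word position)
    children: list = field(default_factory=list)


def dtree_to_ctree(root, mods):
    """Build a binary c-tree from a strictly-ordered d-tree.

    mods[h] lists the pairs (m_k, Z_k) for head h, already sorted by the
    order on the modifiers of h (first attached modifier first).
    """
    def build(h):
        node = CNode(f"w{h}", h)              # bottom node of the spine of h
        for m, z in mods.get(h, []):
            sub = build(m)
            # Assumption: a modifier to the left of h becomes the left child.
            kids = [sub, node] if m < h else [node, sub]
            node = CNode(z, h, kids)          # one new spine node per modifier
        return node
    return build(root)


def ctree_to_dtree(tree):
    """Invert the construction: read the ordered arcs back off each spine."""
    mods = {}

    def visit(node):
        if not node.children:
            return
        first, second = node.children
        # The child sharing the node's lexical head continues the spine.
        spine, mod = (first, second) if first.head == node.head else (second, first)
        visit(spine)                          # lower attachments come first
        mods.setdefault(node.head, []).append((mod.head, node.label))
        visit(mod)

    visit(tree)
    return tree.head, mods


def yield_of(node):
    """Propagate yields bottom-up: the word positions dominated by a node."""
    if not node.children:
        return [node.head]
    return [w for child in node.children for w in yield_of(child)]


# Hypothetical toy d-tree over four words, rooted at word 3.
example_mods = {3: [(4, "VP"), (2, "S")], 2: [(1, "NP")]}
example_tree = dtree_to_ctree(3, example_mods)
```

In this sketch, the head with K modifiers gives rise to a spine of K + 1 nodes, and applying `ctree_to_dtree` to `example_tree` gives back the root and the ordered modifier lists that were passed in, which mirrors the invertibility claim.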

We will show that, given an arbitrary strictly-ordered d-tree $D$, we can apply an invertible transformation that turns it into a binary c-tree $C$, and vice versa. Let $D$ be given. We visit each node $h \in \{1, \ldots, L\}$ and split it into $K + 1$ nodes, where $K = |M_h|$, organized as a linked list, as Figure 3 illustrates (this will become the spine of $h$ in the c-tree). For each modifier $m_k \in M_h$, with $m_1 \prec_h \cdots \prec_h m_K$, we move the tail of the arc $\langle h, m_k, Z_k \rangle$ to the $(K + 1 - k)$th node of the linked list and assign the label $Z_k$ to this node, letting $h$ be its lexical head. Since the incoming and outgoing arcs of the linked-list component are the same as those of the original node $h$, the tree structure is preserved. After doing this for every $h$, we add the leaves and propagate the yields bottom-up. It is straightforward to show that this procedure yields a valid binary c-tree. Since no information is lost (the orders $\prec_h$ are implied by the order of the nodes in each spine), the construction can be inverted to recover the original d-tree.

Conversely, if we start with a binary c-tree, traverse the spine of each $h$, and attach the modifiers $m_1 \prec_h \cdots \prec_h m_K$ in order, we obtain a strictly-ordered d-tree (this procedure is also invertible).

We need to show that (i) Algorithm 1, when applied to a continuous c-tree $C$, retrieves a head-ordered d-tree $D$ that is projective and has the nesting property, and (ii) conversely, Algorithm 2, when applied to such a d-tree, produces a continuous c-tree.

To see (i), note that the projectivity of $D$ is ensured by the well-known result of Gaifman (1965) on the projection of continuous trees. To show that $D$ satisfies the nesting property, note that nodes higher in the spine of a word $h$ always receive modifiers that lie farther from $h$ (otherwise edges in $C$ would cross, which cannot happen for a continuous $C$).

To prove (ii), we use induction. We need to show that every c-node created in Algorithm 2 has a contiguous span as its yield. The base case (line 3) is trivial.
Therefore, it suffices to show that in line 8, assuming the yields of (the current) ψ(h) and each …