How to Attack the NP-complete Dag Realization Problem in Practice

We study the following fundamental realization problem of directed acyclic graphs (dags). Given a sequence $S:={a_1 \choose b_1},\dots,{a_n \choose b_n}$ with $a_i,b_i\in \mathbb{Z}_0^+$, does there exist a dag (no parallel arcs allowed) with labeled vertex set $V:=\{v_1,\dots,v_n\}$ such that for all $v_i\in V$ the indegree and outdegree of $v_i$ match exactly the given numbers $a_i$ and $b_i$, respectively? Recently this decision problem has been shown to be NP-complete by Nichterlein [1]. However, we can show that several important classes of sequences can be solved efficiently. In previous work [2], we proved that yes-instances always have a special kind of topological order, which in most cases allows us to drastically reduce the number of topological orderings that have to be considered. This leads to an exact exponential-time algorithm which significantly improves upon a straightforward approach. Moreover, a combination of this exponential-time algorithm with a special strategy gives a linear-time algorithm. Interestingly, in systematic experiments we observed that this linear-time heuristic solves the vast majority of all instances. This motivates us to develop characteristics like dag density and "distance to provably easy sequences" which indicate how easy or difficult it is to realize a given sequence. Furthermore, we propose a randomized algorithm which exploits our structural insight on topological sortings and uses a number of reduction rules. We compare this algorithm with other straightforward randomized algorithms and observe that it clearly outperforms all other variants. Another striking observation is that our simple linear-time algorithm solves, without any exception, a set of real-world instances from different domains, namely ordered binary decision diagrams (OBDDs), train and flight schedules, as well as instances derived from food-web networks.
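To make the problem statement concrete, the following minimal Python sketch checks whether a given arc set is a dag realization of a sequence and, for very small instances, searches for one by exhaustive enumeration. It is only an illustration of the problem definition and of a naive baseline. It is not the algorithm from [2], nor the linear-time or randomized algorithms mentioned above. The function names `realizes` and `brute_force_realization` are hypothetical, vertices are numbered from 0, and each pair is read as (indegree, outdegree), that is $(a_i, b_i)$.

```python
from itertools import combinations, permutations


def realizes(arcs, seq):
    """Return True if `arcs` is a dag realization of `seq`.

    `seq[i]` is the pair (a_i, b_i) = (indegree, outdegree) of vertex i.
    `arcs` is a list of pairs (u, v) meaning an arc from u to v.
    """
    n = len(seq)
    arc_set = set(arcs)
    if len(arc_set) != len(arcs):          # parallel arcs are not allowed
        return False
    if any(u == v for u, v in arc_set):    # a loop would already be a cycle
        return False

    indeg = [0] * n
    outdeg = [0] * n
    succ = [[] for _ in range(n)]
    for u, v in arc_set:
        outdeg[u] += 1
        indeg[v] += 1
        succ[u].append(v)

    # Degrees must match the prescribed numbers exactly.
    if any((indeg[i], outdeg[i]) != tuple(seq[i]) for i in range(n)):
        return False

    # Acyclicity test: repeatedly remove vertices of indegree zero.
    remaining = indeg[:]
    stack = [i for i in range(n) if remaining[i] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in succ[u]:
            remaining[v] -= 1
            if remaining[v] == 0:
                stack.append(v)
    return removed == n


def brute_force_realization(seq):
    """Naive exhaustive search; only feasible for very small n.

    Tries every vertex ordering as a candidate topological order and
    every set of forward arcs of the right size. Returns a list of
    arcs, or None if no dag realization exists.
    """
    n = len(seq)
    m = sum(a for a, _ in seq)
    if m != sum(b for _, b in seq):        # indegrees and outdegrees must balance
        return None
    for order in permutations(range(n)):
        forward = [(order[i], order[j])
                   for i in range(n) for j in range(i + 1, n)]
        for arcs in combinations(forward, m):
            if realizes(list(arcs), seq):
                return list(arcs)
    return None
```

For example, `brute_force_realization([(0, 2), (1, 1), (2, 0)])` finds the transitive dag on three vertices, whereas `brute_force_realization([(1, 1), (1, 1)])` returns `None`, because two vertices that each need one incoming and one outgoing arc would have to form a cycle. The enumeration over all orderings is exactly the kind of blow-up that a restricted class of topological orders is meant to avoid.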