4. Graph based formulas

The most interesting benchmark formulas have a graph structure. See the following example, where cnfformula.TseitinFormula() is realized over a star graph with five arms.

>>> import cnfformula
>>> import networkx as nx
>>> G = nx.star_graph(5)
>>> G.edges()
>>> [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)]
>>> F = cnfformula.TseitinFormula(G,charges=[0,1,1,0,1,1])
>>> F.is_satisfiable()
(True,
 {'E_{0,1}': True,
  'E_{0,2}': True,
  'E_{0,3}': False,
  'E_{0,4}': True,
  'E_{0,5}': True})

Tseitin formulas can be really hard for if the graph has large edge expansion. Indeed the unsatisfiable version of this formula requires exponential running time in any resolution based SAT solver [1].

In the previous example the structure of the CNF was a simple undirected graph, but in CNFgen we have formulas built around four different types of graphs.

simple simple graph default graph
bipartite bipartite graph vertices split in two inpedendent sets
digraph directed graph each edge has an orientation
dag directed acyclic graph no cycles, edges induce a partial ordering

To manipulate graphs CNFgen does not reinvent the wheel, but uses the famous NetworkX library behind the scene. NetworkX supports reading and writing graph from/to files, and CNFgen leverages on that. Furthermore CNFgen implements graph I/O in few other file formats. The function cnfformula.graphs.supported_formats() lists the file formats available for each graph type.

>>> from cnfformula.graphs import supported_formats
>>> supported_formats()
{'bipartite': ['matrix', 'gml', 'dot'],
 'dag': ['kthlist', 'gml', 'dot'],
 'digraph': ['kthlist', 'gml', 'dot', 'dimacs'],
 'simple': ['kthlist', 'gml', 'dot', 'dimacs']}

The dot format is from Graphviz and it is available only if the optional pydot2 python package is installed in the system. The Graph Modelling Language (GML) gml is the current standard in graph representation. The DIMACS (dimacs) format [2] is used sometimes for programming competitions or in the theoretical computer science community. The kthlist and matrix formats are defined and implemented inside CNFgen.

Note

More information about the supported graph file formats is in the User Documentation for the cnfgen command line too. In particular there is a description of kthlist and matrix formats.

4.1. Directed Acyclic Graphs and Bipartite Graphs

NetworkX does not have a specific data structure for a directed acyclic graphs (DAGs). In CNFgen a DAG is any object which is either a networkx.DiGraph or networkx.MultiDiGraph instance, and which furthermore passes the test cnfformula.graphs.is_dag().

>>> import networkx as nx
>>> G = nx.DiGraph()
>>> G.add_cycle([1,2,3])
>>> cnfformula.graphs.is_dag(G)
False
>>> H = nx.DiGraph()
>>> H.add_path([1,2,3])
>>> cnfformula.graphs.is_dag(H)
True

In the same way NetworkX does not have a particular data structure for bipartite graphs ( networkx.Graph and networkx.MultiGraph are possible choices), but it follows the convention that all vertices in the graph have the bipartite attribute that gets values \(\{0,1\}\) [3]. The CNF formula constructions that make use of bipartite graphs usually associate the \(0\) part as the left side, and the \(1\) part to the right side. The function cnfformula.graphs.has_bipartition() tests whether this bipartition exists in a graph.

>>> import networkx as nx
>>> G=nx.bipartite.random_graph(3,2,0.5)
>>> cnfformula.graphs.has_biparition(G)
True
>>> G.node
{0: {'bipartite': 0},
 1: {'bipartite': 0},
 2: {'bipartite': 0},
 3: {'bipartite': 1},
 4: {'bipartite': 1}}
>>> G.edges()
[(0, 4), (1, 3), (1, 4), (2, 3)]
>>> F = cnfformula.GraphPigeonholePrinciple(G)
>>> list(F.variables())
['p_{0,4}', 'p_{1,3}', 'p_{1,4}', 'p_{2,3}']

4.2. Graph I/O

The cnfformula.graphs module implements a graph reader cnfformula.graphs.readGraph and a graph writer cnfformula.graphs.writeGraph to facilitate graph I/O. .. Both readGraph and writeGraph operate either on filenames, encoded as str or unicode, or otherwise on file-like objects such as

>>> import sys
>>> import networkx as nx
>>> from cnfformula.graphs import readGraph, writeGraph
>>> G = nx.bipartite.random_graph(3,2,0.5)
>>> writeGraph(G,sys.stdout,graph_type='bipartite',file_format='gml')
strict graph "fast_gnp_random_graph(3,2,0.5)" {
 0    [bipartite=0];
     4        [bipartite=1];
     0 -- 4;
     1        [bipartite=0];
     3        [bipartite=1];
     1 -- 3;
     2        [bipartite=0];
     2 -- 4;
}
>>> from StringIO import StringIO
>>> buffer = StringIO("graph X { 1 -- 2 -- 3 }")
>>> G = readGraph(buffer, graph_type='simple', file_format='dot')
>>> G.edges()
[('1', '2'), ('3', '2')]

There are several advantages with using those functions, instead of the reader/writer implemented NextowrkX. First of all the reader always verifies that when reading a graph of a certain type, the actual input actually matches the type. For example if the graph is supposed to be a DAG, then cnfformula.graphs.readGraph() would check that.

>>> buffer = StringIO('digraph A { 1 -- 2 -- 3 -- 1}')
>>> readGraph(buffer,graph_type='dag',file_format='dot')
[...]
ValueError: Input graph must be acyclic

When the file object has an associated file name, it is possible to omit the file_format argument. In this latter case the appropriate choice of format will be guessed by the file extension.

>>> with open("example.dot","w") as f: print >> f , "digraph A {1->2->3}"
>>> G = readGraph("example.dot",graph_type='dag')
>>> G.edges()
[('1', '2'), ('2', '3')]
>>> with open("example.gml","w") as f: print >> f , "digraph A {1->2->3}"
>>> G = readGraph("example.gml",graph_type='dag',file_format='dot')
>>> G.edges()
[('1', '2'), ('2', '3')]
>>> with open("example.gml","w") as f: print >> f , "digraph A {1->2->3}"
>>> G = readGraph("example.gml",graph_type='dag')
ValueError: [Parse error in GML input] expected ...

4.3. Graph generators

Note

See the documentation of the module cnfformula.graphs for more information about the CNFgen support code for graphs.

4.4. References

[1]A. Urquhart. Hard examples for resolution. Journal of the ACM (1987) http://dx.doi.org/10.1145/48014.48016
[2]Beware. Here we are talking about the DIMACS format for graphs, not the DIMACS file format for CNF formulas.
[3]This convention is describe in http://networkx.readthedocs.org/en/latest/reference/algorithms.bipartite.html