Other Things to Download: [System Code], [Datasets]

Applications

To download the applications, please check this page. Every application program takes two input arguments: the first is the input data path on HDFS, and the second is the number of threads to run in each process (machine). See this page for how to run G-thinker applications.
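As a rough illustration of the argument convention described above, the following sketch parses the two arguments in a standalone C++ helper. The names AppArgs and parse_args are hypothetical and are not part of the G-thinker code; the actual applications may read their arguments differently.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical holder for the two arguments described in the text.
struct AppArgs {
    std::string input_path;  // first argument: input data path on HDFS
    int num_threads;         // second argument: threads per process (machine)
};

// Parses argv[1] as the input path and argv[2] as the thread count.
// Throws std::invalid_argument if an argument is missing or malformed.
inline AppArgs parse_args(int argc, const char* const argv[]) {
    assert(argv != nullptr);
    if (argc < 3) {
        throw std::invalid_argument(
            "usage: <app> <hdfs_input_path> <threads_per_process>");
    }
    AppArgs args;
    args.input_path = argv[1];

    // strtol reports where parsing stopped, so trailing garbage is detectable.
    char* end = nullptr;
    const long n = std::strtol(argv[2], &end, 10);
    if (end == argv[2] || *end != '\0' || n <= 0 || n > 1000000) {
        throw std::invalid_argument("thread count must be a positive integer");
    }
    args.num_threads = static_cast<int>(n);
    return args;
}
```

A main function would typically call parse_args(argc, argv) first and pass the two fields on to the application's own setup code.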

Triangle Counting

The GitHub link for this application is here. It takes an undirected unlabeled graph (see this page for the input data format), and counts the number of triangles in it.
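To make the problem concrete, here is a minimal single-threaded sketch of triangle counting on an undirected unlabeled graph. It is not the G-thinker implementation: it assumes vertices are numbered 0..n-1 and that the graph is given as an in-memory edge list, and the function name count_triangles is chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Counts triangles in an undirected graph with vertices 0..n-1.
// Every edge is oriented from its lower ID to its higher ID, so a triangle
// u < v < w is counted exactly once: at edge (u, v), when w is found in both
// higher[u] and higher[v].
inline std::uint64_t count_triangles(
    std::size_t n,
    const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    std::vector<std::vector<std::size_t>> higher(n);
    for (const auto& e : edges) {
        assert(e.first < n && e.second < n);
        if (e.first == e.second) continue;  // ignore self-loops
        higher[std::min(e.first, e.second)].push_back(
            std::max(e.first, e.second));
    }
    // Sort each list and drop duplicate edges so intersections are linear.
    for (auto& adj : higher) {
        std::sort(adj.begin(), adj.end());
        adj.erase(std::unique(adj.begin(), adj.end()), adj.end());
    }
    std::uint64_t total = 0;
    for (std::size_t u = 0; u < n; ++u) {
        for (std::size_t v : higher[u]) {
            // Merge-style intersection of the two sorted lists.
            auto a = higher[u].begin(), a_end = higher[u].end();
            auto b = higher[v].begin(), b_end = higher[v].end();
            while (a != a_end && b != b_end) {
                if (*a < *b) ++a;
                else if (*b < *a) ++b;
                else { ++total; ++a; ++b; }
            }
        }
    }
    return total;
}
```

For example, the complete graph on four vertices contains four triangles, while a simple path contains none.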

Maximum Clique Finding

The GitHub link for this application is here. It takes an undirected unlabeled graph (see this page for the input data format), and reports a maximum clique. This application splits a graph into smaller subgraphs if the graph has more than 400,000 vertices, so that the subgraphs can be distributed to other threads (including those on other machines) for processing.
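The idea of splitting a large task into smaller ones can be sketched in a few lines of single-threaded C++. This is only an illustration under stated assumptions, not the application's actual code: the graph is a toy adjacency matrix, a task is split when its candidate set exceeds split_threshold (playing the role of the 400,000-vertex limit), and the pending queue stands in for distributing tasks to other threads. All names (Task, make_graph, max_clique, and so on) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<bool>>;  // adjacency matrix, toy sizes only
using Vertices = std::vector<std::size_t>;

// A task: a clique found so far plus the vertices that can still extend it.
struct Task {
    Vertices clique;
    Vertices candidates;
};

inline Graph make_graph(
    std::size_t n,
    const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    Graph adj(n, std::vector<bool>(n, false));
    for (const auto& e : edges) {
        assert(e.first < n && e.second < n);
        if (e.first == e.second) continue;  // ignore self-loops
        adj[e.first][e.second] = true;
        adj[e.second][e.first] = true;
    }
    return adj;
}

// Candidates after position i that are adjacent to candidates[i].
inline Vertices neighbors_after(const Graph& adj, const Vertices& candidates,
                                std::size_t i) {
    Vertices out;
    for (std::size_t j = i + 1; j < candidates.size(); ++j)
        if (adj[candidates[i]][candidates[j]]) out.push_back(candidates[j]);
    return out;
}

// Serial branch-and-bound over one task.
inline void expand(const Graph& adj, Vertices& clique,
                   const Vertices& candidates, Vertices& best) {
    if (clique.size() > best.size()) best = clique;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        // Bound: even taking every remaining candidate cannot beat best.
        if (clique.size() + (candidates.size() - i) <= best.size()) return;
        clique.push_back(candidates[i]);
        expand(adj, clique, neighbors_after(adj, candidates, i), best);
        clique.pop_back();
    }
}

inline Vertices max_clique(const Graph& adj, std::size_t split_threshold) {
    Vertices best;
    Vertices all(adj.size());
    for (std::size_t v = 0; v < all.size(); ++v) all[v] = v;

    std::deque<Task> pending;  // stand-in for a distributed task queue
    pending.push_back(Task{{}, all});
    while (!pending.empty()) {
        Task task = std::move(pending.front());
        pending.pop_front();
        if (task.clique.size() > best.size()) best = task.clique;
        if (task.clique.size() + task.candidates.size() <= best.size()) continue;
        if (task.candidates.size() > split_threshold) {
            // Too large: emit one smaller child task per candidate vertex.
            for (std::size_t i = 0; i < task.candidates.size(); ++i) {
                Task child{task.clique, neighbors_after(adj, task.candidates, i)};
                child.clique.push_back(task.candidates[i]);
                pending.push_back(std::move(child));
            }
        } else {
            expand(adj, task.clique, task.candidates, best);
        }
    }
    return best;
}
```

Calling max_clique with a small threshold forces splitting even on a toy graph, which makes it easy to check that the split and unsplit searches agree on the clique size.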

Graph Matching

The GitHub link for this application is here. It takes an undirected labeled graph (see this page for the input data format), and counts the number of its subgraphs that match the query graph shown below.

Maximal Quasi-Clique Mining

The GitHub link for this application is here. This application code should be used together with our new G-thinker version; please see our PVLDB 2020 paper here for details.

Single-Threaded Programs for Comparison

The traditional single-threaded algorithms for these problems are available on GitHub here. Each folder there corresponds to one application, where run.cpp contains the main function, and a toy graph is provided for your testing.