Author: Xiaolei Li
Sampled data is included in the code link.
This is the original implementation of
High-Dimensional OLAP: A Minimal Cubing Approach
Xiaolei Li, Jiawei Han, Hector Gonzalez.
Proceedings of the 30th International Conference on Very Large Data
Bases (VLDB 2004), Toronto, Canada, August 2004.
Xiaolei Li (email@example.com)
A Makefile is included. Run "make" to compile the program.
An executable named "frag" should be created under /bin/.
No Windows binaries are included.
The Frag-Cubing program is an online program. Given a base table,
fragment size, and minsup as input parameters, the program generates
the shellfrags and awaits user queries at a prompt.
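As a rough illustration of what "generating the shellfrags" involves, the
sketch below follows the description in the paper: the dimensions are split
into disjoint fragments, and for every cuboid inside a fragment each cell
keeps the list of tuple IDs that fall into it. This is not the code from this
package. The function name, the grouping of consecutive dimensions, and the
dictionary layout are all assumptions made for the example.

```python
from itertools import combinations

def build_shell_fragments(table, fragment_size, minsup=1):
    """Index every cuboid of every fragment as {dims: {values: tuple_ids}}.

    Hypothetical helper: `dims` is a tuple of dimension positions taken from
    one fragment, `values` is the matching tuple of dimension values, and
    cells holding fewer than `minsup` tuples are dropped.
    """
    if not table:
        return {}
    num_dims = len(table[0])
    # Assumption: consecutive dimensions are grouped into the same fragment.
    fragments = [tuple(range(start, min(start + fragment_size, num_dims)))
                 for start in range(0, num_dims, fragment_size)]
    index = {}
    for fragment in fragments:
        for k in range(1, len(fragment) + 1):
            for dims in combinations(fragment, k):
                cells = {}
                for tid, row in enumerate(table):
                    key = tuple(row[d] for d in dims)
                    cells.setdefault(key, set()).add(tid)
                # Keep only cells that satisfy the minimum support.
                index[dims] = {key: tids for key, tids in cells.items()
                               if len(tids) >= minsup}
    return index

# Made-up 4-dimensional base table; each row is one tuple.
table = [
    (4, 2, 7, 1),
    (4, 3, 7, 1),
    (4, 2, 5, 1),
    (5, 2, 7, 1),
    (4, 2, 7, 2),
]
index = build_shell_fragments(table, fragment_size=2, minsup=2)
```

With a fragment size of 2, the four dimensions form two fragments, (0, 1)
and (2, 3), and only cuboids inside a single fragment are materialized.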
The prompt accepts the following commands.
1) "quit", "exit", "x"
Exits the program
2) "m <int>"
Resets the minimum support. Only makes sense if it's higher
than the original minimum support.
Sets verbose mode. The program will output various additional
information. For testing purposes.
Binary switch to set file output on or off.
Script to run the experiments used in the paper. See the code for more
details.
This is the query command. Probably the most interesting
command. The format of the command is:
"q <value> <value> ... <value>"
The number of values should match the number of dimensions. If
it does not, the program should report an error.
There are 3 possible inputs for each value.
1) Integer: This instantiates the dimension to this value.
2) *: This marks the dimension as don't-care. All different
values will be aggregated.
3) ?: This marks the dimension as list-all. In other words,
all different values and their cube cells will be returned.
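The three kinds of value could be parsed roughly as follows. This is a
minimal sketch for illustration only, not the parser used by the program;
the function name and the returned (kind, value) pairs are assumptions.

```python
def parse_query(command, num_dims):
    """Parse 'q v1 v2 ... vn' into one (kind, value) pair per dimension.

    Hypothetical helper: kind is 'value', 'aggregate' or 'list'.
    """
    tokens = command.split()
    if not tokens or tokens[0] != "q":
        raise ValueError("not a query command")
    values = tokens[1:]
    if len(values) != num_dims:
        raise ValueError(f"expected {num_dims} values, got {len(values)}")
    specs = []
    for token in values:
        if token == "*":
            specs.append(("aggregate", None))   # don't-care
        elif token == "?":
            specs.append(("list", None))        # list-all
        else:
            # int() raises ValueError if the token is not an integer.
            specs.append(("value", int(token)))
    return specs

specs = parse_query("q 4 ? * 1", num_dims=4)
```

Here specs holds ('value', 4), ('list', None), ('aggregate', None) and
('value', 1), in that order. A command with the wrong number of values
raises ValueError.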
Suppose we have a 4-dimensional data cube.
1) "q * * * *"
This should return the apex cell, which counts how many
tuples there are overall.
2) "q 4 * * 1"
This should return the cell where the first dimension
equals 4, the last dimension equals 1, and the middle 2
dimensions are aggregated.
3) "q 4 ? * 1"
This should return a set of cells where the first
dimension equals 4, the last dimension equals 1, and the
third dimension is aggregated. The answer set should
show one cell for each distinct value of the second dimension.
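To make the expected answers concrete, the sketch below evaluates the three
example queries by brute force over a small made-up base table. It only shows
what counts each query should produce. It does not use shell fragments,
ignores minsup, and does not reproduce the program's output format. The
table contents and the function name are assumptions.

```python
from collections import Counter

def answer_query(table, values):
    """Return {cell: count} for a query given as a list of value tokens."""
    counts = Counter()
    for row in table:
        # Skip rows that disagree with any instantiated dimension.
        if any(v not in ("*", "?") and int(v) != x
               for v, x in zip(values, row)):
            continue
        # '?' takes the row's own value; '*' and integers stay as typed.
        cell = tuple(str(x) if v == "?" else v for v, x in zip(values, row))
        counts[cell] += 1
    return dict(counts)

# Made-up 4-dimensional base table; each row is one tuple.
table = [
    (4, 2, 7, 1),
    (4, 3, 7, 1),
    (4, 2, 5, 1),
    (5, 2, 7, 1),
    (4, 2, 7, 2),
]
apex = answer_query(table, "* * * *".split())
point = answer_query(table, "4 * * 1".split())
listed = answer_query(table, "4 ? * 1".split())
```

On this table, the apex cell has count 5, the cell (4, *, *, 1) has count 3,
and the list-all query returns two cells: (4, 2, *, 1) with count 2 and
(4, 3, *, 1) with count 1.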