Research prototypes
Every piece of software I develop is licensed under the GNU
General Public License
v3. Data-Peeler, fitcare, multidupehack
(which supersedes Fenster
in every way), Biceps
and NclusterBox are
available from this page, and so is the tool that Victor
Henrique Silva Ribeiro developed under my supervision to
visualize NclusterBox's
summaries. Data-Peeler
and Fitcare were designed
and implemented within the
LIRIS laboratory when I was working at
INSA-Lyon. Please
send me an e-mail if you
are interested in using and/or improving my most recent
developments.
Data-Peeler
From a dataset, d-peeler computes every closed
n-set (i.e., every maximal hyper-rectangle of 1s, modulo
permutations of the elements of any of the sets) satisfying
given constraints. With
the --minimize (-m) option, it also
post-processes these patterns to output a minimization of the
input dataset.
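The notion of closed n-set can be illustrated with a small Python sketch. It is not d-peeler's algorithm, only a brute-force check of the definition above. It assumes the dataset is given as a set of n-tuples (the 1s of the n-ary relation). The function name, the data layout and the toy relation are hypothetical, chosen for illustration only.

```python
from itertools import product


def is_closed_n_set(relation, domains, pattern):
    """Check the definition by brute force (hypothetical helper, not d-peeler's code).

    relation: set of n-tuples, i.e., the cells of the dataset that hold a 1.
    domains:  list of n sets, all the elements of each dimension.
    pattern:  list of n sets, the candidate n-set.
    """
    def covers_only_ones(sets):
        # Every tuple of the Cartesian product must be in the relation.
        return all(cell in relation for cell in product(*sets))

    if not covers_only_ones(pattern):
        return False
    # Closedness: no element of any dimension can be added without covering a 0.
    for dim, domain in enumerate(domains):
        for element in domain - pattern[dim]:
            extended = list(pattern)
            extended[dim] = pattern[dim] | {element}
            if covers_only_ones(extended):
                return False
    return True


# Toy ternary relation (assumed data).
relation = {
    ("a", "x", "p"), ("a", "y", "p"),
    ("b", "x", "p"), ("b", "y", "p"),
    ("a", "x", "q"),
}
domains = [{"a", "b"}, {"x", "y"}, {"p", "q"}]

print(is_closed_n_set(relation, domains, [{"a", "b"}, {"x", "y"}, {"p"}]))
print(is_closed_n_set(relation, domains, [{"a"}, {"x"}, {"p"}]))
```

In this toy relation, the first pattern is closed, whereas the second one is not, because "b" can be added to its first set without covering any 0.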
Related publications
Data-Peeler was first
presented at SDM'08.
A longer version was published in
the ACM TKDD journal:

Loïc Cerf, Jérémy Besson, Céline
Robardet, and Jean-François
Boulicaut.
Closed
Patterns Meet n-ary Relations.
ACM
Transactions on Knowledge Discovery from Data,
3(1):1–36, March 2009.
If you
use Data-Peeler, or a
modified version of it, and publish your results, we ask you
to cite this reference.
Download
Here is the
source code. Please take the time to read the INSTALL
file before compiling d-peeler and the README
file before using it.
Fitcare
From a classified dataset, fitcare computes
the bodies of the rules concluding on the classes such that
every rule is frequent in one class and not frequent in any of
the other classes. Either every frequency threshold is bound
to a parameter set by the user or these parameters are
automatically learned. Then, fitcarc can apply
these rules on unclassified data.
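The condition a rule body must satisfy can be sketched in Python as follows. This is a simplified illustration, not fitcare's implementation. It assumes that every object is a set of attributes, that the frequency of a body in a class is the number of objects of that class containing the body, and that there is one user-set threshold per class. The function names and the toy data are hypothetical.

```python
def frequency(body, objects):
    """Number of objects (sets of attributes) containing every attribute of the body."""
    return sum(1 for attributes in objects if body <= attributes)


def concludes_on(body, target, dataset, thresholds):
    """Test whether the body is frequent in the target class and in no other class.

    dataset:    dict mapping a class label to the list of its objects.
    thresholds: dict mapping a class label to the minimal frequency
                from which a body is considered frequent in that class
                (an assumption of this sketch).
    """
    def is_frequent_in(label):
        return frequency(body, dataset[label]) >= thresholds[label]

    if not is_frequent_in(target):
        return False
    return not any(is_frequent_in(label) for label in dataset if label != target)


# Toy classified dataset (assumed data).
dataset = {
    "pos": [{"a", "b"}, {"a", "b", "c"}, {"a"}],
    "neg": [{"b"}, {"a", "b"}, {"c"}],
}
thresholds = {"pos": 2, "neg": 2}

print(concludes_on({"a", "b"}, "pos", dataset, thresholds))
print(concludes_on({"b"}, "pos", dataset, thresholds))
```

Here, the body {a, b} yields a rule concluding on "pos" (frequency 2 in "pos", 1 in "neg"), whereas {b} does not because it is frequent in both classes.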
Related publications
Fitcare was first
presented at DaWaK'08:

Loïc Cerf, Dominique Gay, Nazha Selmaoui,
and Jean-François
Boulicaut.
A
Parameter-Free Associative Classification
Method. In
DaWaK'08: Proceedings of
the Tenth International Conference on Data Warehousing and
Knowledge Discovery, pages 293–304. Springer,
September 2008. Acceptance rate: 33%.
A longer version was published in
the DKE journal.
If you
use fitcare, or a modified
version of it, and publish your results, we ask you to cite
this reference.
Download
Here is the
source code. Please take the time to read the INSTALL
file before compiling fitcare
and fitcarc. The installation
includes man pages for both commands.
Multidupehack
From a fuzzy tensor, multidupehack computes
every (closed) noise-tolerant n-set satisfying given
constraints.
Related publication
Multidupehack was
presented at ICDE'14.
If you
use multidupehack, or a
modified version of it, and publish your results, we ask you
to cite this reference.
Download
Here
is the
source code. Please take the time to read the INSTALL
file before compiling multidupehack and the
README file before using it.
This
archive contains the datasets and the script to rerun
all the experiments (and more) reported in the
article Enforcement of Minimal Size and Area
Constraints before and while Mining Patterns in Fuzzy
Tensors, which was presented at SAC 2023.
Biceps
Given a real
matrix,
Biceps lists
muscly biclusters. A bicluster is a subset of rows
associated with a subset of columns. In any column of a
muscly bicluster, the values in the rows of the bicluster are
all strictly greater than the values in the other rows. Moreover,
the rows of the bicluster must be neither a subset nor a superset
of the rows of another bicluster of greater or equal quality.
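The first condition of this definition can be checked with a few lines of Python. The sketch below is not Biceps' algorithm and it ignores the second condition, since the quality measure is not defined on this page. It assumes the matrix is a list of rows and that rows and columns are identified by their indices. The function name and the toy matrix are hypothetical.

```python
def dominates_in_every_column(matrix, rows, columns):
    """Check that, in each column of the bicluster, every value in a row of the
    bicluster is strictly greater than every value in a row outside of it."""
    inside = set(rows)
    outside = [r for r in range(len(matrix)) if r not in inside]
    if not inside or not outside:
        # Nothing to compare: the condition holds trivially.
        return True
    for c in columns:
        lowest_inside = min(matrix[r][c] for r in inside)
        highest_outside = max(matrix[r][c] for r in outside)
        if lowest_inside <= highest_outside:
            return False
    return True


# Toy real matrix (assumed data): 3 rows, 3 columns.
matrix = [
    [5.0, 1.0, 9.0],
    [6.0, 2.0, 1.0],
    [1.0, 8.0, 2.0],
]

print(dominates_in_every_column(matrix, rows={0, 1}, columns=[0]))
print(dominates_in_every_column(matrix, rows={0, 1}, columns=[0, 1]))
```

With rows 0 and 1, the condition holds for column 0 alone (5 and 6 are both greater than 1), but not once column 1 is included (1 and 2 are lower than 8).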
Related publication
Biceps was published in
the Data Mining and Knowledge
Discovery journal.
If you
use Biceps, or a modified
version of it, and publish your results, we ask you to cite
this reference.
Download
Here is the
source code. Please take the time to read the INSTALL
file before compiling biceps and the README
file before using it.
This archive
contains the datasets and the scripts to rerun all the
experiments reported in the aforementioned article.
NclusterBox
NclusterBox modifies
patterns that hold in a fuzzy tensor to maximize their
explanatory power, and selects an ordered subset of the
resulting patterns to summarize this tensor.
Related publications
A preliminary version
of NclusterBox was
presented at SAC'23.
After many modifications, which have greatly improved the
quality of the returned summaries and considerably lowered the
computational requirements to discover
them, NclusterBox was
published in the Information
Sciences journal.
If you
use NclusterBox, or a
modified version of it, and publish your results, we ask you
to cite this reference.
Download
Here
is the
source code. Please take the time to read the INSTALL
file before compiling nclusterbox and the
README file before using it.
This
archive contains the datasets and the scripts to rerun
all the experiments reported in the SAC'23 article,
and this
one contains those to rerun all the experiments reported in
the article and in the supplementary material published in the
Information Sciences
journal.
Boxcluster Visualization
Boxcluster Visualization,
which Victor Henrique Silva Ribeiro developed under my
supervision, provides an interactive visualization of patterns
summarizing fuzzy tensors through the disjunctive box cluster
model.
Related publication
The article detailing the visualization will soon be
published in the Information
Visualization journal:

Victor
Henrique Silva Ribeiro and Loïc Cerf. Interactively
Visualizing Pattern-Based Summaries of Fuzzy
Tensors.
Information
Visualization.
Download
Here
are the
source code
and a
link to binaries. The formats for the input tensors and
patterns are nclusterbox's input and output
formats, as detailed in the README file that comes
with its
source code.