Research prototypes

Every piece of software I develop is licensed under the GNU General Public License v3. Data-Peeler, Fitcare and multidupehack (which supersedes Fenster in every way) are available from this page. Data-Peeler and Fitcare were designed and implemented within the LIRIS laboratory when I was working at INSA-Lyon. Please send me an e-mail if you are interested in using and/or improving my most recent developments.

Data-Peeler

From a dataset, d-peeler computes every closed n-set (i.e., every maximal rectangle of 1s, modulo permutations of the elements of any of the sets) satisfying given constraints. With the --minimize (-m) option, it also post-processes these patterns to output a minimization of the input dataset.
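To make the notion of a closed n-set concrete, here is a minimal brute-force sketch in Python. It is not d-peeler's algorithm and does not use its input format: the function names, the representation of the relation as a set of tuples, and the toy data are all assumptions made for illustration. It only checks the definition: every tuple in the Cartesian product of the pattern's sets is in the relation, and no single element can be added to any set without breaking that property.

```python
from itertools import product


def is_n_set(relation, pattern):
    """True if every tuple of the pattern's Cartesian product is in the relation."""
    return all(t in relation for t in product(*pattern))


def is_closed_n_set(relation, domains, pattern):
    """True if the pattern is an n-set that no single element can extend.

    Testing single-element extensions is enough: any strictly larger n-set
    would remain an n-set after dropping all but one of its extra elements.
    """
    if not is_n_set(relation, pattern):
        return False
    for dim, domain in enumerate(domains):
        for element in domain - pattern[dim]:
            extended = list(pattern)
            extended[dim] = pattern[dim] | {element}
            if is_n_set(relation, extended):
                return False
    return True


# Hypothetical toy ternary relation (illustration only).
relation = {
    ("a", "x", 1), ("a", "y", 1), ("b", "x", 1), ("b", "y", 1),
    ("a", "x", 2),
}
domains = [{"a", "b"}, {"x", "y"}, {1, 2}]

print(is_closed_n_set(relation, domains, [{"a", "b"}, {"x", "y"}, {1}]))
print(is_closed_n_set(relation, domains, [{"a"}, {"x"}, {1}]))
```

The first pattern is closed; the second is an n-set but can be extended with "b", so it is not closed. This check is exponential in the size of the pattern and only meant to illustrate the definition.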

Related publications

Data-Peeler was first presented at SDM'08:

Loïc Cerf, Jérémy Besson, Céline Robardet, and Jean-François Boulicaut. Data-Peeler: Constraint-Based Closed Pattern Mining in n-ary Relations. In SDM'08: Proceedings of the Eighth SIAM International Conference on Data Mining, pages 37–48. SIAM, April 2008. Acceptance rate: 14%.

A longer version was published in the ACM TKDD journal:

Loïc Cerf, Jérémy Besson, Céline Robardet, and Jean-François Boulicaut. Closed Patterns Meet n-ary Relations. ACM Transactions on Knowledge Discovery from Data, 3(1):1–36, March 2009.
If you use Data-Peeler, or a modified version of it, and publish your results, we ask you to cite this reference.

Download

Here is the source code. Please take the time to read the INSTALL file before compiling d-peeler and the README file before using it.

Fitcare

From a classified dataset, fitcare computes the bodies of the rules concluding on the classes such that every rule is frequent in one class and infrequent in every other class. Either every frequency threshold is bound to a parameter set by the user or these parameters are automatically learned. Then, fitcarc can apply these rules to unclassified data.
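The condition on a rule body can be sketched as follows in Python. This is an illustration of the stated criterion only, not fitcare's implementation or file format. The function names, the toy data, and the layout of the thresholds (one hypothetical threshold per pair of target class and class) are assumptions: a body concludes on a class if its support reaches the threshold in that class and stays strictly below the threshold in every other class.

```python
def support(body, transactions):
    """Number of transactions that contain every item of the body."""
    return sum(1 for t in transactions if body <= t)


def concludes_on(body, data_by_class, target, thresholds):
    """True if the body is frequent in `target` and infrequent elsewhere.

    thresholds[target][label] is a hypothetical layout: the support the body
    must reach in `target` itself, and must stay strictly below in the others.
    """
    for label, transactions in data_by_class.items():
        frequent = support(body, transactions) >= thresholds[target][label]
        if frequent != (label == target):
            return False
    return True


# Hypothetical toy classified dataset (illustration only).
data_by_class = {
    "pos": [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}],
    "neg": [{"b", "c"}, {"c"}, {"a", "c"}],
}
thresholds = {
    "pos": {"pos": 2, "neg": 1},
    "neg": {"neg": 3, "pos": 3},
}

print(concludes_on({"a", "b"}, data_by_class, "pos", thresholds))
print(concludes_on({"c"}, data_by_class, "neg", thresholds))
```

In this sketch the thresholds are given by hand, which corresponds to the user-set case; learning them automatically is outside its scope.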

Related publication

Fitcare was first presented at DaWaK'08:

Loïc Cerf, Dominique Gay, Nazha Selmaoui, and Jean-François Boulicaut. A Parameter-Free Associative Classification Method. In DaWaK'08: Proceedings of the Tenth International Conference on Data Warehousing and Knowledge Discovery, pages 293–304. Springer, September 2008. Acceptance rate: 33%.

A longer version was published in the DKE journal:

Loïc Cerf, Dominique Gay, Nazha Selmaoui-Folcher, Bruno Crémilleux, and Jean-François Boulicaut. Parameter-free Classification in Multi-Class Imbalanced Data Sets. Data & Knowledge Engineering, 87:109–129, September 2013.
If you use fitcare, or a modified version of it, and publish your results, we ask you to cite this reference.

Download

Here is the source code. Please take the time to read the INSTALL file before compiling fitcare and fitcarc. The installation includes man pages for both commands.

Multidupehack

From a fuzzy relation, multidupehack computes every (closed) noise-tolerant n-set satisfying given constraints.
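As an illustration of what noise tolerance can mean in a fuzzy relation, here is a minimal Python sketch. It rests on assumptions that this page does not state: each tuple has a membership degree in [0, 1] (absent tuples count as 0), the noise of a tuple is one minus its degree, and a pattern is tolerated if, for every element of every dimension, the total noise on that element's slice of the pattern does not exceed a per-dimension bound. The function names, the `epsilons` parameter and the toy data are hypothetical, and this is not multidupehack's algorithm or input format.

```python
from itertools import product


def slice_noise(degrees, pattern, dim, element):
    """Total noise on the slice of the pattern restricted to one element."""
    sliced = list(pattern)
    sliced[dim] = {element}
    return sum(1.0 - degrees.get(t, 0.0) for t in product(*sliced))


def tolerates_noise(degrees, pattern, epsilons):
    """True if every slice of the pattern stays within its dimension's bound."""
    return all(
        slice_noise(degrees, pattern, dim, element) <= epsilons[dim]
        for dim, elements in enumerate(pattern)
        for element in elements
    )


# Hypothetical toy fuzzy binary relation (illustration only).
degrees = {
    ("a", "x"): 1.0, ("a", "y"): 0.75,
    ("b", "x"): 0.5, ("b", "y"): 1.0,
}
pattern = [{"a", "b"}, {"x", "y"}]

print(tolerates_noise(degrees, pattern, [0.5, 0.5]))
print(tolerates_noise(degrees, pattern, [0.25, 0.5]))
```

With a bound of 0.5 on both dimensions the pattern is accepted; lowering the first bound to 0.25 rejects it because the slice of "b" carries 0.5 of noise.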

Related publication

Multidupehack was presented at ICDE'14:

Loïc Cerf and Wagner Meira Jr. Complete Discovery of High-Quality Patterns in Large Numerical Tensors. In ICDE'14: Proceedings of the 30th International Conference on Data Engineering, pages 448–459. IEEE Computer Society, April 2014. Associated poster. Acceptance rate: 20%.
If you use multidupehack, or a modified version of it, and publish your results, we ask you to cite this reference.

Download

Here is the source code. Please take the time to read the INSTALL file before compiling multidupehack and the README file before using it.
