R package for exploratory data analysis and manipulation of multilabel datasets

Version 0.2.25 released

A new version of mldr has just been released and is now live on CRAN. This update adds new functions to assess multilabel classifier performance, allows mldr objects to be created from ARFF files with different structures (see the examples below), and fixes several bugs.

Classification performance measures

mldr now includes the mldr_evaluate function, which analyzes the performance of classifier predictions using several well-known measures (Accuracy, Precision, Recall, F-measure and Hamming Loss, among others). Using it is simple: just call it with the test dataset and the predictions generated by the classifier. The function returns a list with all 20 measures, identified by their names.

res <- mldr_evaluate(emotions, predictions)
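For a quick trial without a trained classifier, one option is to build a predictions matrix from the true labels and overwrite some entries at random. The sketch below assumes the mldr package is installed and uses the emotions dataset bundled with it. The noise step is only a stand-in for real classifier output, and the name noisy_cells is illustrative.

```r
library(mldr)

set.seed(42)  # make the random noise reproducible

# Start from the true label columns of the bundled emotions dataset.
predictions <- as.matrix(emotions$dataset[, emotions$labels$index])

# Pick 200 random (row, column) cells and overwrite them with random 0/1 values.
noisy_cells <- cbind(
  sample(nrow(predictions), 200, replace = TRUE),
  sample(ncol(predictions), 200, replace = TRUE)
)
predictions[noisy_cells] <- sample(0:1, 200, replace = TRUE)

# Evaluate the noisy predictions against the true labels.
res <- mldr_evaluate(emotions, predictions)
names(res)  # lists the measures available in the result
```

Individual measures can then be read from the list by name using the names printed by the last line.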

New parameters to identify labels

Labels in ARFF files can be structured in several ways, so the mldr constructor now accepts three new parameters that make it easier to read a multilabel dataset: label_indices, label_names and label_amount. The first lets the user specify the exact indices of the label attributes in the dataset; the second identifies labels by their names; and the last takes the number of labels, which are read from the last attributes in the ARFF file.

corel5k <- mldr("corel5k", label_amount = 374)
emotions <- mldr("emotions", label_indices = c(73, 74, 75, 76, 77, 78))
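To make the relationship between the three parameters concrete, the following base R sketch shows how each one could select the same label columns in a toy dataset. It is an illustration of the semantics described above, not the package's internal code; the attribute names and all variable names are hypothetical.

```r
# Hypothetical attribute names: three features followed by three labels.
attribute_names <- c("tempo", "pitch", "energy", "happy", "sad", "calm")

# label_indices style: the positions are given directly.
by_indices <- c(4, 5, 6)

# label_names style: look up the position of each named attribute.
by_names <- match(c("happy", "sad", "calm"), attribute_names)

# label_amount style: take the last n attributes.
n_labels <- 3
total <- length(attribute_names)
by_amount <- (total - n_labels + 1):total

# All three approaches select the same columns.
stopifnot(all(by_indices == by_names), all(by_names == by_amount))
```

Which parameter is most convenient depends on the file: label_amount suits datasets whose labels sit at the end, while label_names or label_indices are useful when labels are interleaved with features.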

New vignette: Working with Multilabel Datasets in R

A vignette has been added to mldr as well. The document explains how the package works and provides examples to make it easier to learn. It can be opened with the command vignette("mldr") or downloaded from CRAN.