Computer forensic text corpora are usually highly heterogeneous and easily surpass the terabyte range. Classification methods can aid the exploration of such corpora, but they do not help with the task of thematically grouping documents. In this paper, we propose using Adaptive Resonance Theory (ART), specifically the ART1 algorithm, to group computer forensics documents thematically into clusters. We present the conceptual model defined for the clustering approach and the software package that implements it, which includes a modified version of the ART1 algorithm developed to improve running time. Real-world forensic experiments were carried out to validate the model using a two-fold approach that combines quantitative and qualitative analysis. The results show that our approach can generate good clusters when compared with the gold standard defined by domain experts, and that it has one clear advantage over other clustering methods (e.g., SOM and k-means): parameters such as the number of clusters need not be supplied beforehand.
Computer forensics; document clustering; artificial neural networks; adaptive resonance theory; ART1 algorithm.
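To illustrate why an ART1-style clusterer does not need the number of clusters in advance, the following is a minimal sketch of the textbook fast-learning ART1 procedure on binary term sets. It is not the modified algorithm described in the paper. The function name `art1_cluster`, the parameter names `rho` (vigilance), `L` (choice parameter), and `max_epochs`, and the toy documents are all assumptions made for illustration.

```python
def art1_cluster(docs, rho=0.5, L=2.0, max_epochs=10):
    """Cluster binary documents with a simplified fast-learning ART1.

    docs: list of sets of term ids (each set is a binary bag of words).
    rho:  vigilance in (0, 1]; higher values yield more, tighter clusters.
    L:    choice parameter (> 1) used to rank candidate clusters.
    Returns (labels, prototypes). Illustrative sketch only.
    """
    prototypes = []              # one binary prototype (set of terms) per cluster
    labels = [None] * len(docs)  # cluster index assigned to each document

    for _ in range(max_epochs):
        changed = False
        for i, x in enumerate(docs):
            if not x:
                continue  # empty documents cannot be matched; leave unlabeled

            # Rank existing clusters by the bottom-up choice function.
            def choice(j):
                p = prototypes[j]
                return L * len(p & x) / (L - 1 + len(p))

            order = sorted(range(len(prototypes)), key=choice, reverse=True)

            # Vigilance test: accept the first candidate that matches well enough.
            winner = None
            for j in order:
                if len(prototypes[j] & x) / len(x) >= rho:
                    winner = j
                    break

            if winner is None:
                # No cluster resonates: commit a new one. This is why the
                # number of clusters is not supplied beforehand.
                prototypes.append(set(x))
                winner = len(prototypes) - 1
            else:
                # Fast learning: the prototype shrinks to the shared terms.
                prototypes[winner] &= x

            if labels[i] != winner:
                labels[i] = winner
                changed = True

        if not changed:
            break  # assignments are stable

    return labels, prototypes


# Toy usage: four "documents" given as sets of hypothetical term ids.
toy_docs = [{0, 1, 2}, {0, 1, 3}, {7, 8, 9}, {7, 8}]
toy_labels, toy_prototypes = art1_cluster(toy_docs, rho=0.5)
print(toy_labels, toy_prototypes)
```

The vigilance `rho` remains a tunable input in this sketch: it controls how similar a document must be to a prototype before joining its cluster, and therefore indirectly how many clusters emerge.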
THE INTERNATIONAL JOURNAL OF FORENSIC COMPUTER SCIENCE - IJoFCS
Volume 7, Number 1, pages 24-41, DOI: 10.5769/J201201003 or http://dx.doi.org/10.5769/J201201003
Using ART1 Neural Networks for Clustering Computer Forensics Documents
By Georger Araújo and Célia Ralha