Use this identifier to cite or link to this record: http://hdl.handle.net/10437/14088
Title: Generating multidimensional clusters with support lines
Authors: Fachada, Nuno
de Andrade, Diogo
Keywords: SYNTHETIC DATA GENERATION
CLUSTER ANALYSIS
DATA GENERATION
COMPUTER SCIENCE
Publisher: Elsevier
Citation: Fachada, N. & de Andrade, D. (2023). Generating multidimensional clusters with support lines. Knowledge-Based Systems, 277, 110836. https://doi.org/10.1016/j.knosys.2023.110836
Abstract: Synthetic data is essential for assessing clustering techniques, complementing and extending real data, and allowing for more complete coverage of a given problem's space. In turn, synthetic data generators have the potential of creating vast amounts of data, a crucial activity when real-world data is at a premium, while providing a well-understood generation procedure and an interpretable instrument for methodically investigating cluster analysis algorithms. Here, we present Clugen, a modular procedure for synthetic data generation, capable of creating multidimensional clusters supported by line segments using arbitrary distributions. Clugen is open source, comprehensively unit tested and documented, and is available for the Python, R, Julia, and MATLAB/Octave ecosystems. We demonstrate that our proposal can produce rich and varied results in various dimensions, is fit for use in the assessment of clustering algorithms, and has the potential to be a widely used framework in diverse clustering-related research tasks. Keywords: Synthetic data, Clustering, Data generation, Multidimensional data
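To make the idea of a "cluster supported by a line segment" concrete, the following is a minimal sketch using only the Python standard library. It is not Clugen's algorithm or API: the function name `line_supported_cluster` and all of its parameters are hypothetical, and the sketch assumes uniform placement along the segment and Gaussian lateral noise, whereas the abstract states that Clugen supports arbitrary distributions.

```python
import math
import random


def line_supported_cluster(center, direction, length, lateral_std, num_points, rng):
    """Sample points scattered around a line segment (illustrative only)."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]  # unit vector along the segment

    points = []
    for _ in range(num_points):
        # Position along the segment, uniform in [-length/2, length/2]
        # (assumption: any other distribution could be swapped in here)
        t = rng.uniform(-length / 2, length / 2)

        # Gaussian noise with its component along the segment removed,
        # leaving a purely lateral (orthogonal) displacement
        noise = [rng.gauss(0.0, lateral_std) for _ in unit]
        along = sum(n * u for n, u in zip(noise, unit))
        lateral = [n - along * u for n, u in zip(noise, unit)]

        points.append([c + t * u + l for c, u, l in zip(center, unit, lateral)])
    return points


rng = random.Random(42)  # fixed seed so the run is reproducible
points = line_supported_cluster(
    center=[0.0, 0.0, 0.0], direction=[1.0, 1.0, 0.0],
    length=10.0, lateral_std=0.5, num_points=200, rng=rng,
)
print(len(points), len(points[0]))  # 200 3
```

Because the lateral noise is orthogonal to the segment, each point's projection onto the direction vector stays within half the segment length of the center. The same function works in any number of dimensions, since it only relies on the length of `center` and `direction`.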
Description: Knowledge-Based Systems
URI: https://doi.org/10.1016/j.knosys.2023.110836
http://hdl.handle.net/10437/14088
ISSN: 0950-7051
Appears in collections: FE - Artigos de Revistas Internacionais com Arbitragem Científica (Peer-Reviewed International Journal Articles)

Files in this record:
File: arxiv_v2.pdf
Size: 11.85 MB
Format: Adobe PDF


All records in the repository are protected by copyright law, with all rights reserved.