A3 Book section, Chapters in research books
Improving Clustering and Cluster Validation with Missing Data Using Distance Estimation Methods (2022)
Niemelä, M., & Kärkkäinen, T. (2022). Improving Clustering and Cluster Validation with Missing Data Using Distance Estimation Methods. In T. T. Tuovinen, J. Periaux, & P. Neittaanmäki (Eds.), Computational Sciences and Artificial Intelligence in Industry : New Digital Technologies for Solving Future Societal and Economical Challenges (pp. 123-133). Springer. Intelligent Systems, Control and Automation: Science and Engineering, 76. https://doi.org/10.1007/978-3-030-70787-3_9
Publication details
All authors or editors: Niemelä, Marko; Kärkkäinen, Tommi
Parent publication: Computational Sciences and Artificial Intelligence in Industry : New Digital Technologies for Solving Future Societal and Economical Challenges
Parent publication editors: Tuovinen, Tero T.; Periaux, Jacques; Neittaanmäki, Pekka
ISBN: 978-3-030-70786-6
eISBN: 978-3-030-70787-3
Journal or series: Intelligent Systems, Control and Automation: Science and Engineering
ISSN: 2213-8986
eISSN: 2213-8994
Publication year: 2022
Number in series: 76
Pages range: 123-133
Number of pages in the book: 275
Publisher: Springer
Place of Publication: Cham
Publication country: Switzerland
Publication language: English
DOI: https://doi.org/10.1007/978-3-030-70787-3_9
Publication open access: Not open
Publication is parallel published (JYX): https://jyx.jyu.fi/handle/123456789/84512
Additional information: The CSAI 2019 Conference (Computational Science and AI in Industry: New Digital Technologies for Solving Future Societal and Economical Challenges) took place in Jyväskylä, Finland, on June 12–14, 2019.
Abstract
Missing data poses a challenge in unsupervised learning. In clustering, where both the form and the number of clusters must be determined, missing values have to be handled both in the clustering process and in cluster validation. In previous research, missing values in the clustering algorithm have been handled with robust clustering methods and the available data strategy, and cluster validation indices have been computed with the partial distance approximation. Recently, however, dedicated methods for distance estimation with missing values have been proposed, and this work is the first in which these methods are systematically applied and tested in clustering and cluster validation. More precisely, we propose, implement, and analyze the use of distance estimation methods to improve the discrimination power of clustering and cluster validation indices. A novel, robust, two-stage prototype-based clustering process is suggested. Our results confirm the usefulness of the distance estimation methods in clustering but, surprisingly, not in cluster validation.
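For readers unfamiliar with the partial distance approximation mentioned in the abstract, the following is a minimal sketch of its commonly used form: the squared Euclidean distance is computed over the coordinates observed in both vectors and rescaled by the ratio of the total dimension to the number of shared coordinates. The function name `partial_distance` and the use of NaN to mark missing entries are assumptions made for illustration; this is not the authors' implementation, and the distance estimation methods studied in the chapter are not shown here.

```python
import numpy as np


def partial_distance(x, y):
    """Partial distance between two vectors whose missing entries are NaN.

    Illustrative sketch only: the name and the NaN convention are assumptions.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Coordinates observed in both vectors.
    shared = ~(np.isnan(x) | np.isnan(y))
    n_shared = int(shared.sum())
    if n_shared == 0:
        # No common coordinates: the distance is undefined.
        return float("nan")
    # Rescale so the result is comparable to a full-dimensional distance.
    scale = x.size / n_shared
    squared = np.sum((x[shared] - y[shared]) ** 2)
    return float(np.sqrt(scale * squared))


if __name__ == "__main__":
    a = [1.0, np.nan, 3.0, 4.0]
    b = [1.0, 2.0, np.nan, 0.0]
    # Shared coordinates are the first and the last, so the result is sqrt(4/2 * 16).
    print(partial_distance(a, b))
```

With complete vectors the scale factor is 1 and the function reduces to the ordinary Euclidean distance, which makes it a convenient baseline for comparison with other ways of estimating distances under missing values.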
Keywords: machine learning; cluster analysis; algorithms
Related projects
- Competitive funding to strengthen universities’ research profiles. Profiling actions at the JYU, round 3
  - Hämäläinen, Keijo
  - Research Council of Finland
- STRUCTURE PREDICTION OF HYBRID NANOPARTICLES VIA ARTIFICIAL INTELLIGENCE
  - Kärkkäinen, Tommi
  - Research Council of Finland
Ministry reporting: Yes
Reporting Year: 2022
JUFO rating: 2
Parent publication with JYU authors:
- Tuovinen, T. T., Periaux, J., & Neittaanmäki, P. (Eds.). (2022). Computational Sciences and Artificial Intelligence in Industry : New Digital Technologies for Solving Future Societal and Economical Challenges. Springer. Intelligent Systems, Control and Automation: Science and Engineering, 76. https://doi.org/10.1007/978-3-030-70787-3