Data clustering using leaders and followers optimization and differential evolution
Citation
Zorarpacı, E. (2023). Data clustering using leaders and followers optimization and differential evolution. Applied Soft Computing, 132, art. no. 109838. https://doi.org/10.1016/j.asoc.2022.109838
Abstract
Data clustering is an important research topic in data mining. Although cluster analysis based on optimization algorithms has attracted great attention, optimization-based techniques face difficulties due to the non-linear objective function and complicated search space. Leaders and followers optimization (LaF), introduced at the 2015 IEEE Congress on Evolutionary Computation, and the differential evolution algorithm (DE) are two efficient evolutionary computation methods, each with its own advantages. The key strength of LaF is exploration in multi-modal search spaces, but it performs poorly in exploitation. DE based on the DE/best/1 mutation operator, on the other hand, significantly promotes exploitation. In this study, the strengths of LaF and DE are combined to balance exploration and exploitation in the search for the cluster centroids. Moreover, the proposed clustering method, LaF-DE, does not need parameter settings, unlike existing optimization-based partitional clustering methods. Hence, this study proposes a straightforward, parameter-free, and efficient novel hybrid algorithm for the optimization-based partitional data clustering problem. Extensive experiments on functions from the CEC2017 test suite show that LaF-DE has better optimization performance and higher stability than state-of-the-art metaheuristic algorithms. LaF-DE has also been compared with well-known clustering techniques on the UCI and Shape datasets. The experimental results and statistical tests indicate that LaF-DE outperforms the well-known partitional clustering methods on 8 of 12 datasets in terms of internal performance metrics. In addition, LaF-DE performs better than density peaks clustering on 9 of 12 datasets in terms of external performance metrics.
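
For readers unfamiliar with the DE/best/1 operator named in the abstract, the following is a minimal Python sketch of that mutation step applied to candidate centroid sets. It is not the LaF-DE algorithm itself. The function names (`sse`, `de_best_1`), the sum-of-squared-errors objective, the toy data, and the fixed scale factor of 0.5 are all assumptions made for illustration; the abstract describes LaF-DE as parameter-free and does not say how the scale factor or the objective is handled.

```python
import numpy as np

def sse(centroids, data):
    """Sum of squared distances from each point to its nearest centroid.

    Assumed objective for this sketch; the abstract does not name one.
    """
    # dists[i, j] = squared Euclidean distance from point i to centroid j
    dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.min(axis=1).sum()

def de_best_1(population, fitness, scale, rng):
    """Standard DE/best/1 mutation: v_i = x_best + scale * (x_r1 - x_r2).

    population has shape (n, k, d): n candidate solutions, each holding
    k centroids of dimension d. Lower fitness is better.
    """
    n = len(population)
    best = population[np.argmin(fitness)]
    mutants = np.empty_like(population)
    for i in range(n):
        # Pick two distinct individuals, both different from i.
        candidates = [j for j in range(n) if j != i]
        r1, r2 = rng.choice(candidates, size=2, replace=False)
        mutants[i] = best + scale * (population[r1] - population[r2])
    return mutants

# Toy data (hypothetical): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
                  rng.normal(3.0, 0.3, (20, 2))])

k = 2  # number of clusters, assumed known
population = rng.uniform(data.min(), data.max(), (6, k, data.shape[1]))
fitness = np.array([sse(c, data) for c in population])

# scale=0.5 is an illustrative choice, not a value from the paper.
mutants = de_best_1(population, fitness, scale=0.5, rng=rng)
```

Because every mutant is built around the current best individual, the operator concentrates the search near the best solution found so far, which is the exploitation behaviour the abstract attributes to DE/best/1.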