Analysis and Application of Big Data Feature Extraction Based on Improved k-means Algorithm

Wenjuan Yang

doi:10.12694/scpe.v25i1.2281

Published: Jan 4, 2024

Keywords:

Improved k-means algorithm; big data; feature extraction

Wenjuan Yang

Shanghai Zhongqiao Vocational and Technical University, Shanghai, 201514, China

Abstract

This paper addresses the challenges posed by collecting and storing large volumes of data, with a focus on mitigating data errors. The primary goal is to propose and evaluate an improved K-means algorithm for big data applications. The research also designs a power big data system to demonstrate the improved algorithm's practical utility in monitoring power equipment. It begins with an in-depth analysis of the traditional K-means algorithm, which leads to the proposal of an improved version. The study then outlines the development of a comprehensive power big data system, covering architectural aspects such as data storage, mechanical layers, and data access structures. It also develops a power big data analysis platform that uses the improved algorithm to cluster and analyze power equipment monitoring data. Experimental results show that the improved K-means algorithm outperforms the traditional version, with significantly higher accuracy and fewer classification errors, achieving an error rate of less than 1%. The improved algorithm reached a misclassification rate of just 0.08%, and its accuracy consistently exceeded 95% across all datasets. Moreover, the power big data system developed in this study meets practical requirements while effectively improving storage and processing efficiency.

Issue

Vol. 25 No. 1 (2024)

Section

Special Issue - Next generation Pervasive Reconfigurable Computing for High Performance Real Time Applications
