Pattern recognition and segmentation of smart meter data


  • Barry McDonald, Massey University
  • Peter Pudney, University of South Australia
  • Jia Rong, Deakin University



Keywords: pattern recognition, cluster analysis, segmentation, smart meters


In Australia, smart meters automatically provide electricity suppliers with half-hour energy use data for each customer. These data can be used to classify customers into different categories. To this end, electricity supplier AGL provided participants in the Mathematics in Industry Study Group (MISG) with data from 772 anonymous Victorian customers, collected between 2011-07-16 and 2012-01-30, and the corresponding series of half-hour temperature readings for Melbourne. The goals were to identify a small number of load profiles that could be used to classify customers, and to identify which customers have significant cooling loads and which have significant heating loads. For each customer there was a time series of 9552 half-hour periods, which made the dimensionality of the problem too high for direct cluster analysis of the full data set. The analysis therefore proceeded in two phases. First, the data were explored using various methods of data visualisation, including time series plots, scatterplots and heatmaps of electricity use against temperature and time, Fourier series analysis and load duration curves. Exploration suggested that some automatic data-selection rules would be useful, for example to eliminate premises with long periods of zero electricity use, presumably due to vacancy. Based on the data exploration, summary statistics were chosen to represent each customer, and these were used in the next phase, cluster analysis. Second, three approaches were used for clustering: self-organising maps, agglomerative clustering, and K-means clustering. Each of these methods produced interpretable clusters indicating different types of customer. Agglomerative clustering with complete linkage was good for picking out small, very distinctive clusters, and Ward's linkage also performed well provided sufficient clusters were allowed. Computational limitations mean that these two techniques cannot be used directly on very large samples, and AGL has hundreds of thousands of customers.
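The first-phase ideas of a data-selection rule and a load duration curve can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the seven-day vacancy threshold, the function names, and the three summary features are all hypothetical choices made for the example.

```python
from itertools import groupby

PERIODS_PER_DAY = 48  # half-hour readings, as described in the abstract


def longest_zero_run(readings):
    """Return the length of the longest run of consecutive zero readings."""
    runs = (
        sum(1 for _ in run)
        for is_zero, run in groupby(readings, key=lambda x: x == 0)
        if is_zero
    )
    return max(runs, default=0)


def keep_premise(readings, max_zero_days=7):
    """Hypothetical selection rule: keep a premise unless it shows zero
    use for more than max_zero_days consecutive days (assumed threshold)."""
    return longest_zero_run(readings) <= max_zero_days * PERIODS_PER_DAY


def load_duration_curve(readings):
    """Return the readings sorted from highest to lowest use."""
    return sorted(readings, reverse=True)


def summary_features(readings):
    """Illustrative per-customer summary statistics (assumed, not the
    statistics chosen in the study)."""
    ldc = load_duration_curve(readings)
    n = len(ldc)
    return {
        "mean": sum(ldc) / n,
        "peak": ldc[0],
        # level that is met or exceeded in roughly 90% of periods
        "base": ldc[int(0.9 * (n - 1))],
    }
```

Reducing each 9552-point series to a handful of features like these is what makes clustering tractable; which features best separate customer types is an empirical question that the exploration phase is meant to answer.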
However, the cluster centroids from a pilot study, such as the sample provided to MISG, could be used as initial estimates for feeding into K-means clustering, providing the twin benefits of interpretable clusters and computational efficiency.

References
  • Chicco, G. (2009) Support vector clustering of electrical load pattern data. IEEE Transactions on Power Systems, 24: 1619–1628. doi:10.1109/TPWRS.2009.2023009
  • Chicco, G. (2010) Clustering methods for electrical load pattern classification. In: 8th World Energy System Conference (WESC 2010), Targoviste, Romania, 1–3 July 2010. pp. 5–13. doi:10.1016/
  • De Silva, D., Yu, X., Alahakoon, D., Holmes, D. (2011) Incremental pattern characterization learning and forecasting for electricity consumption using smart meters. In Industrial Electronics (ISIE), 2011 IEEE International Symposium on (pp. 807–812).
  • Flath, C., Nicolay, D., Conte, T., Dinther, C., Filipova-Neumann, L. (2012) Cluster analysis of smart metering data. Business & Information Systems Engineering, 4: 31–39.
  • Gullo, F., Ponti, G., Tagarelli, A., Iiritano, S., Ruffolo, M., Labate, D. (2009) Low-voltage electricity customer profiling based on load data clustering. IDEAS 2009, pp. 330–333. doi:10.1145/1620432.1620472
  • Kim, Y. I., Kang, S. J., Ko, J. M., Choi, S. H. (2011a). A study for clustering method to generate Typical Load Profiles for Smart Grid. Power Electronics and ECCE Asia (ICPE and ECCE), 2011 IEEE 8th International Conference on. pp. 1102–1109.
  • Kim, Y. I., Ko, J. M., Choi, S. H. (2011b). Methods for generating TLPs (typical load profiles) for smart grid-based energy programs. In Computational Intelligence Applications In Smart Grid (CIASG), 2011 IEEE Symposium on (pp. 1–6). doi:10.1109/CIASG.2011.5953331
  • Murtagh, F. (1995) Interpreting the Kohonen self-organizing feature map using contiguity-constrained clustering. Pattern Recognition Letters, 16: 399–408. doi:10.1016/0167-8655(94)00113-H
  • Ward, J. H. Jr (1963) Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58: 236–244. doi:10.1080/01621459.1963.10500845

Author Biographies

Barry McDonald, Massey University

Senior Lecturer, Statistics, Institute of Natural and Mathematical Sciences

Peter Pudney, University of South Australia

Senior Research Fellow, Centre for Industrial and Applied Mathematics

Jia Rong, Deakin University

Lecturer, Melbourne Institute of Business and Technology (MIBT)





Proceedings of the Mathematics in Industry Study Group