Date of Award


Level of Access Assigned by Author

Campus-Only Thesis

Degree Name

Master of Science (MS)


Computer Engineering


Yifeng Zhu

Second Committee Member

Mohamad Musavi

Third Committee Member

Richard Eason


The power sector increasingly relies on GPS-stamped real-time measurements from Phasor Measurement Units (PMUs) to improve the reliability and efficiency of power grids. As the number of PMUs installed in power grids grows rapidly and each PMU carries more high-resolution, high-speed sensors, the volume of data streamed from PMUs increases quickly over time. Scalable and reliable computation platforms are urgently needed to handle both the massive real-time streams and the accumulated historical data efficiently.

This thesis studies the application of two cloud computing platforms to managing and analyzing large-scale PMU data: the Storm platform for streamed data and the Hadoop platform for historical data. A voltage stability monitoring technique widely deployed in power grids serves as the case study for stream computing.

This thesis investigates the efficiencies and weaknesses of these platforms. Experiments on eight processing nodes provisioned from Amazon Web Services show that Hadoop decreases the calculation time of simple functions by a factor of 7.16. Storm is tested on a local six-node system with a simulated data stream generated from a North American 60k bus system. Using two different methods to calculate voltage stability, Storm increases stream throughput by a factor of 5.7 and decreases calculation latency by a factor of 5.1. These tests identify load imbalance as a major inefficiency in Storm’s streaming calculations.
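As a rough reading of these figures, the reported factors can be compared with the node counts. The calculation below assumes that each factor is measured against a single-node baseline, which the abstract does not state; the symbols S (reported factor), N (number of nodes), and E (parallel efficiency) are introduced here for illustration only.

```latex
\[
E = \frac{S}{N}, \qquad
E_{\text{Hadoop}} = \frac{7.16}{8} \approx 0.90, \qquad
E_{\text{Storm, throughput}} = \frac{5.7}{6} = 0.95
\]
```

Under that assumption, both platforms would be operating close to linear scaling, with the remaining gap consistent with overheads such as the load imbalance noted above.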

Three scheduling algorithms are proposed for Storm to address the load imbalance issue. These algorithms use latency feedback to distribute load based on the Mean Completion Time (MCT) heuristic. Experimental results show that, relative to Storm’s default scheduling algorithms, the proposed algorithms reduce the calculation time of the voltage stability indices by up to 14% in a cluster of six computers. Adding loaded heterogeneous nodes to the experiment makes Storm’s default algorithm less efficient, which gives the new algorithms the opportunity to reduce calculation times by 35%. Finally, experiments show that the algorithms handle dynamic workload fluctuations in machine time.
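The general idea of latency-feedback load distribution can be illustrated with a minimal sketch. This is not one of the three algorithms proposed in the thesis, and it does not use Storm's API; it is a generic completion-time heuristic written under stated assumptions. The class name `LatencyFeedbackBalancer`, its methods, and the smoothing parameter `alpha` are all hypothetical.

```python
class LatencyFeedbackBalancer:
    """Illustrative only: route each task to the worker with the smallest
    estimated completion time, using measured latencies as feedback."""

    def __init__(self, workers, alpha=0.5, initial_latency=1.0):
        self.alpha = alpha  # weight given to the newest latency sample
        # Smoothed per-task latency estimate for every worker.
        self.latency = {w: initial_latency for w in workers}
        # Number of tasks assigned to each worker but not yet reported done.
        self.pending = {w: 0 for w in workers}

    def estimated_completion(self, worker):
        # Time until a new task would finish: queued work plus the new task.
        return (self.pending[worker] + 1) * self.latency[worker]

    def assign(self):
        # Pick the worker expected to finish the new task soonest.
        # Ties go to the worker listed first.
        worker = min(self.latency, key=self.estimated_completion)
        self.pending[worker] += 1
        return worker

    def report(self, worker, observed_latency):
        # Feedback step: fold the measured latency into the running estimate.
        self.pending[worker] -= 1
        old = self.latency[worker]
        self.latency[worker] = (1 - self.alpha) * old + self.alpha * observed_latency


balancer = LatencyFeedbackBalancer(["node-a", "node-b"])
first = balancer.assign()
balancer.report(first, observed_latency=3.0)  # node-a turned out to be slow
print(first, balancer.assign())
```

In this sketch a slow or heavily loaded worker reports higher latencies, its estimate rises, and subsequent tasks shift toward faster workers, which is the kind of behavior that matters when heterogeneous or loaded nodes are added to a cluster.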