Open-source dataset for rooftop PV generation in urban environments


Scientists from the Hong Kong University of Science and Technology have created a new high-resolution three-year dataset of rooftop PV generation in urban environments.

The open-source data comprises measured PV power generation data and corresponding weather data. The PV generation was gathered from 60 grid-connected rooftop PV stations located at the university's campus, while the weather data was collected from an on-site station.

“The potential use cases for the dataset can be as follows: comparing the generation efficiency of PV modules with different capacities, module models, optimizer types, and connection time; calibrating PV generation and forecasting models developed from either data-driven or physics-based approaches; developing automatic fault detection algorithms for PV modules; and longitudinal performance degradation analysis for PV systems,” the academics said.

The rooftop solar project covered by the dataset, managed by the university's Sustainability/Net-Zero Office, is the largest behind-the-meter rooftop solar project in Hong Kong. It has a combined capacity of 2,230.8 kW and an annual electricity output of 3 million kWh.
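The capacity and output figures above imply a capacity factor and a specific yield, which can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a non-leap year of 8,760 hours and treating the quoted 3 million kWh as exact; the function names are illustrative and are not part of the dataset.

```python
# Figures quoted in the article.
CAPACITY_KW = 2230.8            # combined rated capacity of the project
ANNUAL_OUTPUT_KWH = 3_000_000   # quoted annual electricity output

HOURS_PER_YEAR = 8760           # assumption: non-leap year


def capacity_factor(energy_kwh, capacity_kw, hours=HOURS_PER_YEAR):
    """Actual energy divided by the energy produced at full rated power all year."""
    return energy_kwh / (capacity_kw * hours)


def specific_yield(energy_kwh, capacity_kw):
    """Annual energy per installed kW (kWh/kW)."""
    return energy_kwh / capacity_kw


cf = capacity_factor(ANNUAL_OUTPUT_KWH, CAPACITY_KW)
sy = specific_yield(ANNUAL_OUTPUT_KWH, CAPACITY_KW)
print(f"capacity factor: {cf:.1%}, specific yield: {sy:.0f} kWh/kW")
```

Under these assumptions, the capacity factor comes out at roughly 15% and the specific yield at roughly 1,345 kWh per installed kW.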

The electricity is generated by 6,085 PV modules, all of which are recorded in the dataset for 2021 to 2023. Some 61.7% of the stations are equipped with panel-level optimizers, and their measurements were collected by both the inverter and the optimizers. The remaining stations have no panel-level optimizers, and their data was collected only at the inverter level. Generation data was collected at 5-minute intervals.

The meteorological data was collected at 1-minute intervals from a weather station on the eastern side of the campus. The station comprises a 10-meter-high automatic weather tower and an outdoor area with six monitoring sensors. The recorded parameters include irradiation, temperature, relative humidity, sea-level pressure, visibility, wind, and rainfall.
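Because the weather data is recorded every minute and the generation data every five minutes, anyone combining the two will need to bring them onto a common time grid. The following is a minimal sketch using only the Python standard library; it assumes each 5-minute bin is labeled by its start time and that averaging is appropriate for the variable in question, neither of which is confirmed by the article. The function names and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def floor_to_5min(ts):
    """Round a timestamp down to the start of its 5-minute bin."""
    return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)


def resample_to_5min(readings):
    """Average (timestamp, value) pairs into 5-minute bins.

    Readings with a value of None are treated as missing and skipped.
    Returns a dict mapping bin start time to the mean of its readings.
    """
    bins = defaultdict(list)
    for ts, value in readings:
        if value is not None:
            bins[floor_to_5min(ts)].append(value)
    return {start: mean(values) for start, values in sorted(bins.items())}


# Hypothetical 1-minute irradiance readings, made up for illustration.
start = datetime(2021, 1, 1, 12, 0)
irradiance = [(start + timedelta(minutes=i), 500.0 + 10 * i) for i in range(10)]

five_min = resample_to_5min(irradiance)
for bin_start, value in five_min.items():
    print(bin_start.isoformat(), value)
```

Averaging suits quantities such as irradiance or temperature; cumulative quantities such as rainfall would more naturally be summed within each bin.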

“This region has a subtropical climate, with humidity levels averaging over 75% and temperatures ranging from 10 C in winter to above 30 C in summer,” the researchers said. “These climate conditions significantly impact the performance of PV systems, leading to variations in efficiency. Elevated temperatures can reduce the efficiency of PV panels, while high humidity may lead to dust accumulation, further affecting performance. Since the meteorological and solar PV data are recorded in this specific location, this may limit the generalizability of models trained on this dataset.”

PV generation was measured with an accuracy of around 2.5%, based on the devices used, while the uncertainty of the weather station's sensors ranged from about 1% to approximately 10%, depending on the sensor. The data was classified as Grade A because its missing rate is below 10%. Missing data may arise from communication failures, equipment malfunctions, and data logging errors, among other causes.
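The missing rate behind the Grade A label can be illustrated by comparing the number of timestamps actually present with the number expected at a fixed 5-minute cadence. The following is a minimal sketch, not the authors' procedure: the 10% threshold comes from the article, while the function names, the half-open time window, and the sample data are assumptions made for illustration.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=5)      # generation sampling interval from the article
GRADE_A_MAX_MISSING = 0.10       # Grade A threshold from the article


def expected_samples(start, end, step=STEP):
    """Number of samples expected in the half-open window [start, end)."""
    return int((end - start) / step)


def missing_rate(timestamps, start, end, step=STEP):
    """Fraction of expected samples absent from the window [start, end)."""
    present = {ts for ts in timestamps if start <= ts < end}
    return 1 - len(present) / expected_samples(start, end, step)


def is_grade_a(rate):
    return rate < GRADE_A_MAX_MISSING


# Samples expected per station over the three years covered (2021 to 2023).
three_year_total = expected_samples(datetime(2021, 1, 1), datetime(2024, 1, 1))

# Hypothetical single day with 20 consecutive samples lost.
day_start = datetime(2021, 6, 1)
day_end = day_start + timedelta(days=1)
observed = [day_start + i * STEP for i in range(288) if not 100 <= i < 120]

rate = missing_rate(observed, day_start, day_end)
print(f"expected over three years: {three_year_total}")
print(f"missing rate: {rate:.1%}, Grade A: {is_grade_a(rate)}")
```

At a 5-minute cadence, each station would be expected to contribute 315,360 samples over the three years, so a missing rate below 10% still allows for tens of thousands of absent readings per station.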

“The open-sourced dataset is divided into two categories: time-series data and metadata,” explained the scientists, saying that the first includes all PV generation and meteorological data points. “To enhance data comprehension and enable efficient querying, we have developed a Brick model that represents the location, equipment, and temporal metadata for the PV system.”

The novel dataset was presented in “A high-resolution three-year dataset supporting rooftop photovoltaics (PV) generation analytics,” published in Scientific Data.

