GHCN-tools

Overview

This repository contains tools and datasets related to the analysis of GHCN (Global Historical Climatology Network) data. The main focus is on extracting and analyzing brightness information (BI) and built-up area (BU) for climate stations, using datasets from Google Earth Engine (GEE). The repository contains multiple files, each with a specific purpose:

Fields Explained

Built-up Area (BU)

Brightness Information (BI)

Parameters in GHCNv4_stations_with_BI_BU_orwell2022.csv

The CSV file contains the following columns:

GEE Data Sources

GHCN_US-vs-global.ipynb Notebook

This notebook provides a comparative analysis of temperature trends between US stations and global stations, using GHCN data. It includes visualizations and statistical comparisons to understand how temperature trends differ regionally and globally.

How to Use

Some pictures and examples

image

image

Interactive map showing all stations which deviate being BI = 0-0.1 both in the old (NASA) and the new (orwell2022) BI analysis.

image

Mapping Brightness and Built-Up Indexes Across the Globe

In this analysis, we explore how NASA’s Brightness Index (BI) compares to actual built-up areas around meteorological stations. Using data from NASA’s GHCNv4 and VIIRS/NOAA, we mapped stations with minimal urbanization (BU_10km ≤ 1%).

Colors explained:

Purple: stations with a NASA-flagged BI above 15.
Red: stations with a NASA-flagged BI between 10 and 15.
Blue: stations with a NASA-flagged BI between 6 and 10.

Despite these classifications, all stations shown here have both BI_2020 < 1 and BU_10km < 1%, meaning they are in deep rural areas. This highlights how NASA’s BI misclassifies these locations, suggesting the need for improved metrics (e.g. use of GHSL_S) to distinguish urban from rural areas.

Footnote: The colors are used to quickly highlight the most extreme misclassifications. All points are rural, with BU_10km < 1%.
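The selection described above can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the map: the sample rows are made up, the column names `station_id` and `BI_nasa` are assumptions (only `BI_2020` and `BU_10km` are named in this README), and the handling of the exact boundary values 6, 10 and 15 is a guess.

```python
import csv
import io

# Made-up sample rows for illustration only. The real data lives in
# GHCNv4_stations_with_BI_BU_orwell2022.csv; "station_id" and "BI_nasa"
# are hypothetical column names.
SAMPLE = """station_id,BI_nasa,BI_2020,BU_10km
A,18,0.2,0.3
B,12,0.5,0.9
C,7,0.1,0.0
D,3,0.4,0.2
E,20,5.0,12.0
"""


def color_for(bi_nasa):
    """Map a NASA BI value to a map color (boundary handling is assumed)."""
    if bi_nasa > 15:
        return "purple"
    if bi_nasa >= 10:
        return "red"
    if bi_nasa >= 6:
        return "blue"
    return None  # BI below 6: not highlighted


def misclassified_rural(rows):
    """Return (station_id, color) for rural stations with a high NASA BI."""
    result = []
    for row in rows:
        # Rural criterion from the text: BI_2020 < 1 and BU_10km < 1%.
        is_rural = float(row["BI_2020"]) < 1 and float(row["BU_10km"]) < 1
        color = color_for(float(row["BI_nasa"]))
        if is_rural and color is not None:
            result.append((row["station_id"], color))
    return result


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
flagged = misclassified_rural(rows)
print(flagged)
```

To try this on the real file, replace `io.StringIO(SAMPLE)` with an opened file handle and adjust the column names to whatever the CSV actually uses.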

image

Correlation plot BI_2020_sqr versus BU_2km, colored by old BI values. 🔴 for BI>6.

Brightness Index (BI) is nearly a random number generator, failing to rigorously separate Rural 🔴 from Urban ⚫️ stations (“R”, “U” mark the ensemble’s weight centers). This is why no difference appears in the temperature curves either. Proper GHSL data should be used.

image

Trend lines (added feature on Nov 6th 2024, due to US Election Boredout Syndrome)

Motivation - to compare with Java. E.g. here

image

What can you do with this?

Extract trend slopes for various BU bins with a flexible start point.
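As a rough sketch of what such an extraction could look like, the snippet below groups records into BU bins, averages the anomalies per year from a chosen start year onward, and fits an ordinary least-squares line per bin. The bin edges, the record layout `(year, bu_percent, anomaly)`, the per-decade unit and the function names are all assumptions for illustration; the notebook may do this differently.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def bin_label(bu, edges):
    """Return the label of the half-open bin [lo, hi) containing bu."""
    for lo, hi in zip(edges, edges[1:]):
        if lo <= bu < hi:
            return f"{lo}-{hi}"
    return None  # outside all bins


def slopes_by_bu_bin(records, edges, start_year):
    """Trend per BU bin, in anomaly units per decade, from start_year on.

    records: iterable of (year, bu_percent, anomaly) tuples (assumed layout).
    """
    sums = {}  # bin label -> year -> [sum of anomalies, count]
    for year, bu, anomaly in records:
        if year < start_year:
            continue
        label = bin_label(bu, edges)
        if label is None:
            continue
        cell = sums.setdefault(label, {}).setdefault(year, [0.0, 0])
        cell[0] += anomaly
        cell[1] += 1

    slopes = {}
    for label, per_year in sums.items():
        years = sorted(per_year)
        if len(years) < 2:
            continue  # a trend needs at least two years
        means = [per_year[y][0] / per_year[y][1] for y in years]
        slopes[label] = ols_slope(years, means) * 10  # per year -> per decade
    return slopes


# Synthetic data: one "rural" and one "urban" station with known trends.
records = []
for year in range(1980, 2021):
    records.append((year, 0.5, 0.01 * (year - 1980)))   # BU 0.5 %
    records.append((year, 50.0, 0.03 * (year - 1980)))  # BU 50 %

slopes = slopes_by_bu_bin(records, edges=[0, 1, 10, 100], start_year=1990)
print(slopes)
```

Changing `start_year` is what makes the start point flexible; the synthetic series here are exactly linear, so the fitted slopes do not depend on it.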

image

The analysis on the right is done ad-hoc (just feed GPT with the trend values and do the plot). It depends on the experiment performed.

image

Comparing adjusted versus raw.

image

Note on output file GHCNv4_stations_with_BI_BU_orwell2022.csv

https://github.com/orwell2024/GHCN-tools/blob/main/GHCNv4_stations_with_BI_BU_orwell2022.csv

The image is a global map of GISS GHCN stations, showing for which of them built-up land data could be retrieved from GHSL.

Gray dots represent all existing stations in the dataset. Red dots highlight 386 stations missing from the fetched GHSL dataset, meaning built-up data could not be retrieved for these locations. The missing stations are primarily located in polar regions (e.g., above ~70°N and Antarctica), suggesting a limitation in GHSL data coverage at extreme latitudes. A likely cause is gaps in the GHSL dataset, where satellite-derived built-up data is unavailable for remote locations.

Gray dots: existing stations. Red dots: stations missing from the fetched GHSL dataset. Total stations in fetched dataset: 27,519. Stations missing from the fetched GHSL list: 386.

image
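A check like this can be reproduced with a simple set difference between the full station list and the station IDs present in the fetched dataset. The sketch below uses made-up station IDs and latitudes, and the 70° cutoff for "polar" is an assumption taken loosely from the description above.

```python
def missing_stations(all_stations, fetched_ids):
    """Return IDs from all_stations that are absent from fetched_ids.

    all_stations: dict mapping station ID -> latitude in degrees.
    fetched_ids: iterable of station IDs found in the fetched GHSL data.
    """
    fetched = set(fetched_ids)
    return [sid for sid in all_stations if sid not in fetched]


def polar_share(station_ids, all_stations, cutoff_deg=70.0):
    """Fraction of the given stations poleward of cutoff_deg (assumed cutoff)."""
    if not station_ids:
        return 0.0
    polar = [sid for sid in station_ids if abs(all_stations[sid]) >= cutoff_deg]
    return len(polar) / len(station_ids)


# Made-up example: four stations, two of which lack GHSL data.
all_stations = {"S1": 48.1, "S2": 78.9, "S3": -75.0, "S4": 10.0}
fetched_ids = ["S1", "S4"]

missing = missing_stations(all_stations, fetched_ids)
print(missing)
print(polar_share(missing, all_stations))
```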