Working with the Green Web Open Datasets

The Green Web Foundation regularly publishes a dataset of green domain names, and who hosts them. We refer to this as the green domains dataset.

This data closely follows the data available through the Green Web API. Generally speaking, any analysis you might use the Green Web API for can also be done with the published datasets, without needing to hit the API for each check.

Understanding the green_domains dataset

Every check of a website on the Green Web Foundation platform is recorded in a table called greenchecks. As of February 2021, this table is more than 3 billion rows long, which makes it rather unwieldy to work with.

For this reason, the dataset we publish contains a smaller table, greendomains, listing each url checked and its status, with the columns below.

Column             Description
id                 the id of the last check
url                the url checked
hosted_by          the organisation hosting this site
hosted_by_website  the website of the company providing the hosting for this site
partner            does this url belong to one of the Green Web partner organisations?
green              is this a green domain? 1 for yes, 0 for no
hosted_by_id       the id of the hosting company
modified           the time and date of the last check of this url
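
To make the table layout concrete, here is a minimal sketch in Python that builds a small in-memory SQLite table with the columns described above and counts green domains per hosting organisation. Several things here are assumptions rather than facts about the published files: the column names are taken to be hosted_by, hosted_by_website and hosted_by_id, the column types are guesses, partner is treated as a 0/1 flag, and every row is made up for illustration.

```python
import sqlite3

# Hypothetical schema mirroring the columns described above. The names and
# types are assumptions; the published dataset may differ.
SCHEMA = """
CREATE TABLE greendomains (
    id INTEGER PRIMARY KEY,
    url TEXT,
    hosted_by TEXT,
    hosted_by_website TEXT,
    partner INTEGER,
    green INTEGER,
    hosted_by_id INTEGER,
    modified TEXT
)
"""

# Made-up sample rows, for illustration only.
SAMPLE_ROWS = [
    (1, "example.org", "Example Host A", "host-a.example", 0, 1, 10, "2021-02-01 12:00:00"),
    (2, "example.net", "Example Host A", "host-a.example", 0, 1, 10, "2021-02-02 09:30:00"),
    (3, "example.com", "Example Host B", "host-b.example", 0, 0, 20, "2021-02-03 18:45:00"),
]


def green_domains_per_host(conn):
    """Return (hosted_by, count) pairs for green domains, largest count first."""
    query = """
        SELECT hosted_by, COUNT(*) AS domain_count
        FROM greendomains
        WHERE green = 1
        GROUP BY hosted_by
        ORDER BY domain_count DESC, hosted_by
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO greendomains VALUES (?, ?, ?, ?, ?, ?, ?, ?)", SAMPLE_ROWS
)
print(green_domains_per_host(conn))
```

With a real copy of the dataset, the same query could be pointed at the downloaded file instead of the in-memory sample, once the column names have been checked against it.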

Example uses of this dataset

Because this dataset provides similar data to the greencheck API, it can work like an offline cache in cases where making an API call for each check would either be too slow or leak data about your users that you would not want to share.
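
The offline cache idea can be sketched as a local lookup that never leaves your machine. The example below is illustrative only: the function names hostname_of and is_green are hypothetical, the demo table holds just two made-up rows, and it assumes the url column stores bare lowercase hostnames, which you should verify against the real data.

```python
import sqlite3
from urllib.parse import urlsplit


def hostname_of(url_or_domain):
    """Reduce 'https://Example.org/path' or 'example.org' to 'example.org'."""
    candidate = url_or_domain.strip().lower()
    if "://" not in candidate:
        # A leading '//' lets urlsplit treat the value as a network location.
        candidate = "//" + candidate
    return urlsplit(candidate).hostname or ""


def is_green(conn, url_or_domain):
    """Return True or False from the local table, or None if the domain is absent."""
    row = conn.execute(
        "SELECT green FROM greendomains WHERE url = ?",
        (hostname_of(url_or_domain),),
    ).fetchone()
    return None if row is None else bool(row[0])


# Demo table with made-up rows; a real cache would open the downloaded dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE greendomains (url TEXT PRIMARY KEY, green INTEGER)")
conn.executemany(
    "INSERT INTO greendomains VALUES (?, ?)",
    [("example.org", 1), ("example.com", 0)],
)

print(is_green(conn, "https://Example.org/about"))
print(is_green(conn, "unknown.example"))
```

Returning None for domains that are not in the table keeps "never checked" distinct from "checked and not green", so a caller can decide whether to fall back to the API in that case.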

Licensing of the data

This dataset is released under the Open Database Licence.

Getting support with using the Green Web Foundation datasets

We provide limited, free support for using the Green Web Datasets we publish, and are happy to provide advice or answer questions about this data if you want to use it in classes or research.

If you're interested in further analysis about the shift of the web away from fossil fuels, the Green Web Foundation has data going back to 2009, and we're happy to collaborate. Get in touch at hello@thegreenwebfoundation, or visit our contact page for more ways to reach us.