On this page we provide technical background information about our lists of ccTLD domain names. This information is targeted at academic researchers.
The goal of the OpenINTEL project is to capture daily snapshots of the state of large parts of the global Domain Name System. Because the DNS plays a key role in almost all Internet services, recording this information allows us to track changes on the Internet, and thus its evolution, over longer periods of time. By performing our flagship active DNS measurements, we build consistent and reliable time series of the state of the DNS.
Our fDNS measurements are seeded with domain names that we extract from zone files or from list-based sources. To expand the coverage of our fDNS measurement into country-code TLDs, for which obtaining the full zone file is in most cases difficult or impossible, we started extracting lists of domain names from Certificate Transparency (CT) logs. CT data are public by design, but collecting and processing them at scale and longitudinally can be a challenge. Given that the lists of domain names hold value in themselves, we've decided to publish them. Our hope is that this will enable others to extend the coverage of their analyses or name-based measurement campaigns.
We continually ingest certificates contained in CT logs. We target all production logs from major operators (currently Google, DigiCert, Cloudflare, Let's Encrypt, Sectigo, and TrustAsia). We store all certificates in their entirety. We then post-process certificates and extract fully qualified domain names from the subject's Common Name and any and all DNS name(s) in the Subject Alternative Name extension, when present. From these FQDNs we extract registered domain names in a public-suffix aware manner (i.e., we extract the eTLD+1).
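The extraction step described above can be illustrated with a minimal sketch. It is not our production pipeline: the function names `registered_domain` and `registered_domains` are hypothetical, the suffix set is a tiny hand-picked sample rather than the full Public Suffix List, and wildcard and exception rules of the Public Suffix List are deliberately ignored.

```python
# Tiny illustrative suffix set (an assumption for this sketch); a real run
# would load the complete Public Suffix List instead.
PUBLIC_SUFFIXES = {"nl", "uk", "co.uk", "br", "com.br", "com"}


def registered_domain(fqdn, suffixes=PUBLIC_SUFFIXES):
    """Return the eTLD+1 of fqdn, or None if no registrable name can be derived."""
    name = fqdn.strip().lower().rstrip(".")
    if name.startswith("*."):
        name = name[2:]  # wildcard certificates: drop the leading "*."
    labels = name.split(".")
    if any(not label for label in labels):
        return None  # empty input or empty label, not a usable name
    # Scan from the longest candidate suffix to the shortest, so that
    # "co.uk" is matched before "uk".
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            if i == 0:
                return None  # the name itself is a public suffix
            return ".".join(labels[i - 1:])
    return None  # suffix not covered by the sample set


def registered_domains(common_name, san_dns_names):
    """Collect the eTLD+1 of the subject CN and of every SAN DNS name."""
    names = ([common_name] if common_name else []) + list(san_dns_names)
    found = set()
    for name in names:
        domain = registered_domain(name)
        if domain is not None:
            found.add(domain)
    return found
```

For example, a certificate with the Common Name `www.example.co.uk` and the SAN entries `*.shop.example.nl` and `example.nl` would contribute `example.co.uk` and `example.nl` under these assumptions. A Common Name that is not a domain name (such as an organisation name) yields nothing.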
We started collecting CT data in early 2020, which means our archive covers logs that have since been retired and are no longer in operation. We create the lists at weekly intervals, on Mondays. A domain name is included if it is contained in a certificate that is valid on the date to which the list corresponds. Note that this inclusion heuristic provides no guarantee that the domain name is actively delegated. In rare cases where the issuing CA used a cached domain validation token, the domain name may not even be registered anymore.
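The inclusion heuristic can be sketched as follows. This is a simplified illustration under stated assumptions: `mondays` and `domains_for_list` are hypothetical helper names, each certificate is reduced to a `(not_before, not_after, domains)` tuple, and validity bounds are compared as calendar dates rather than as the full timestamps that certificates carry.

```python
from datetime import date, timedelta


def mondays(start, end):
    """Yield every Monday d with start <= d <= end."""
    # date.weekday() returns 0 for Monday, so this moves forward to the
    # first Monday on or after start.
    current = start + timedelta(days=(0 - start.weekday()) % 7)
    while current <= end:
        yield current
        current += timedelta(days=7)


def domains_for_list(list_date, certs):
    """Return the domains of all certificates that are valid on list_date.

    certs is an iterable of (not_before, not_after, domains) tuples
    (an assumed, simplified representation of a certificate).
    """
    included = set()
    for not_before, not_after, domains in certs:
        if not_before <= list_date <= not_after:
            included.update(domains)
    return included
```

With this representation, a domain appears in every weekly list whose date falls inside the validity period of at least one certificate that names it, regardless of whether the domain is still delegated on that date.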
Our first list corresponds to the first Monday of January 2015. The underlying reason is that, as of January 1, 2015, Google Chrome required all EV certificates to be CT-compliant in order to show an EV indicator. A few years later, Chrome's CT requirement was extended to certificates of all types (DV, OV and EV) newly issued after April 30, 2018.
Please note that CT data can provide substantial but not absolute coverage of a ccTLD zone, as we've shown in our ACM SIGCOMM CCR paper.
The lists that we currently provide:
Name | Description | Since | Interval | Type |
---|---|---|---|---|
CTLog registered ccTLD domains | Registered domain names (i.e., eTLD+1) under country-code TLDs | 2015-01-05 | Weekly | Open |
The data collected by the OpenINTEL platform has numerous applications in network and network security research. To support these efforts, we make our open data available on the download page or through the links provided in the coverage overview above, under the terms and conditions outlined on our terms page. As using our data may require specialized knowledge and specific analysis infrastructure, we encourage academic researchers to contact us to discuss their needs.
Closed data may be available upon request (more information here). If you are interested in licensing access to our data for commercial purposes, feel free to contact us.