Prefix Top Lists Reloaded

A Temporal Prefix Ranking Dataset

Introduction

Internet measurement studies often begin with target selection, and domain-based top lists such as Tranco and Cisco Umbrella are a common starting point. However, many Internet phenomena are shaped by the underlying network infrastructure rather than by domain names alone.

Prefix Top Lists (PTLs) and AS Top Lists (ATLs) address this by translating domain popularity into rankings of BGP prefixes and Autonomous Systems. In doing so, they provide an infrastructure-centric complement to traditional domain-based rankings.

These datasets are designed to support target selection and longitudinal analysis at the prefix and AS level, helping researchers study the network entities behind popular Internet services over time.

Methodology

The PTL/ATL pipeline works as follows:

  1. Collect domain-based top lists from multiple sources, including Umbrella, Majestic, Tranco, CrUX, and Radar.
  2. Apply domain-name weighting, either through a Zipf-based ranking model or a presence-based frequency model.
  3. Stabilize rankings over time using a seven-day sliding window.
  4. Resolve domains through OpenINTEL DNS data and map the resulting IP addresses to BGP prefixes and origin ASes.
  5. Aggregate weights at the prefix and AS level to produce Prefix Top Lists (PTLs) and AS Top Lists (ATLs).
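
Steps 3 to 5 above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the OpenINTEL implementation: averaging is only one possible way to realize the seven-day window, the rule that every distinct prefix and AS behind a domain receives that domain's full weight is assumed, and the function names, the toy routing table, and the sample DNS data are all hypothetical.

```python
from collections import defaultdict
import ipaddress


def window_average(daily_weights, window=7):
    """Average each domain's weight over the last `window` days.

    Assumption: a domain missing on a given day counts as weight 0.
    """
    recent = daily_weights[-window:]
    totals = defaultdict(float)
    for day in recent:
        for domain, weight in day.items():
            totals[domain] += weight
    return {domain: total / len(recent) for domain, total in totals.items()}


def longest_prefix_match(ip, routes):
    """Return (prefix, origin_asn) for the most specific covering prefix, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, asn in routes.items():
        net = ipaddress.ip_network(prefix)
        if net.version != addr.version or addr not in net:
            continue
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, asn)
    return (str(best[0]), best[1]) if best else None


def aggregate(domain_weights, dns, routes):
    """Sum domain weights per prefix and per origin AS.

    Assumption: each distinct prefix/AS behind a domain gets the full weight once.
    """
    ptl, atl = defaultdict(float), defaultdict(float)
    for domain, weight in domain_weights.items():
        prefixes, asns = set(), set()
        for ip in dns.get(domain, []):
            match = longest_prefix_match(ip, routes)
            if match is not None:
                prefixes.add(match[0])
                asns.add(match[1])
        for prefix in prefixes:
            ptl[prefix] += weight
        for asn in asns:
            atl[asn] += weight
    return dict(ptl), dict(atl)


# Hypothetical inputs: seven identical days of per-domain weights,
# documentation-range addresses, and private-use AS numbers.
days = [{"example.com": 1.0, "example.org": 0.5}] * 7
dns = {
    "example.com": ["192.0.2.10"],
    "example.org": ["192.0.2.200", "198.51.100.7"],
}
routes = {
    "192.0.2.0/24": 64500,
    "192.0.2.128/25": 64501,
    "198.51.100.0/24": 64500,
}

ptl, atl = aggregate(window_average(days), dns, routes)
print(sorted(ptl.items(), key=lambda kv: kv[1], reverse=True))
print(sorted(atl.items(), key=lambda kv: kv[1], reverse=True))
```

In this toy input, 192.0.2.200 maps to the more specific /25 rather than the covering /24, which is why longest-prefix matching matters when attributing weight to prefixes.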

Dataset Variants

The published datasets include two main ranking variants:

  1. Zipf-weighted PTLs and ATLs, which give more influence to highly ranked domains
  2. Presence-based PTLs and ATLs, which emphasize how broadly domains appear across multiple source lists rather than their exact rank

These variants support different research goals, from studying highly popular infrastructure to analyzing broader coverage across the long tail.
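
The difference between the two variants can be illustrated with a small sketch. The formulas here are assumptions chosen for illustration: a 1/rank weight summed over lists stands in for the Zipf-based model, and the fraction of source lists containing a domain stands in for the presence-based model. The published datasets may use different parameters, and the list names and domains below are hypothetical.

```python
from collections import defaultdict


def zipf_scores(source_lists):
    """Sum an assumed 1/rank weight for each domain over all source lists."""
    scores = defaultdict(float)
    for ranked in source_lists.values():
        for rank, domain in enumerate(ranked, start=1):
            scores[domain] += 1.0 / rank
    return dict(scores)


def presence_scores(source_lists):
    """Fraction of source lists that contain each domain, ignoring rank."""
    counts = defaultdict(int)
    for ranked in source_lists.values():
        for domain in set(ranked):
            counts[domain] += 1
    total = len(source_lists)
    return {domain: count / total for domain, count in counts.items()}


# Hypothetical source lists, each ordered from most to least popular.
lists = {
    "list_a": ["a.example", "b.example", "c.example"],
    "list_b": ["a.example", "c.example"],
    "list_c": ["c.example", "a.example"],
}

print(zipf_scores(lists))
print(presence_scores(lists))
```

With these inputs, a.example and c.example tie under the presence model because both appear in every list, while the Zipf-style model ranks a.example higher because it holds better positions.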

Dataset Format

The prefix top list produces CSV records with the following fields:

  • prefix: the ranked BGP prefix
  • weight: the aggregated popularity score for that prefix
  • domains: the domains contributing to the prefix’s score
  • ips: the resolved IP addresses associated with those domains
  • ASes: the origin ASes associated with the mapped prefixes
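
A prefix top list with these fields could be loaded along the following lines. The sketch assumes a header row that uses the field names above and a semicolon as the separator inside the multi-valued fields; neither detail is confirmed here, so check an actual snapshot before relying on it. The sample rows are invented for illustration.

```python
import csv
import io

# Invented sample in the assumed layout (header row, ";" inside list fields).
SAMPLE = """prefix,weight,domains,ips,ASes
198.51.100.0/24,0.5,example.net,198.51.100.7,64501
192.0.2.0/24,1.5,example.com;example.org,192.0.2.10;192.0.2.20,64500
"""

MULTI_VALUED = ("domains", "ips", "ASes")


def read_ptl(handle, sep=";"):
    """Parse PTL records and return them sorted by descending weight."""
    rows = []
    for record in csv.DictReader(handle):
        row = dict(record)
        row["weight"] = float(record["weight"])
        for field in MULTI_VALUED:
            # Split list-valued fields; drop empty entries.
            row[field] = [value for value in record[field].split(sep) if value]
        rows.append(row)
    rows.sort(key=lambda r: r["weight"], reverse=True)
    return rows


for row in read_ptl(io.StringIO(SAMPLE)):
    print(row["prefix"], row["weight"], row["domains"])
```

For a real snapshot, the `io.StringIO(SAMPLE)` handle would be replaced by an opened file, with `sep` adjusted to whatever separator the files actually use.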

The AS top list produces CSV records with the following fields:

  • asn: the ranked Autonomous System Number
  • weight: the aggregated popularity score for that AS
  • prefix: the prefixes announced by that AS and included in the aggregation
  • domains: the domains contributing to the AS’s score
  • ips: the resolved IP addresses associated with those domains
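
Since the lists are meant to support target selection, one possible use of AS top list records is to pick the prefixes behind the highest-weighted ASes. The sketch below assumes the records have already been parsed into dictionaries with a numeric `weight` and a list-valued `prefix` field; the helper name `select_targets` and the sample records are hypothetical.

```python
def select_targets(atl_records, top_n):
    """Return the de-duplicated prefixes of the `top_n` highest-weighted ASes."""
    ranked = sorted(atl_records, key=lambda r: r["weight"], reverse=True)
    targets = []
    for record in ranked[:top_n]:
        for prefix in record["prefix"]:
            if prefix not in targets:  # keep first occurrence, preserve rank order
                targets.append(prefix)
    return targets


# Hypothetical, already-parsed ATL records.
records = [
    {"asn": 64501, "weight": 0.5, "prefix": ["198.51.100.0/24"]},
    {"asn": 64500, "weight": 1.5, "prefix": ["192.0.2.0/24", "203.0.113.0/24"]},
    {"asn": 64502, "weight": 0.1, "prefix": ["192.0.2.0/24"]},
]

print(select_targets(records, top_n=2))
```

Keeping the output in rank order lets a measurement budget be spent on the most popular infrastructure first.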

Coverage

The datasets are released as weekly snapshots.

Name | Description | Since | Interval | Type
Prefix Top Lists & AS Top Lists | Weekly rankings of BGP prefixes and Autonomous Systems derived from domain popularity data | 2025-03-17 | Weekly | Open
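
For longitudinal analysis it can help to enumerate the snapshot dates to fetch. The sketch below assumes a strict seven-day cadence anchored at the 2025-03-17 start date; the actual release dates may deviate, so treat the result as a list of candidates to verify against the download page.

```python
from datetime import date, timedelta

FIRST_SNAPSHOT = date(2025, 3, 17)  # "Since" date from the coverage overview


def snapshot_dates(until, first=FIRST_SNAPSHOT):
    """List assumed weekly snapshot dates from `first` up to and including `until`."""
    dates = []
    current = first
    while current <= until:
        dates.append(current)
        current += timedelta(days=7)
    return dates


for day in snapshot_dates(date(2025, 4, 1)):
    print(day.isoformat())
```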

Data Access & Terms

The data collected by the OpenINTEL platform has numerous applications in network and network security research. To support these efforts, we make our open data available on the download page or through the links provided in the coverage overview above, under the terms and conditions outlined on our terms page. As using our data may require specialized knowledge and specific analysis infrastructure, we encourage academic researchers to contact us to discuss their needs.

Closed data may be available upon request (more information here). If you are interested in licensing access to our data for commercial purposes, feel free to contact us.

Acknowledgements

This research received funding from the Dutch Research Council (NWO) under the projects UPIN and CATRIN.