ZoneStream

Detecting changes in DNS zones in real time

Introduction

Zonestream is a tool that uses Kafka and websockets to stream data about changes in DNS zones in real time. The data is obtained via IXFR (incremental zone transfer) from open ccTLDs and from CT logs, then adapted for use in the stream.

Zonestream enables real-time monitoring of DNS zone changes, for purposes such as detecting potential security threats or tracking changes to specific domains. Streaming over Kafka or websockets handles large amounts of data efficiently and reliably, which makes the tool suitable for high-volume environments.

The IXFR extraction provides detailed information about DNS changes for enabled zones, while the CT log extraction provides newly registered domains extracted from certificate transparency data. This data can be used to detect malicious or unauthorized changes to DNS zones, allowing a quick and effective response to potential security threats.

Zonestream is developed under the OpenINTEL project, a research initiative of the University of Twente. It is designed to help security professionals and researchers monitor and detect malicious activity in the DNS infrastructure.

Available resources

The following topics are currently available:

  • newly_registered_domain (1)

    Newly registered domain names extracted from CT Logs.

  • newly_registered_fqdn (1)

    FQDNs belonging to newly registered domain names extracted from CT Logs.

  • zone_diff (2)

    Zone diffs extracted via IXFR.


  1. The domains extracted from CT logs include all top-level domains (TLDs) covered by the OpenINTEL project, with the exception of closed country code top-level domains (ccTLDs) that are under nondisclosure agreements (NDAs). For a comprehensive overview of the coverage, click here.
  2. Zone diff is currently supported for the following zones: .cd, .ch, .ee, .fj, .gp, .li, .mp, .mw, .ni, .nu, .sl and .se.
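
As an illustration of note 2, the following minimal sketch checks whether a given name falls under one of the zones listed as supporting zone_diff. The helper name `has_zone_diff` is hypothetical and not part of Zonestream; the zone list is copied from the note and would need updating if coverage changes.

```python
# Zones listed in note 2 as supporting zone_diff (copied from the text).
ZONE_DIFF_ZONES = frozenset(
    {"cd", "ch", "ee", "fj", "gp", "li", "mp", "mw", "ni", "nu", "sl", "se"}
)


def has_zone_diff(name: str) -> bool:
    """Return True if the rightmost label of `name` is a zone_diff zone.

    Hypothetical helper for illustration only. It compares the top-level
    label case-insensitively and tolerates a trailing root dot.
    """
    labels = name.strip().rstrip(".").lower().split(".")
    return labels[-1] in ZONE_DIFF_ZONES


print(has_zone_diff("example.se"))
print(has_zone_diff("www.example.com."))
```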

Connecting to the Kafka Stream

To connect to the Kafka stream, you can use the command-line tool kafkacat or a big data processing framework like Apache Spark.

Using kafkacat

Here is an example of how to use kafkacat to consume messages from the Kafka streams:

kafkacat -b kafka.zonestream.openintel.nl:9091 -t topic-name

Replace "topic-name" with the name of the topic you want to consume messages from.
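
The consumer output can be post-processed with a few lines of Python. The sketch below assumes that kafkacat writes one JSON payload per line (its default newline delimiter) and silently skips anything that is not a JSON object. The function name `iter_records` and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per line that holds a JSON object; skip other lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not valid JSON, e.g. a status line or a truncated message
        if isinstance(record, dict):
            yield record


# Hypothetical sample of consumer output, not real Zonestream data.
sample = [
    '{"domain": "example.com", "cert_index": 123456789, '
    '"ct_name": "Example Log 2023", "timestamp": 16234567}',
    "",
    "not json",
]

for record in iter_records(sample):
    print(record["domain"])
```

To process a live stream, the same function could be fed `sys.stdin` in a script that receives the kafkacat output through a pipe.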

Using Spark

Here is an example of how to use Apache Spark to consume messages from the Kafka streams:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# Create a SparkSession (the Spark Kafka connector package must be available)
spark = SparkSession.builder.appName("Zonestream").getOrCreate()

# Define the schema of the JSON records in the newly_registered_domain topic
schema = StructType([
    StructField("domain", StringType()),
    StructField("cert_index", IntegerType()),
    StructField("ct_name", StringType()),
    StructField("timestamp", IntegerType()),
])

# Read from the Kafka stream and parse each message value with the schema
df = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka.zonestream.openintel.nl:9091")
    .option("subscribe", "newly_registered_domain")
    .option("startingOffsets", "earliest")
    .load()
    .selectExpr("CAST(value AS STRING)")
    .select(from_json(col("value"), schema).alias("data"))
    .select("data.*")
)

# Print the parsed records to the console until the query is stopped
query = df.writeStream.format("console").start()
query.awaitTermination()

Connecting to the websocket stream

To connect to the websocket stream, you can use the websocket-client library in Python.

Using websocket-client

Here is an example of how to use the websocket-client library to connect to the websocket stream:

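import websocket

# Callbacks, using the signatures expected by websocket-client 1.x
def on_message(ws, message):
    print(message)  # each message is one JSON record from the topic
def on_error(ws, error):
    print("error:", error)
def on_close(ws, close_status_code, close_msg):
    print("connection closed")
def on_open(ws):
    print("connection opened")

websocket.enableTrace(True)  # verbose protocol logging, useful while debugging
ws = websocket.WebSocketApp(
    "wss://zonestream.openintel.nl/ws/newly_registered_domain",
    on_open=on_open,
    on_message=on_message,
    on_error=on_error,
    on_close=on_close,
)
ws.run_forever()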


JSON Schema Explanation

The zone_diff topic follows the OpenINTEL fDNS schema and is structured as a JSON record.


The newly_registered_domain topic produces JSON records with the following fields:

  • domain: Newly registered domain name.
  • cert_index: Certificate index of first appearance.
  • ct_name: CT Log name of first appearance.
  • timestamp: Certificate timestamp.
  • confidence: Closeness to the RDAP-reported registration date. Only present for the confirmed_newly_registered_domain topic.


The newly_registered_fqdn topic produces records in a similar format, with the domain field replaced by an fqdn field.
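
The field descriptions above can be captured in a small typed record. This is a sketch only: the class name `CtRecord`, the `is_fqdn` flag and the function `parse_ct_record` are hypothetical, and the field types are assumed from the descriptions rather than taken from a published schema.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CtRecord:
    """One record from a CT-based topic (field types are assumptions)."""
    name: str                         # value of "domain" or "fqdn"
    is_fqdn: bool                     # True if the record carried "fqdn"
    cert_index: int
    ct_name: str
    timestamp: int
    confidence: Optional[int] = None  # absent unless the topic provides it


def parse_ct_record(payload: str) -> CtRecord:
    """Parse a JSON payload that carries either a domain or an fqdn field."""
    data = json.loads(payload)
    if "domain" in data:
        name, is_fqdn = data["domain"], False
    elif "fqdn" in data:
        name, is_fqdn = data["fqdn"], True
    else:
        raise ValueError("record has neither 'domain' nor 'fqdn'")
    return CtRecord(
        name=name,
        is_fqdn=is_fqdn,
        cert_index=int(data["cert_index"]),
        ct_name=data["ct_name"],
        timestamp=int(data["timestamp"]),
        confidence=data.get("confidence"),
    )


# Made-up payload for illustration.
record = parse_ct_record(
    '{"fqdn": "www.example.com", "cert_index": 1, '
    '"ct_name": "Example Log 2023", "timestamp": 16234567}'
)
print(record.name, record.is_fqdn, record.confidence)
```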

Confidence

The confidence level indicates how close the observation is to the RDAP-reported registration date of a newly registered domain. Lower levels mean higher confidence, as follows:

  • Level 1: Within 5 minutes (highest)
  • Level 2: Within 15 minutes
  • Level 3: Within 30 minutes
  • Level 4: Within 1 hour
  • Level 5: Within 2 hours
  • Level 6: Within 4 hours
  • Level 7: Within 8 hours
  • Level 8: Within 12 hours
  • Level 9: Within 18 hours
  • Level 10: More than 18 hours, but less than 24 hours (lowest)
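
The level table can be read as a lookup over time thresholds. The sketch below shows one way to express that mapping. It is not the Zonestream implementation: the function name `confidence_level` is hypothetical, the upper bounds are assumed to be inclusive, and gaps of 24 hours or more are assumed to have no level.

```python
from bisect import bisect_left
from datetime import timedelta
from typing import Optional

# Upper bounds for levels 1 to 9, taken from the list above.
_BOUNDS = [
    timedelta(minutes=5), timedelta(minutes=15), timedelta(minutes=30),
    timedelta(hours=1), timedelta(hours=2), timedelta(hours=4),
    timedelta(hours=8), timedelta(hours=12), timedelta(hours=18),
]


def confidence_level(delta: timedelta) -> Optional[int]:
    """Map the gap to the RDAP registration date onto a level from 1 to 10.

    Assumptions (not confirmed by the source): bounds are inclusive, the
    sign of the gap is ignored, and gaps of 24 hours or more yield None.
    """
    delta = abs(delta)
    if delta >= timedelta(hours=24):
        return None
    # Index of the first bound that is >= delta; past the last bound is level 10.
    return bisect_left(_BOUNDS, delta) + 1


print(confidence_level(timedelta(minutes=3)))
print(confidence_level(timedelta(hours=19)))
```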

Zonestream Output Example

{
    "domain": "example.com",
    "cert_index": 123456789,
    "ct_name": "Example Log 2023",
    "timestamp": 16234567,
    "confidence": 1
}

Data Access & Terms

The data collected by the OpenINTEL platform has numerous applications in network and network security research. To support these efforts, we make our open data available under the terms and conditions outlined on our terms page. As using our data may require specialized knowledge and specific analysis infrastructure, we encourage academic researchers to contact us to discuss their needs.


Closed data may be available upon request (more information here). If you are interested in licensing access to our data for commercial purposes, feel free to contact us.