Zonestream is a tool that uses Kafka and websockets to stream data about changes in DNS zones in real time. The data is obtained via IXFR from open ccTLDs and from CT logs, and is adapted for use in the stream.
Zonestream is designed to allow real-time monitoring of DNS zone changes, which can be used for a variety of purposes, such as detecting potential security threats or tracking changes to specific domains. The use of Kafka and websockets allows for efficient and reliable streaming of large amounts of data, making the tool suitable for high-volume environments.
The IXFR extraction provides detailed information about DNS changes for enabled zones, while the CT logs extraction provides data on newly registered domains extracted from certificate transparency data. This data can be used to detect malicious or unauthorized changes to DNS zones, allowing for a quick and effective response to potential security threats.
Zonestream is developed under the OpenINTEL project, a research initiative of the University of Twente. It is designed to help security professionals and researchers monitor and detect malicious activity in the DNS infrastructure.
The following topics are currently available:

- newly_registered_domain: newly registered domain names extracted from CT logs.
- newly_registered_fqdn: FQDNs belonging to newly registered domain names extracted from CT logs.
- zone_diff: zone diffs extracted via IXFR for the .cd, .ch, .ee, .fj, .gp, .li, .mp, .mw, .ni, .nu, .sl and .se zones.
To connect to the Kafka stream, you can use the command-line tool kafkacat or a big data processing framework like Apache Spark.
Here is an example of how to use kafkacat to consume messages from the Kafka streams:
kafkacat -b kafka.zonestream.openintel.nl:9091 -t topic-name
Replace "topic-name" with the name of the topic you want to consume messages from.
Here is an example of how to use Apache Spark to consume messages from the Kafka streams:
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType
from pyspark.sql.functions import col, from_json

# Create a SparkSession
spark = SparkSession.builder.appName("Zonestream").getOrCreate()

# Define the schema for the Kafka stream
schema = StructType([
    StructField("domain", StringType()),
    StructField("cert_index", IntegerType()),
    StructField("ct_name", StringType()),
    StructField("timestamp", IntegerType()),
    StructField("confidence", IntegerType()),
])

# Read from the Kafka stream using the defined schema
df = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka.zonestream.openintel.nl:9091")
    .option("subscribe", "newly_registered_domain")
    .option("startingOffsets", "earliest")
    .load()
    .selectExpr("CAST(value AS STRING)")
    .select(from_json(col("value"), schema).alias("data"))
)
To connect to the websocket stream, you can use the websocket-client library in Python.
Here is an example of how to use the websocket-client library to connect to the websocket stream:
import websocket

# Callback definitions for the websocket events
def on_open(ws):
    print("Connection opened")

def on_message(ws, message):
    print(message)

def on_error(ws, error):
    print(error)

def on_close(ws, close_status_code, close_msg):
    print("Connection closed")

websocket.enableTrace(True)
ws = websocket.WebSocketApp(
    "wss://zonestream.openintel.nl/ws/newly_registered_domain",
    on_open=on_open,
    on_message=on_message,
    on_error=on_error,
    on_close=on_close,
)
ws.run_forever()
The zone_diff topic follows the OpenINTEL fDNS schema and is structured as a JSON record.

The newly_registered_domain topic produces JSON records with the following fields:

{
  "domain": "example.com",
  "cert_index": 123456789,
  "ct_name": "Example Log 2023",
  "timestamp": 16234567,
  "confidence": 1
}

The newly_registered_fqdn topic produces records with a similar format; the domain field is replaced by an fqdn field.

The confidence level indicates how close we are to the RDAP-reported registration date for a newly registered domain; lower levels mean higher confidence. Confirmed registrations are additionally published to the confirmed_newly_registered_domain topic.
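As a sketch of what consuming these records can look like, the snippet below parses a newly_registered_domain message with Python's standard json module and keeps only the most confident registrations. The record values and the confidence threshold are illustrative assumptions, not part of the Zonestream API.

```python
import json
from datetime import datetime, timezone

# Example newly_registered_domain record (illustrative values)
raw = '''{
  "domain": "example.com",
  "cert_index": 123456789,
  "ct_name": "Example Log 2023",
  "timestamp": 1623456700,
  "confidence": 1
}'''

record = json.loads(raw)

# Lower confidence levels mean higher confidence; keeping only level 1
# here is an illustrative choice, not a documented threshold.
if record["confidence"] <= 1:
    seen = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
    print(f"{record['domain']} first seen in {record['ct_name']} at {seen.isoformat()}")
```

The same filtering logic applies whether the record arrives over Kafka or the websocket stream, since both carry the same JSON payload.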
The data collected by the OpenINTEL platform has numerous applications in network and security research. To support these efforts, we make our open data available under the terms and conditions outlined on our terms page. As using our data may require specialized knowledge and specific analysis infrastructure, we encourage academic researchers to contact us to discuss their needs.

Closed data may be available upon request (more information here). If you have an interest in licensing access to our data for commercial purposes, feel free to contact us.