Varnish

The DataDome Varnish module detects bot activity and protects your site against it.

Before the regular Varnish process starts, the module makes a call to the DataDome API using a KeepAlive connection.
Depending on the API response, the module will either block the query or let Varnish proceed with the regular caching process.
The module is designed to protect the user experience: if an error occurs during the process, or if the timeout is reached, the module automatically disables its blocking process and allows those hits.

Compatibility

The DataDome module supports the following versions of Varnish:

  • 3.0.3+
  • 4.0, 4.1
  • 5.0, 5.1, 5.2
  • 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6
  • 7.0, 7.1
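
As a quick sanity check before installing, the installed Varnish version can be compared with the list above. The snippet below is only a sketch: the helper name is_supported_varnish is hypothetical, and it assumes that "3.0.3+" means 3.0.3 and later 3.0.x patch releases.

```shell
#!/bin/sh
# Sketch: check a Varnish version string against the supported list above.
# Assumption: "3.0.3+" means 3.0.3 and later 3.0.x patch releases.

# Returns 0 if the given version (e.g. "6.4.0") is in the supported list.
# The body runs in a subshell so the IFS change does not leak out.
is_supported_varnish() (
    IFS=.
    set -- $1                      # split "6.4.0" into 6, 4 and 0
    major=${1:-} minor=${2:-0} patch=${3:-0}
    case "$major.$minor" in
        3.0)          [ "$patch" -ge 3 ] 2>/dev/null ;;
        4.0|4.1)      true ;;
        5.0|5.1|5.2)  true ;;
        6.[0-6])      true ;;
        7.0|7.1)      true ;;
        *)            false ;;
    esac
)

# Only query varnishd when it is actually installed.
if command -v varnishd >/dev/null 2>&1; then
    # varnishd -V prints something like "varnishd (varnish-6.0.8 revision ...)" on stderr.
    installed=$(varnishd -V 2>&1 | sed -n 's/.*varnish-\([0-9][0-9.]*\).*/\1/p' | head -n 1)
    if is_supported_varnish "$installed"; then
        echo "Varnish $installed is in the supported list"
    else
        echo "Varnish $installed is not in the supported list" >&2
    fi
fi
```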

Every new release of the module is thoroughly tested on the following distributions:

  • Debian 7 to 10
  • Ubuntu 14/16/18/20/22 (LTS versions)
  • CentOS 6/7
  • SUSE 13/15/42
  • Fedora 34/35/36/37

🚧

The following operating systems will no longer be supported starting 01 February 2023

Manual installation from repository

If you use a Linux distribution that ships Varnish in its base repository, such as Debian, Ubuntu, Fedora, or openSUSE, you can use our repository to install the module:

Manual installation from source

To build the module from source, you must have the following libraries installed:

apt-get install automake make libtool python3-docutils libvarnishapi-dev pkg-config libpcre3-dev libreadline-dev
yum install automake make libtool python-docutils varnish-libs-devel
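
Before running the build, it can help to confirm that the tools installed above are actually on the PATH. The following is a minimal sketch, not part of the module: the helper name missing_commands is hypothetical, and checking for libtoolize (rather than libtool) and for the varnishapi pkg-config module is an assumption about how the packages above expose themselves.

```shell
#!/bin/sh
# Sketch: preflight check for the build prerequisites listed above.

# Print each command from the argument list that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

missing=$(missing_commands automake make libtoolize pkg-config)
if [ -n "$missing" ]; then
    echo "missing build tools:" $missing >&2
elif ! pkg-config --exists varnishapi; then
    # Assumption: the Varnish development package installs varnishapi.pc.
    echo "Varnish development files (varnishapi.pc) not found by pkg-config" >&2
else
    echo "build prerequisites look present"
fi
```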

Install the DataDome VMOD as follows:

rm -f DataDome-Varnish-latest.tgz
wget https://package.datadome.co/linux/DataDome-Varnish-latest.tgz
tar -zxvf DataDome-Varnish-latest.tgz
cd DataDome-VarnishDome-*
./autogen.sh
./configure
make
make install
# Copy `datadome.vcl` to your VCL folder 
cp datadome.vcl YOUR_VARNISH_VCL_FOLDER

## Varnish 3.0 requires the Varnish sources on the server.
## If Varnish was installed from a binary package, set up the Varnish sources first (step 1).
## If it was built from source, go to step 2.

# 1 - Setup Varnish source
wget https://varnish-cache.org/_downloads/varnish-3.0.7.tgz
tar -zxvf varnish-3.0.7.tgz
cd varnish-3.0.7
./autogen.sh
./configure
make
export VARNISHSRC=$PWD
cd ..

# 2 - Setup DataDome Module
rm -f DataDome-Varnish-latest.tgz
wget https://package.datadome.co/linux/DataDome-Varnish-latest.tgz
tar -zxvf DataDome-Varnish-latest.tgz
cd DataDome-VarnishDome-*
./autogen.sh
./configure
make
make install
# Copy `datadome.vcl` to your VCL folder 
cp datadome.varnish3.vcl YOUR_VARNISH_VCL_FOLDER/datadome.vcl

VCL integration

Set the parameters listed under Settings in the datadome.vcl file.

In your VCL file, add an include line right after the backend configuration and before any subroutine, as in the following example:

#
# This is an example VCL file for Varnish.
#
# It does not do anything by default, delegating control to the
# builtin VCL. The builtin VCL is called when there is no explicit
# return statement.
#
# See the VCL chapters in the Users Guide at https://www.varnish-cache.org/docs/
# and http://varnish-cache.org/trac/wiki/VCLExamples for more examples.

# Marker to tell the VCL compiler that this VCL has been adapted to the
# new 4.0 format.
vcl 4.0;

# Default backend definition. Set this to point to your content server.
backend default {
    .host = "127.0.0.1";
    .port = "80";
}

include "datadome.vcl";

sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
}

sub vcl_backend_response {
    # Happens after we have read the response headers from the backend.
    #
    # Here you clean the response headers, removing silly Set-Cookie headers
    # and other mistakes your backend does.
}

sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    # You can do accounting or modifying the final object here.
}

📘

VCL logic

If the same subroutine is defined multiple times, Varnish concatenates them by order of appearance. That's why you can include the datadome.vcl file without altering your subroutines.
If your VCL contains logic based on the number of restarts, increase the threshold accordingly, since the DataDome module adds one restart for each hit.

Settings

| Setting | Description | Required | Default |
| --- | --- | --- | --- |
| data_dome_shield.init (data_dome_backend, "KEY"); | Replace "KEY" with your own DataDome License Key | Yes | |
| host | The DataDome API server's hostname (see Available endpoints) | Yes | apivarnish.datadome.co |
| connect_timeout | Timeout used when initiating a new API connection (in ms) | Optional | 150 |
| first_byte_timeout and between_bytes_timeout | Timeouts for regular API calls (in ms) | Optional | 50 |
| data_dome_shield.uri_regex | Processes matching URIs only | Optional | |
| data_dome_shield.uri_regex_exclusion | Ignores all matching URIs | Optional | Excludes static assets |

FAQ

How does the module work?

The module should be injected in varnish.vcl before including any other logic.

In the vcl_recv stage, the module calls data_dome_shield.is_suitable to check whether the request should be processed by the DataDome API. The function returns "true" when the requested URL passes the configured regex filters and the request has not been restarted (a restarted request has already been handled by DataDome). Otherwise, the function returns "false" and lets Varnish handle the request as usual.

The function data_dome_shield.prepare_request replaces the original request with a request to the DataDome API and switches the backend from the original one to the DataDome backend.

The function data_dome_shield.restore_request reverts to the original request.

The module doesn't impact the Varnish caching logic because when the request is allowed, the original Varnish process is restored.

What should I do when I upgrade to a new version of Varnish?

Upgrading to a new version of Varnish requires rebuilding the DataDome module. This also includes minor version upgrades.

Can I activate DataDome on a specific Domain/IP?

If you would like to process a specific virtual host or path, you can achieve that by editing vcl_recv.

The example below enables the DataDome module only for requests whose path starts with /public, either on the domain test.com or from the IP A.B.C.D.

sub vcl_recv {
    # check that the host is test.com or the client IP is A.B.C.D, and that the URL starts with /public
    if ((req.http.host == "test.com" || client.ip == "A.B.C.D") && req.url ~ "^/public") {
      # usual DataDome configuration
      if (data_dome_shield.is_suitable()) {
          data_dome_shield.prepare_request();
          return (pass);
      }
    }
}

Can I get Bot Name, Bot Type and Bot/Human flags in my application?

The DataDome module can inject headers in the HTTP request that can be read by your application.
You can find more information here.

How can I add Bot information in logs?

Bot information can be injected into web server logs. For instance, to create an access log file that contains the request URI, the "is it a bot" flag, the API server response status, and the response time, you can use varnishncsa as follows:

sudo varnishncsa -a -w /var/log/varnish/datadome.log -D -P /var/run/varnishncsa_datadome.pid -F '%h %l %u %t "%r" "%{X-DataDome-isbot}i %{VCL_Log:DataDome_status}x %{VCL_Log:DataDome_spent_time}x"'

DataDome_status can have the following values:

| DataDome_status | Description |
| --- | --- |
| 200 | Hit was allowed |
| 401/403 | Hit was blocked |
| 700 | Hit doesn't match the regex and was not analyzed by DataDome |
| 704 | API server response does not contain the expected X-DataDomeResponse header |
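
To get a quick overview of these statuses, the log file produced by the varnishncsa command above can be tallied with awk. This is a rough sketch under stated assumptions: the function name summarize_datadome_status is hypothetical, and it assumes that the last quoted field holds exactly three space-separated values (bot flag, status, spent time), so the status is the second-to-last field on each line.

```shell
#!/bin/sh
# Sketch: count DataDome_status values in a log written with the format above.
# Assumption: the status is the second-to-last whitespace-separated field.

# Reads log lines from the given files (or stdin) and prints one counter per category.
summarize_datadome_status() {
    awk '
        NF < 2                             { count["other"]++; next }
                                           { status = $(NF - 1) }
        status == "200"                    { count["allowed"]++; next }
        status == "401" || status == "403" { count["blocked"]++; next }
        status == "700"                    { count["not_analyzed"]++; next }
        status == "704"                    { count["missing_header"]++; next }
                                           { count["other"]++ }
        END {
            n = split("allowed blocked not_analyzed missing_header other", keys, " ")
            for (i = 1; i <= n; i++) printf "%s=%d\n", keys[i], count[keys[i]]
        }
    ' "$@"
}

# Only read the log when it exists (path taken from the varnishncsa example).
log=/var/log/varnish/datadome.log
if [ -r "$log" ]; then
    summarize_datadome_status "$log"
fi
```

A high missing_header or other count relative to allowed and blocked may be worth investigating, since those hits were not given a normal allow or block decision.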

Can I use a custom installation folder?

The source tree is based on autotools to configure the build, and also has the necessary bits in place to execute functional unit tests using the varnishtest tool.

Making a build requires the Varnish header files and uses pkg-config to find the necessary paths.

Usage:

./autogen.sh
./configure

If you have installed Varnish in a non-standard directory, call autogen.sh and configure with PKG_CONFIG_PATH pointing to the appropriate path.
For example, if the Varnish configure script was called with --prefix=$PREFIX:

PKG_CONFIG_PATH=${PREFIX}/lib/pkgconfig
export PKG_CONFIG_PATH

Make targets:

make           # builds the vmod.
make install   # installs your vmod.
make check     # runs the unit tests in src/tests/*.vtc
make distcheck # runs the checks and prepares a tarball of the vmod.

Can I change the setup folder?

By default, the vmod configure script installs the built vmod in the same directory as Varnish, determined via pkg-config(1). The vmod installation directory can be overridden by passing the VMOD_DIR variable to configure.

Other files such as man pages and documentation are installed in the locations determined by configure, which inherits its default --prefix setting from Varnish.
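
Putting the custom Varnish prefix and the VMOD_DIR override together, a build could look roughly like the sketch below. The paths /opt/varnish and $PREFIX/lib/varnish/vmods are hypothetical examples, not defaults of the module; adjust them to your installation.

```shell
#!/bin/sh
# Sketch: build the vmod against a Varnish installed under a custom prefix
# and install it into a custom vmod directory.
# PREFIX and VMOD_DIR are hypothetical example paths.
PREFIX=${PREFIX:-/opt/varnish}
VMOD_DIR=${VMOD_DIR:-$PREFIX/lib/varnish/vmods}

# Let pkg-config find the Varnish build information under the custom prefix.
PKG_CONFIG_PATH="$PREFIX/lib/pkgconfig"
export PKG_CONFIG_PATH

# Only run the build from inside the unpacked module sources.
if [ -x ./autogen.sh ]; then
    ./autogen.sh &&
    ./configure VMOD_DIR="$VMOD_DIR" &&
    make &&
    make install
else
    echo "run this from the DataDome module source directory" >&2
fi
```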

Can I build a DataDome module package (DEB/RPM)?

This module can be built as an RPM or DEB package. The commands below are examples:

# Varnish 4.0 and later
rpmbuild -ba vmod-data_dome_shield.spec
dpkg-buildpackage -us -uc

# Varnish 3.0 (requires the path to the Varnish sources)
rpmbuild -ba --define 'VARNISHSRC /home/user/varnish-3.0.7' vmod-data_dome_shield.spec
env DEBIAN_VARNISH_SRC=/home/user/varnish-3.0.5 dpkg-buildpackage -us -uc

Can I activate a debug mode?

  • Configure: error: Need varnish.m4 -- see README.rst

    Check if PKG_CONFIG_PATH has been set correctly before calling autogen.sh and configure

  • How to enable debug?

    Rebuild the module, passing the --enable-debug option to configure.

How can I use my corporate proxy?

You can use your corporate proxy with our module by changing your configuration in datadome.vcl:

# DataDome API Backend
backend data_dome_backend {
    # add the following line (replace with the DataDome API server closest to you)
    .host_header = "api-fixed.datadome.co";
    # replace the DataDome API server hostname with your corporate proxy
    .host = "corporate-proxy";
    .port = "3128";
    ...
}

sub vcl_init {
   # after data_dome_shield.init, add the following line (replace with the DataDome API server closest to you)
   data_dome_shield.validate_url("http://api-fixed.datadome.co/validate-request/");
}