Varnish
The DataDome Varnish module detects and protects against bot activity.
Before the regular Varnish process starts, the module makes a call to the DataDome API using a KeepAlive connection.
Depending on the API response, the module will either block the query or let Varnish proceed with the regular caching process.
The module is designed to protect the user experience: if an error occurs during the process, or if the timeout is reached, the module automatically disables blocking and allows the hit.
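The fail-open policy described above can be pictured with a small shell sketch. It only illustrates the policy and is not the module's code (the module performs this check internally). The function name `datadome_decision` is hypothetical, and treating every status other than 401/403 as "allow" is an assumption based on the DataDome_status table in the FAQ on this page.

```shell
# Illustrative only: maps the outcome of an API call to a decision.
# $1 is the HTTP status returned by the API, or an empty string when the
# call failed or timed out (a convention chosen for this sketch).
datadome_decision() {
    status=$1
    case "$status" in
        401|403) echo "block" ;;   # the API asked to block the hit
        *)       echo "allow" ;;   # 200, errors, timeouts: fail open
    esac
}
```

For example, `datadome_decision 403` prints `block`, while `datadome_decision ""` (an error or timeout) prints `allow`.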
Compatibility
Varnish
- 3.0.3+
- 4.0, 4.1
- 5.0, 5.1, 5.2
- 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6
- 7.0, 7.1
Operating System
- Debian 7 to 12
- Ubuntu 14/16/18/20/22 (LTS versions)
- CentOS 6/7
- SUSE 13/15/42
- Fedora 34/35/36/37
The following operating systems will no longer be supported starting 01 February 2023:
- Fedora 16 to 33 (vendor announcement)
Manual installation from repository
If you use a Linux distribution that ships Varnish in its base repository, such as Debian, Ubuntu, Fedora or openSUSE, you can use our repository to install the module:
Manual installation from source
Install compilation dependencies:
- For Varnish Cache (open source project):
apt-get install automake make libtool python3-docutils libvarnishapi-dev pkg-config libpcre3-dev libreadline-dev
yum install automake make libtool python-docutils varnish-libs-devel
- For Varnish Enterprise (product):
apt-get install automake make libtool python3-docutils varnish-plus-devel pkg-config libpcre3-dev libreadline-dev
yum install automake make libtool python-docutils varnish-plus-devel
Compile and install DataDome vmod:
rm -f DataDome-Varnish-latest.tgz
wget https://package.datadome.co/linux/DataDome-Varnish-latest.tgz
tar -zxvf DataDome-Varnish-latest.tgz
cd DataDome-VarnishDome-*
./autogen.sh
./configure
make
make install
# Copy `datadome.vcl` to your VCL folder
cp datadome.vcl YOUR_VARNISH_VCL_FOLDER
## Varnish 3.0 requires the Varnish sources on the server.
## If Varnish was installed from a binary package, set up the Varnish sources first (step 1).
## If it was built from source, go to step 2.
# 1 - Set up the Varnish sources
wget https://varnish-cache.org/_downloads/varnish-3.0.7.tgz
tar -zxvf varnish-3.0.7.tgz
cd varnish-3.0.7
./autogen.sh
./configure
make
export VARNISHSRC=$PWD
cd ..
# 2 - Set up the DataDome module
rm -f DataDome-Varnish-latest.tgz
wget https://package.datadome.co/linux/DataDome-Varnish-latest.tgz
tar -zxvf DataDome-Varnish-latest.tgz
cd DataDome-VarnishDome-*
./autogen.sh
./configure
make
make install
# Copy `datadome.varnish3.vcl` to your VCL folder as `datadome.vcl`
cp datadome.varnish3.vcl YOUR_VARNISH_VCL_FOLDER/datadome.vcl
VCL integration
In the datadome.vcl file:
- Add the License Key
- Add the API Server endpoint
In the main VCL file:
- Add the line include "datadome.vcl"; after the backend configuration and before any subroutine (example below)
#
# This is an example VCL file for Varnish.
#
# It does not do anything by default, delegating control to the
# builtin VCL. The builtin VCL is called when there is no explicit
# return statement.
#
# See the VCL chapters in the Users Guide at https://www.varnish-cache.org/docs/
# and http://varnish-cache.org/trac/wiki/VCLExamples for more examples.
# Marker to tell the VCL compiler that this VCL has been adapted to the
# new 4.0 format.
vcl 4.0;
# Default backend definition. Set this to point to your content server.
backend default {
    .host = "127.0.0.1";
    .port = "80";
}
include "datadome.vcl";
sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
}
sub vcl_backend_response {
    # Happens after we have read the response headers from the backend.
    #
    # Here you clean the response headers, removing silly Set-Cookie headers
    # and other mistakes your backend does.
}
sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    # You can do accounting or modifying the final object here.
}
VCL logic
If the same subroutine is defined multiple times, Varnish concatenates the definitions in order of appearance. That is why you can include the datadome.vcl file without altering your subroutines.
If your VCL contains logic based on the number of restarts, increase the threshold accordingly, because the DataDome module adds a restart for each hit.
Settings
Setting | Description | Required | Default |
---|---|---|---|
data_dome_shield.init (data_dome_backend, "KEY"); | Replace "KEY" with your own DataDome License Key | Yes | |
host | The DataDome API server's hostname (see the available endpoints) | Yes | apivarnish.datadome.co |
connect_timeout | Timeout for establishing a new connection to the API (in ms) | Optional | 150 |
first_byte_timeout and between_bytes_timeout | Timeouts for regular API calls (in ms) | Optional | 50 |
data_dome_shield.uri_regex | Processes matching URIs only | Optional | |
data_dome_shield.uri_regex_exclusion | Ignores all matching URIs | Optional | Excludes static assets |
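The two regex settings can be pictured as a two-step filter. The sketch below rests on an assumption about how they combine: a URI is processed when it matches uri_regex (if one is set) and does not match uri_regex_exclusion. This is not taken from the module's source. It uses `grep -E`, whose extended regex syntax differs from the PCRE syntax Varnish uses, so it is only suitable for trying out simple patterns. The function name and the sample patterns are hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) when a URI would be sent to the API.
# $1 = URI
# $2 = inclusion regex (empty means "match everything")
# $3 = exclusion regex (empty means "exclude nothing")
uri_is_processed() {
    uri=$1
    include=$2
    exclude=$3
    if [ -n "$include" ] && ! printf '%s\n' "$uri" | grep -Eq -- "$include"; then
        return 1    # does not match the inclusion regex
    fi
    if [ -n "$exclude" ] && printf '%s\n' "$uri" | grep -Eq -- "$exclude"; then
        return 1    # matches the exclusion regex
    fi
    return 0
}
```

For example, `uri_is_processed "/public/app.js" "^/public" '\.(css|js|png)$'` fails because the sample exclusion pattern matches, while the same call with `/public/page` succeeds.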
Host/Endpoint IP address changes
- Our host/endpoint IP addresses may change occasionally, which can affect systems that rely on static IP addresses.
- Varnish performs a DNS lookup on startup to identify our endpoint.
- If an IP address changes after Varnish has already started, it will not be automatically recognized.
- To mitigate this, whenever an IP address is updated, our support team will reach out to ensure that a manual reload is performed to trigger a new DNS lookup:
sudo service varnish reload
- This will refresh the IP address lookup and allow Varnish to reconnect with the updated IP.
- For more information on varnish reload, see the official Varnish documentation.
(During these transitions, DataDome always keeps the original IP address active to ensure continuous availability for our customers.)
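If you would rather detect an address change yourself than wait for a notification, a check along the following lines could be scheduled. This is a minimal sketch and not something DataDome ships: the function names and the state file path are hypothetical, it relies on `getent hosts` being available, and it assumes the default endpoint apivarnish.datadome.co from the Settings table (use the host configured in your datadome.vcl). If the endpoint resolves to several rotating addresses, this simple comparison may trigger unnecessary reloads.

```shell
# Resolve the first address of a hostname via the system resolver.
resolve_first_ip() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# Succeeds when a non-empty current address differs from the recorded one.
# An empty current address (failed lookup) never counts as a change.
ip_changed() {
    recorded=$1
    current=$2
    [ -n "$current" ] && [ "$recorded" != "$current" ]
}

# Example wiring (commented out because it reloads Varnish):
# state=/var/tmp/datadome_api_ip          # hypothetical state file
# current=$(resolve_first_ip apivarnish.datadome.co)
# if ip_changed "$(cat "$state" 2>/dev/null)" "$current"; then
#     sudo service varnish reload && printf '%s\n' "$current" > "$state"
# fi
```

On the first run there is no recorded address, so the wiring above performs one reload and then records the current address.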
FAQ
How does the module work?
The module should be injected in varnish.vcl before including any other logic.
In the vcl_recv stage, the module checks whether the request should be processed by the DataDome API by calling data_dome_shield.is_suitable. The function returns "true" when the requested URL passes the regex settings and the request has not been restarted (a restarted request has already been handled by DataDome). Otherwise, the function returns "false" and Varnish handles the request as usual.
The function data_dome_shield.prepare_request replaces the original request with a request to the API and switches the original backend to the DataDome backend.
The function data_dome_shield.restore_request reverts to the original request.
The module doesn't impact the Varnish caching logic because when the request is allowed, the original Varnish process is restored.
What should I do when I upgrade to a new version of Varnish?
Upgrading to a new version of Varnish requires rebuilding the DataDome module. This also includes minor version upgrades.
Can I activate DataDome on a specific Domain/IP?
If you would like to process only a specific virtual host or path, you can do so by editing vcl_recv.
Below is an example that enables the DataDome module only for requests whose path starts with /public, on the domain test.com or for the client IP A.B.C.D.
sub vcl_recv {
    # check that the host is test.com (or the client IP is A.B.C.D)
    # and that the URL starts with /public
    if ((req.http.host == "test.com" || client.ip == "A.B.C.D") && req.url ~ "^/public") {
        # usual DataDome configuration
        if (data_dome_shield.is_suitable()) {
            data_dome_shield.prepare_request();
            return (pass);
        }
    }
}
Can I get Bot Name, Bot Type and Bot/Human flags in my application?
The DataDome module can inject headers in the HTTP request that can be read by your application.
You can find more information here.
How can I add Bot information in logs?
Bot information can be added to the web server logs. For instance, to create an access log that contains the request URI, the is-bot flag, the API server response status and the response time, you can run varnishncsa as follows:
sudo varnishncsa -a -w /var/log/varnish/datadome.log -D -P /var/run/varnishncsa_datadome.pid -F '%h %l %u %t "%r" "%{X-DataDome-isbot}i %{VCL_Log:DataDome_status}x %{VCL_Log:DataDome_spent_time}x"'
DataDome_status can have the following values:
DataDome_status | Description |
---|---|
200 | Hit was allowed |
401/403 | Hit was blocked |
700 | Hit does not match the regex and was not analyzed by DataDome |
704 | API server response does not contain the expected X-DataDomeResponse header |
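To get a quick overview of these values, the log can be summarized per status. The sketch below assumes the log was written with the -F format shown above, so that the last quoted field of each line holds the is-bot flag, DataDome_status and DataDome_spent_time separated by spaces. The function name is hypothetical and the sample line in the comment is made up for illustration.

```shell
# Reads varnishncsa lines on stdin and prints "<status> <count>" per status.
# Expected line shape (hypothetical sample):
#   192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" "0 200 12"
datadome_status_counts() {
    awk -F'"' '
        NF >= 3 {
            # The last quoted field is the one before the trailing quote.
            n = split($(NF - 1), parts, " ")
            if (n == 3) counts[parts[2]]++
        }
        END { for (s in counts) print s, counts[s] }
    ' | sort
}
```

Usage: `datadome_status_counts < /var/log/varnish/datadome.log`. Lines that do not end with a three-part quoted field are skipped.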
Can I use a custom installation folder?
The source tree is based on autotools to configure the build, and also has the necessary bits in place to execute functional unit tests using the varnishtest tool.
Making a build requires the Varnish header files and uses pkg-config to find the necessary paths.
Usage:
./autogen.sh
./configure
If you have installed Varnish to a non-standard directory, call autogen.sh and configure with PKG_CONFIG_PATH pointing to the appropriate path.
For example, when varnishd was configured with --prefix=$PREFIX:
PKG_CONFIG_PATH=${PREFIX}/lib/pkgconfig
export PKG_CONFIG_PATH
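Before running autogen.sh and configure, it can help to confirm that pkg-config actually finds Varnish with the path you exported. The following is a minimal sketch: the function name is hypothetical, and it assumes the Varnish development files install a varnishapi.pc file that defines a vmoddir variable, as recent Varnish releases do.

```shell
# Succeeds when pkg-config can locate the given module (default: varnishapi)
# and prints the vmod directory recorded in its .pc file.
check_pkgconfig_module() {
    module=${1:-varnishapi}
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config is not installed" >&2
        return 1
    fi
    if ! pkg-config --exists "$module"; then
        echo "pkg-config cannot find $module; check PKG_CONFIG_PATH" >&2
        return 1
    fi
    pkg-config --variable=vmoddir "$module"
}
```

Calling `check_pkgconfig_module` with no argument checks varnishapi. A failure at this point usually means PKG_CONFIG_PATH does not point to the directory that contains varnishapi.pc.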
Make targets:
make # builds the vmod.
make install # installs your vmod.
make check # runs the unit tests in src/tests/*.vtc
make distcheck # runs check and prepares a tarball of the vmod.
Can I change the setup folder?
By default, the vmod configure script installs the built vmod in the same directory as Varnish, determined via pkg-config(1). The vmod installation directory can be overridden by passing the VMOD_DIR variable to configure.
Other files such as man pages and documentation are installed in the locations determined by configure, which inherits its default --prefix setting from Varnish.
Can I build a DataDome module package (DEB/RPM)?
This module can be built as an RPM or DEB package. The commands below are examples:
rpmbuild -ba vmod-data_dome_shield.spec
dpkg-buildpackage -us -uc
# Varnish 3.0: point the build at the Varnish sources
rpmbuild -ba --define 'VARNISHSRC /home/user/varnish-3.0.7' vmod-data_dome_shield.spec
env DEBIAN_VARNISH_SRC=/home/user/varnish-3.0.5 dpkg-buildpackage -us -uc
Can I activate a debug mode?
- Configure: error: Need varnish.m4 -- see README.rst
Check that PKG_CONFIG_PATH has been set correctly before calling autogen.sh and configure.
- How to enable debug?
Rebuild the module with the --enable-debug option passed to configure.
How can I use our corporate proxy?
You can use your corporate proxy with our modules by changing your configuration in datadome.vcl:
# DataDome API Backend
backend data_dome_backend {
    # add the following line (replace with the DataDome API server closest to you)
    .host_header = "api-fixed.datadome.co";
    # replace the DataDome API server hostname with your corporate proxy
    .host = "corporate-proxy";
    .port = "3128";
    ...
}
sub vcl_init {
    # after data_dome_shield.init, add the following line (replace with the DataDome API server closest to you)
    data_dome_shield.validate_url("http://api-fixed.datadome.co/validate-request/");
}