HAProxy

Supported versions

DataDome supports HAProxy and HAProxy Enterprise Edition versions until they reach their End of Life date.
You can find the maintenance schedule of each HAProxy version on the HAProxy website.


📘

Module version 2.0.0+

The configuration described below requires DataDome HAProxy module version 2.0.0 or later. If you are running an older version, refer to the migration guide or download the latest module.

Prerequisites

📘

HAProxy minimum version

The Lua HTTP client was introduced in HAProxy 2.5 and is fully operational for this module as of versions 2.8.12, 2.9.12, and 3.0.6 (see the HAProxy changelog).

HAProxy

You can check the version of HAProxy by running:

$ haproxy -v
HAProxy version 2.8.10-f28885f 2024/06/14 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2028.
Known bugs: http://www.haproxy.org/bugs/bugs-2.8.10.html
Running on: Linux 5.15.0-1066-aws #72~20.04.1-Ubuntu SMP Thu Jul 18 10:41:27 UTC 2024 x86_64
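
As a quick way to compare the reported version with the minimum patch releases listed above, the following Bash sketch may help. The function names extract_version and meets_minimum are hypothetical helpers, not part of HAProxy or the DataDome module, and sort -V assumes GNU coreutils or another sort that supports version ordering. Branches other than 2.8, 2.9, and 3.0 are reported as unknown rather than guessed.

```shell
#!/usr/bin/env bash

# Hypothetical helper: pull "2.8.10" out of the first line of `haproxy -v`.
extract_version() {
  sed -n 's/^HAProxy version \([0-9][0-9.]*\).*/\1/p' <<<"$1"
}

# Hypothetical helper: return 0 if the version is at or above the minimum
# patch release for its branch, 1 if it is below, and 2 if the branch is
# not one of those named in this guide.
meets_minimum() {
  local version="$1" min
  case "${version%.*}" in          # "2.8.10" -> "2.8"
    2.8) min="2.8.12" ;;
    2.9) min="2.9.12" ;;
    3.0) min="3.0.6" ;;
    *)   return 2 ;;
  esac
  # With version ordering, the minimum sorts first when version >= minimum.
  [ "$(printf '%s\n%s\n' "$min" "$version" | sort -V | head -n 1)" = "$min" ]
}

# Sample line taken from the output above; on a real host, use:
#   banner="$(haproxy -v | head -n 1)"
banner='HAProxy version 2.8.10-f28885f 2024/06/14 - https://haproxy.org/'
version="$(extract_version "$banner")"

if meets_minimum "$version"; then
  echo "HAProxy $version meets the minimum for its branch"
else
  echo "HAProxy $version is below the minimum (or its branch is not listed)"
fi
```

With the sample banner, the script reports that 2.8.10 is below the 2.8.12 minimum.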

You can confirm that HAProxy was built with Lua support by running:

$ haproxy -vv
HAProxy version 2.8.10-f28885f 2024/06/14 - https://haproxy.org/
[...]
Built with Lua version : Lua 5.4.7
[...]
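
For scripted checks, the same line can be extracted automatically. This is a minimal Bash sketch; lua_build_version is a hypothetical helper name, and it only assumes the "Built with Lua version" line format shown above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read `haproxy -vv` output on stdin and print the
# Lua version it reports; fail if no such line is present.
lua_build_version() {
  local v
  v="$(sed -n 's/^Built with Lua version : Lua \([0-9][0-9.]*\).*/\1/p')"
  [ -n "$v" ] && echo "$v"
}

# Sample text mirroring the output above; on a real host, use:
#   haproxy -vv | lua_build_version
sample='HAProxy version 2.8.10-f28885f 2024/06/14 - https://haproxy.org/
Built with Lua version : Lua 5.4.7'

if v="$(printf '%s\n' "$sample" | lua_build_version)"; then
  echo "Lua $v is built in"
else
  echo "No Lua version line found in the build information"
fi
```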

LuaSocket

📘

About timestamps in Lua

By default, Lua generates timestamps with one-second precision; the lua-socket dependency is required for millisecond precision. This enables more accurate latency measurements for calls to the Protection API.

The lua-socket package can be installed using:

apt install lua-socket -y

Lua detects the presence of this package at runtime; no further action is required to improve time measurements.
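
To sanity-check the installation from a shell, one option is to ask a standalone Lua interpreter to load the module. This is only a sketch: check_luasocket is a hypothetical helper, the interpreter name (lua by default) is an assumption that varies by distribution, and a standalone interpreter may not use the same module search paths as the Lua runtime embedded in HAProxy.

```shell
#!/usr/bin/env bash

# Hypothetical helper: report whether a standalone Lua interpreter can load
# LuaSocket. Pass another interpreter name (for example "lua5.4") if that
# is what your distribution installs.
check_luasocket() {
  local lua_bin="${1:-lua}"
  if ! command -v "$lua_bin" >/dev/null 2>&1; then
    echo "skipped"        # no standalone interpreter available to test with
  elif "$lua_bin" -e 'require("socket")' >/dev/null 2>&1; then
    echo "present"
  else
    echo "missing"
  fi
}

check_luasocket
```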

Configuration

Follow the steps below:

  1. Download the latest DataDome module and extract it in your HAProxy configuration directory.
    The archive includes the following files:
  • datadome.lua: a Lua script that handles the transformation of the HTTP request
  • core/helper.lua: Lua helpers to manipulate strings and HTTP requests
  • haproxy.cfg: an example of a working configuration
  2. Edit the HAProxy configuration file and set DATADOME_SERVER_SIDE_KEY to the server-side API key provided by DataDome. You can find this key in our dashboard.
  3. Update your HAProxy configuration file by replacing <PATH> with the actual path where you placed the files, and add the blocks described in the following steps.
  4. Create or use an existing DNS resolver to resolve the DataDome endpoint (api.datadome.co).
  5. Add a frontend api-datadome-tcp to handle requests to the DataDome Bot Protect API.
  6. Add a backend api-datadome-co that uses the DNS resolver to connect to the Bot Protect API.
  7. Add the request and response hooks to your main frontend, as in the example below.
    http-request lua.Datadome_request_hook if !excluded_files
    http-response lua.Datadome_response_hook

The HAProxy configuration file should look like:
global
  [...]
  setenv DATADOME_SERVER_SIDE_KEY "${DATADOME_SERVER_SIDE_KEY}"
  presetenv DATADOME_ENDPOINT "api.datadome.co"
  presetenv DATADOME_TIMEOUT "150"
  # setenv DATADOME_ENABLE_REFERRER_RESTORATION "false"
  
  # Edit Path here if needed, this is the path where the lua script is located
  lua-prepend-path /usr/local/etc/haproxy/?.lua
  lua-load-per-thread /usr/local/etc/haproxy/datadome.lua

  httpclient.retries 0
  httpclient.timeout.connect "${DATADOME_TIMEOUT}"ms

# DNS Resolver - This is an example, an existing DNS resolver will also work if it can resolve api.datadome.co
resolvers custom-dns
  nameserver local 127.0.0.1:53
  nameserver ns1 1.1.1.1:53
  nameserver ns2 8.8.8.8:53
  timeout retry   1s
  hold valid 30s
  hold other 3s
  hold obsolete 30s
  
# Frontend to redirect DataDome payload
frontend api-datadome-tcp
  mode tcp
  option tcplog
  bind 127.0.0.1:10444
  default_backend api-datadome-co

# Example of frontend which will be protected
frontend http
[...]
    
  # Here check uri is not in exclusion path
  acl excluded_files path_reg -i .\.(avi|flv|mka|mkv|mov|mp4|mpeg|mpg|mp3|flac|ogg|ogm|opus|wav|webm|webp|bmp|gif|ico|jpeg|jpg|png|svg|svgz|swf|eot|otf|ttf|woff|woff2|css|less|jsf|js|map)$
    
  http-request lua.Datadome_request_hook if !excluded_files
  http-response lua.Datadome_response_hook
    
  default_backend [...]
  
# Backend to receive DataDome payload
backend api-datadome-co
  mode tcp
  server-template datadome-api_ 5 "${DATADOME_ENDPOINT}":443 check resolvers custom-dns init-addr last,libc,none
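
Before reloading HAProxy, it can be useful to confirm that every block from the steps above made it into the file. The Bash sketch below is a plain text search, not a parser: check_datadome_cfg is a hypothetical helper, it does not ignore commented-out lines, and it assumes the section and hook names used in the example configuration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print each expected directive that is absent from the
# given configuration file; return 1 if anything is missing.
check_datadome_cfg() {
  local cfg="$1" missing=0 pattern
  for pattern in \
    'lua-load-per-thread .*datadome\.lua' \
    '^frontend api-datadome-tcp' \
    '^backend api-datadome-co' \
    'http-request lua\.Datadome_request_hook' \
    'http-response lua\.Datadome_response_hook'
  do
    if ! grep -Eq -- "$pattern" "$cfg"; then
      echo "missing: $pattern"
      missing=1
    fi
  done
  return "$missing"
}

# Demonstration on a deliberately incomplete temporary file.
cfg="$(mktemp)"
cat >"$cfg" <<'EOF'
global
  lua-load-per-thread /usr/local/etc/haproxy/datadome.lua
frontend api-datadome-tcp
  mode tcp
EOF
check_datadome_cfg "$cfg" || echo "configuration is incomplete"
rm -f "$cfg"
```

Once the text check passes, running haproxy -c -f with the path to your configuration file asks HAProxy itself to validate the configuration without starting the proxy.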

Settings

| Setting | Description | Default value | Required |
|---|---|---|---|
| DATADOME_SERVER_SIDE_KEY | Your DataDome server-side key, available inside our dashboard | | Yes |
| DATADOME_ENDPOINT | URL of the closest endpoint. See the API server documentation | api.datadome.co | No |
| DATADOME_TIMEOUT | API request timeout for reused connections, in ms | 150 ms | No |
| DATADOME_ENABLE_REFERRER_RESTORATION | See FAQ | false | No |
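
Because the example configuration reads DATADOME_SERVER_SIDE_KEY from the process environment, a small pre-start check can catch a missing key early. The Bash sketch below uses a hypothetical helper, validate_datadome_env, and a placeholder key value. It mirrors the defaults from the settings above and assumes HAProxy's presetenv keeps a value that is already defined in the environment.

```shell
#!/usr/bin/env bash

# Hypothetical helper: fail if the required key is unset, apply the
# documented defaults to the optional settings, and reject a non-numeric
# timeout.
validate_datadome_env() {
  if [ -z "${DATADOME_SERVER_SIDE_KEY:-}" ]; then
    echo "DATADOME_SERVER_SIDE_KEY is required" >&2
    return 1
  fi
  : "${DATADOME_ENDPOINT:=api.datadome.co}"
  : "${DATADOME_TIMEOUT:=150}"
  case "$DATADOME_TIMEOUT" in
    *[!0-9]*)
      echo "DATADOME_TIMEOUT must be a whole number of milliseconds" >&2
      return 1
      ;;
  esac
}

# Placeholder value for illustration only; use your real server-side key.
export DATADOME_SERVER_SIDE_KEY="placeholder-key"
if validate_datadome_env; then
  echo "endpoint=$DATADOME_ENDPOINT timeout=${DATADOME_TIMEOUT}ms"
fi
```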

FAQ

How can I have DataDome response status in the log?

  • For each request protected by DataDome, the variable txn.dd.x_datadome_response contains the HTTP response status returned by the DataDome API.
  • If the call to DataDome fails, the variable txn.dd.error contains the error code and details.
  • The main errors are as follows:
    • 400 - Bad Request
    • 504 - API Server times out
    • 503 - Invalid response from API Server - DNS Resolution must be checked
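
One way to get both variables into the log is to reference them from a log-format line. The Bash sketch below only generates such a line; dd_log_format is a hypothetical helper, and the var(txn.dd.*) fetch syntax is borrowed from the processing-time example in this FAQ rather than confirmed separately for these two variables.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build a log-format line that logs the given
# txn.dd.* variables as quoted key=value pairs.
dd_log_format() {
  local out="" name
  for name in "$@"; do
    out="${out:+$out }${name}=%{+Q}[var(txn.dd.${name})]"
  done
  printf 'log-format "%s"\n' "$out"
}

dd_log_format x_datadome_response error
# prints:
# log-format "x_datadome_response=%{+Q}[var(txn.dd.x_datadome_response)] error=%{+Q}[var(txn.dd.error)]"
```

The generated line can be pasted into the frontend that declares the DataDome hooks, alongside any fields you already log.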

How can I get Bot Name, RuleType and Bot/Human flags in my application?

You can define a log-format to log the headers returned by DataDome.
Some headers returned by the API are:

  • X-DataDome-botname
  • X-DataDome-ruletype
  • X-DataDome-isbot
  • X-DataDome-requestid
  • ...

These headers can be logged using lua.ddHeaders, as follows:

log-format "X-DataDome-botname: %{+Q}[lua.ddHeaders(X-DataDome-botname)] | X-DataDome-ruletype: %{+Q}[lua.ddHeaders(X-DataDome-ruletype)] | X-DataDome-isbot: %{+Q}[lua.ddHeaders(X-DataDome-isbot)] | X-DataDome-requestid: %{+Q}[lua.ddHeaders(X-DataDome-requestid)]"

You can find more information here.

How can I exclude files from DataDome protection?

In the HAProxy configuration, the decision to call DataDome is controlled by an HAProxy ACL.
The default ACL is as follows:

acl excluded_files path_reg -i .\.(avi|flv|mka|mkv|mov|mp4|mpeg|mpg|mp3|flac|ogg|ogm|opus|wav|webm|webp|bmp|gif|ico|jpeg|jpg|png|svg|svgz|swf|eot|otf|ttf|woff|woff2|css|less|jsf|js|map)$
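
To preview which paths the default pattern would skip, the same expression can be tried with grep. This is a rough Bash sketch: is_excluded is a hypothetical helper, the sample paths are made up, and grep -E may not be the regex engine your HAProxy build uses, although this particular pattern relies only on basic alternation and anchoring.

```shell
#!/usr/bin/env bash

# The pattern from the ACL above, used here as a POSIX extended regex.
EXCLUDED_RE='.\.(avi|flv|mka|mkv|mov|mp4|mpeg|mpg|mp3|flac|ogg|ogm|opus|wav|webm|webp|bmp|gif|ico|jpeg|jpg|png|svg|svgz|swf|eot|otf|ttf|woff|woff2|css|less|jsf|js|map)$'

# Hypothetical helper: succeed if the URL path matches the pattern
# (case-insensitively, like the -i flag on the ACL).
is_excluded() {
  printf '%s\n' "$1" | grep -Eiq -- "$EXCLUDED_RE"
}

for path in /static/app.js /images/LOGO.PNG /api/login /download.json; do
  if is_excluded "$path"; then
    echo "$path -> skipped by DataDome"
  else
    echo "$path -> protected"
  fi
done
```

In this run, /static/app.js and /images/LOGO.PNG are skipped, while /api/login and /download.json stay protected because neither ends in a listed extension.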

How can I exclude specific IPs from DataDome protection?

📘

Access Control List

ACLs are powerful and not limited to IP addresses or file extensions. They can easily be used to control which part of the traffic is covered by DataDome protection.

In the HAProxy configuration, the decision to call DataDome is controlled by an HAProxy ACL, and you can add your own rules.

Example: disable DataDome protection for a specific IP range

acl disable_datadome_protection src 192.168.140.0/24

# Then add this ACL to the condition when declaring the Lua hook

http-request lua.Datadome_request_hook if !disable_datadome_protection
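
When choosing the prefix length for such an ACL, keep in mind that a /32 matches exactly one address while a /24 covers 256. The Bash sketch below illustrates the arithmetic with two hypothetical helpers, ip_to_int and in_cidr. It handles IPv4 only and is not how HAProxy evaluates src internally.

```shell
#!/usr/bin/env bash

# Hypothetical helper: convert a dotted IPv4 address to an integer.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<<"$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Hypothetical helper: succeed if the address falls inside network/prefix.
in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}" mask
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

for cidr in 192.168.140.0/24 192.168.140.0/32; do
  if in_cidr 192.168.140.17 "$cidr"; then
    echo "192.168.140.17 is inside $cidr"
  else
    echo "192.168.140.17 is outside $cidr"
  fi
done
```

HAProxy combines conditions listed after if with a logical AND, so a hook condition such as if !excluded_files !disable_datadome_protection should keep the file exclusion and the IP exclusion together.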

How do I restore the Referer request header after a challenge is passed?

After a DataDome challenge is passed on browsers other than Firefox, the referrer value is updated to the current URL, which can lead to inconsistent results in website analytics. To restore the original value:

  • Contact our support team; they will review your requirements and provide you with the best recommendations.
  • Set DATADOME_ENABLE_REFERRER_RESTORATION to true in the haproxy.cfg file.
global
[...]
setenv DATADOME_ENABLE_REFERRER_RESTORATION "true"
[...]

How can I log processing time and total time?

  • Processing time: txn.dd.ptime
    The processing time is the latency added by the bot protection, in milliseconds (ms).
    Measurement starts when the request is received by the HAProxy frontend and stops when a response is received from the DataDome Protection API.
  • Total time: txn.dd.ttime
    The total time is the processing time plus the time HAProxy spends adding DataDome headers to the response sent to the end user.
    Because adding headers is fast in HAProxy, the extra time is measured in microseconds (µs) and can be considered negligible. In most cases, txn.dd.ptime and txn.dd.ttime will have the same value.

To log these values, add them to the log-format in the haproxy.cfg file as follows:

log-format "ptime=%{+Q}[var(txn.dd.ptime)] ttime=%{+Q}[var(txn.dd.ttime)]"
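
Once these fields are in the log, a short script can summarize them. The Bash sketch below averages ptime over log lines read from standard input. avg_ptime is a hypothetical helper, and it assumes the values are logged as quoted numbers in the ptime="..." form produced by the log-format above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the mean of all ptime="..." values found on
# stdin, or "n/a" if no line carries one.
avg_ptime() {
  LC_ALL=C awk '
    match($0, /ptime="[0-9.]+"/) {
      # Skip the 7 characters of ptime=" and drop the closing quote.
      sum += substr($0, RSTART + 7, RLENGTH - 8)
      n++
    }
    END {
      if (n) { printf "%.1f\n", sum / n } else { print "n/a" }
    }
  '
}

# Two made-up log lines for illustration; on a real host, pipe in your
# HAProxy log file instead.
printf '%s\n' 'ptime="10" ttime="10"' 'ptime="20" ttime="21"' | avg_ptime
# prints 15.0
```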