HAProxy (SPOE)
The DataDome HAProxy module detects and protects against bot activity.
Before the regular HAProxy process starts, the module makes a call to one of our Regional Endpoints using a KeepAlive connection.
Depending on the response, the module will either block the query or let HAProxy proceed with the regular process.
The module is designed to protect the visitor experience: if an error occurs during the process, or if the timeout is reached, the module automatically disables its blocking process and allows those hits.
Compatibility
The DataDome module has been tested with HAProxy versions 1.8 and higher.
Due to a bug in HAProxy SPOE, the following minor versions are not compatible: 1.8.9, 1.8.21, and 1.9.8 through 1.9.11 (inclusive).
Unsupported HAProxy versions
Most non-LTS (Long Term Support) versions are unmaintained. We encourage you to upgrade your setup if you are using one of the following versions, which are no longer supported by HAProxy.
- 1.7+
- 1.8+
- 1.9+
- 2.1+
- 2.3+
- 2.5+
Install HAProxy with Lua support
If you already have an HAProxy binary with Lua support, you can skip this section.
# CentOS 7
yum install https://centos7.iuscommunity.org/ius-release.rpm
yum install haproxy18u
# Debian / Ubuntu
apt-get install haproxy
Configuration
You need to follow the steps below:
- Download the latest DataDome module here and unzip it in your HAProxy configuration directory. The archive includes the following files:
  - spoe-datadome.conf: configuration of the SPOE filter
  - datadome.lua: a Lua script that handles the transformation of the HTTP request
- Edit the spoe-datadome.conf file and replace DATADOME_API_KEY with your own API key.
- Update your HAProxy configuration file by replacing <PATH> with the actual path where you placed the files, and by adding the blocks shown below:
global
    [...]
    lua-load <PATH>/datadome.lua
    [...]

# Example of a frontend that will be protected
frontend http
    [...]
    # Insert these lines on each frontend you want to protect
    http-request set-var(txn.placeholder1) var(txn.dd.x_datadome_request_headers)
    http-request set-var(txn.placeholder2) var(txn.dd.x_datadome_headers)
    http-request set-var(txn.placeholder3) var(txn.dd.x_datadome_response)
    http-request set-var(txn.placeholder4) var(txn.dd.body)
    http-request set-var(txn.placeholder5) var(txn.dd.error)
    filter spoe engine datadome config <PATH>/spoe-datadome.conf
    http-request lua.Datadome_request_hook
    http-response lua.Datadome_response_hook
    # Insert this line before all default_backend / use_backend directives
    use_backend failure_backend if { var(txn.dd.status) -m str "blocked" }
    default_backend [...]

# Backend to serve the "blocked page"
backend failure_backend
    mode http
    http-request use-service lua.failure_service

# Backend to contact the DataDome API
backend spoe-datadome
    mode tcp
    timeout connect 1s
    option tcp-check
    tcp-check connect ssl
    server datadome-spoe1 api.datadome.co:12346 check ssl verify none
Optional: To maximize availability, our endpoints rely on several IPs. To benefit from this IP resolution, we suggest adding a "resolvers" section to your HAProxy configuration. The full documentation for HAProxy v1.8 is available at the following link: https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#5.3.2
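As a rough illustration of the optional step above, a resolvers section and a server line that uses it might look like the sketch below. The resolver name datadome-dns, the nameserver label and address (192.0.2.53 is a documentation placeholder), and the timing values are assumptions, not DataDome recommendations; replace them with values suited to your environment.

```haproxy
# Hypothetical resolvers section: names, address and timings are placeholders
resolvers datadome-dns
    nameserver dns1 192.0.2.53:53   # placeholder: use your own DNS server
    resolve_retries 3
    timeout retry   1s
    hold valid      10s

backend spoe-datadome
    mode tcp
    timeout connect 1s
    option tcp-check
    tcp-check connect ssl
    # Same server line as in the main configuration, with runtime DNS resolution enabled
    server datadome-spoe1 api.datadome.co:12346 check ssl verify none resolvers datadome-dns
```

With a resolvers section attached to the server, HAProxy re-resolves api.datadome.co at runtime instead of only at startup.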
Keep the failure_backend declaration first
HAProxy uses backends in the order they are defined in the configuration file. backend failure_backend should remain first so that HAProxy uses it when a request is blocked by DataDome. Otherwise, blocked requests will be let through and reach whichever backend you defined with higher priority.
Reference documentation from HAProxy here:
"All of these rules are evaluated in their declaration order, and the first one which matches will assign the backend."
Note: The TCP connection to DataDome is based on the values set in the global and default sections.
Settings
Settings | Description | Default value |
---|---|---|
API endpoint URL | URL of the closest endpoint. More info here | api.datadome.co |
API endpoint port | Plain TCP: 12345; SSL: 12346 | |
Timeout hello (spoe-datadome.conf) | Timeout for the initial SPOE handshake. Should be at least 4 times the round-trip latency (RTT) to DataDome (1 for TCP, 2 for TLS, 1 for SPOE) + 10 ms. | 100 ms |
Timeout idle (spoe-datadome.conf) | Maximum time to wait for an agent to close an idle connection. Value must be smaller than the "timeout server" of the SPOE backend. | 10 minutes |
Timeout processing (spoe-datadome.conf) | Maximum time to wait for a stream to process an event. A hit is generated if the upper-bound limit of DataDome latency overhead is reached. You can find the number of timed-out connections by logging the txn.dd.error variable. On timeout, this variable is set to 1 (see below for other codes). | 50 ms |
ACL static_file url_reg | Uses an HAProxy ACL. By default, no calls are made to DataDome for static assets. | .(js|css|jpg|jpeg|png|ico|gif|tiff|svg|woff|woff2|ttf|eot|mp4|otf)$ |
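To make the timeout rules above concrete, here is a minimal sketch assuming a measured round-trip time of 20 ms to the DataDome endpoint. The RTT figure, the agent name datadome-agent, and the timeout server value are illustrative assumptions; the files shipped in the archive may be laid out differently.

```haproxy
# spoe-datadome.conf (excerpt)
# The [datadome] scope matches the engine name used in the "filter spoe" line.
[datadome]
spoe-agent datadome-agent            # hypothetical agent name
    [...]
    # hello: 4 x RTT + 10 ms = 4 x 20 ms + 10 ms = 90 ms, so the 100 ms default fits
    timeout hello      100ms
    timeout idle       10m
    timeout processing 50ms

# haproxy.cfg (excerpt)
backend spoe-datadome
    mode tcp
    timeout connect 1s
    # Assumed value: must stay larger than "timeout idle" in spoe-datadome.conf
    timeout server  11m
    [...]
```

With a higher RTT, for example 40 ms, the same rule gives 4 x 40 ms + 10 ms = 170 ms, so the default timeout hello of 100 ms would need to be raised.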
FAQ
Can I have DataDome response status in the log?
Module compatibility
Only supported on HAProxy18 1.8.0+ and HAPEE 1.5.1+ modules.
The specific HAProxy variables are set as follows:
- When the request is correctly handled by DataDome, the variable txn.dd.x_datadome_response contains the HTTP response status returned by the DataDome API.
- When there is an issue in the call to DataDome, the variable txn.dd.error contains the SPOE error code:
  - The complete code list can be found at the following link: https://www.haproxy.org/download/1.8/doc/SPOE.txt
  - The main codes are as follows:
    - 1: A timeout occurred during the event processing
    - 2: An error was triggered during the resource allocation
    - 5: The frame processing has been interrupted by HAProxy
    - 255: An unknown error occurred during the event processing
    - Higher than 256: A SPOP error occurred during the event processing (refer to the SPOE documentation)
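One way to surface these variables is to append them to the frontend log format. The sketch below is an assumption about how this could be wired up, not the format shipped with the module; the dd_response and dd_error labels are arbitrary names chosen for illustration.

```haproxy
frontend http
    [...]
    # Log the DataDome API response and the SPOE error code next to the usual fields
    log-format "%ci:%cp [%tr] %ft %b/%s %ST %B dd_response=%[var(txn.dd.x_datadome_response)] dd_error=%[var(txn.dd.error)] %{+Q}r"
```

A log line with dd_error=1 then indicates that the processing timeout was reached for that request.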
Can I get Bot Name, Bot Type and Bot/Human flags in my application?
From version 1.8.0 of this module, you can log the values of the DataDome headers by configuring the log format in your HAProxy configuration. The list of all exposed headers is available on our Log Enrichment page.
# frontend settings with DataDome integration
http-request lua.Datadome_request_hook
http-response lua.Datadome_response_hook
# Custom log for DataDome Enrich headers
log-format "X-DataDome-botname: %{+Q}[lua.ddHeaders(X-DataDome-botname)] | X-DataDome-isbot: %{+Q}[lua.ddHeaders(X-DataDome-isbot)] | X-DataDome-ruletype: %{+Q}[lua.ddHeaders(X-DataDome-ruletype)]"
use_backend failure_backend if { var(txn.dd.status) -i -m str blocked }
I see that some requests are blocked in the DataDome dashboard, but the captcha is not displayed
HAProxy evaluates the use_backend directives in declaration order and picks the first one whose rules match.
The failure_backend must be declared first to display the captcha. It is triggered only when the request is blocked by DataDome.
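For illustration, a correctly ordered frontend could look like the sketch below. The api_servers and web_servers backends and the /api path rule are hypothetical; only the position of the failure_backend rule matters.

```haproxy
frontend http
    [...]
    # Evaluated first: requests blocked by DataDome are sent to the failure backend
    use_backend failure_backend if { var(txn.dd.status) -m str "blocked" }
    # Other routing rules follow (hypothetical backend names)
    use_backend api_servers if { path_beg /api }
    default_backend web_servers
```

If the api_servers rule were declared before the failure_backend rule, a blocked request to /api would match it first and never reach the captcha page.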
Exclude pages from DataDome protection
In the spoe-datadome.conf file, the option to call DataDome is managed by an HAProxy ACL.
By default, we exclude requests with paths ending with js, css, jpg, jpeg, png, ico, gif, tiff, svg, woff, woff2, ttf, eot, mp4, otf.
You can use the complete HAProxy ACL rule set to choose which requests are sent to DataDome.
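As a sketch of how an additional exclusion could be expressed, the excerpt below adds a second ACL next to the default static_file one. The message name check-request, the health_check ACL, the /health path, and the event line are assumptions made for illustration; check them against the spoe-datadome.conf shipped in the archive before changing anything.

```haproxy
# spoe-datadome.conf (excerpt, hypothetical layout)
spoe-message check-request
    [...]
    # Default exclusion of static assets (pattern as listed in the Settings table)
    acl static_file  url_reg .(js|css|jpg|jpeg|png|ico|gif|tiff|svg|woff|woff2|ttf|eot|mp4|otf)$
    # Hypothetical extra exclusion: skip DataDome for health-check requests
    acl health_check path_beg /health
    # Only send the message when neither ACL matches
    event on-frontend-http-request unless static_file || health_check
```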
How to restore the Referer request header after a challenge has been passed?
When passing a DataDome challenge on browsers other than Firefox, the referrer value is updated, which can lead to inconsistent results in website analytics.
To restore the Referer header to its original value for your backend:
- Contact our support team; they will review your requirements and provide you with the best recommendations.
- Ensure that you have DataDome HAProxy module version 1.9.0 or higher.
- Add the line http-request lua.DataDome_restore_dd_referrer inside your HAProxy configuration file, like this:
frontend http
    [...]
    # Insert these lines on each frontend you want to protect
    http-request lua.DataDome_restore_dd_referrer
    http-request set-var(txn.placeholder1) var(txn.dd.x_datadome_request_headers)
    http-request set-var(txn.placeholder2) var(txn.dd.x_datadome_headers)
    http-request set-var(txn.placeholder3) var(txn.dd.x_datadome_response)
    http-request set-var(txn.placeholder4) var(txn.dd.body)
    http-request set-var(txn.placeholder5) var(txn.dd.error)
    filter spoe engine datadome config <PATH>/spoe-datadome.conf
    http-request lua.Datadome_request_hook
    http-response lua.Datadome_response_hook
    # Insert this line before all default_backend / use_backend directives
    use_backend failure_backend if { var(txn.dd.status) -m str "blocked" }
    default_backend [...]