HAProxy

The DataDome HAProxy module detects bot activity and protects against it.

Before regular HAProxy processing starts, the module calls one of our Regional Endpoints over a keep-alive connection.

Depending on the response, the module either blocks the request or lets HAProxy proceed with regular processing.
The module is designed to protect the visitor experience: if an error occurs during the process, or if the timeout is reached, the module automatically disables blocking and lets those hits through.

Compatibility

The DataDome module has been tested with HAProxy versions 1.8 and higher.

Due to a bug in HAProxy SPOE, the following minor versions are not compatible: 1.8.9, 1.8.21, and 1.9.8 through 1.9.11.

Get HAProxy with Lua through a package manager

If you already have an HAProxy binary with Lua support, you can skip this section.

# CentOS 7 (IUS repository)
yum install https://centos7.iuscommunity.org/ius-release.rpm
yum install haproxy18u

# Debian 9 "stretch" (backports repository)
apt-get -t stretch-backports install haproxy

Configuration

You need to follow the steps below:

  • Download the latest DataDome module from https://package.datadome.co/linux/DataDome-Haproxy18-latest.tgz and extract it into your HAProxy configuration directory. The archive includes the following files:
    • spoe-datadome.conf: configuration of the SPOE filter
    • datadome.lua: a Lua script that handles the transformation of the HTTP request
  • Edit the spoe-datadome.conf file and replace DATADOME_API_KEY with your own API key
  • Update your HAProxy configuration file with the blocks below, replacing <PATH> with the actual path where you placed the files:
global
    [...]
    lua-load <PATH>/datadome.lua 
    [...]

# Example of frontend which will be protected
frontend http
    [...]
    # Insert these lines on each frontend you want to protect
    http-request set-var(txn.dummy1) var(txn.dd.x_datadome_request_headers)
    http-request set-var(txn.dummy2) var(txn.dd.x_datadome_headers)
    http-request set-var(txn.dummy3) var(txn.dd.x_datadome_response)
    http-request set-var(txn.dummy4) var(txn.dd.body)
    http-request set-var(txn.dummy5) var(txn.dd.error)
    filter spoe engine datadome config <PATH>/spoe-datadome.conf
    http-request lua.Datadome_request_hook
    http-response lua.Datadome_response_hook
    # Insert this line before all default_backend / use_backend directives
    use_backend failure_backend if { var(txn.dd.status) eq "blocked" }
    default_backend [...]

# Backend to serve the "blocked page"
backend failure_backend
    mode http
    http-request    use-service     lua.failure_service 

# Backend to contact the DataDome API
backend spoe-datadome
    mode tcp
    timeout connect 1s
    option tcp-check
    tcp-check connect ssl
    server datadome-spoe1 api.datadome.co:12346 check ssl verify none

Optional: To maximize availability, our endpoints rely on several IPs. To benefit from this IP resolution, we suggest adding a "resolvers" section to your HAProxy configuration. The full documentation for HAProxy v1.8 is available at: https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#5.3.2
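
As a minimal sketch of this suggestion, the "resolvers" section and the matching server line could look like the following. The section name datadome-dns, the DNS server address 192.0.2.53 (a placeholder), and the retry and hold values are assumptions for illustration, not DataDome recommendations:

```
# Hypothetical resolvers section name: datadome-dns
resolvers datadome-dns
    # Placeholder address: replace with a DNS server you actually use
    nameserver dns1 192.0.2.53:53
    resolve_retries 3
    timeout retry   1s
    # Keep the last valid answer for 10s (assumed value, tune as needed)
    hold valid      10s

backend spoe-datadome
    [...]
    # Same server line as in the configuration above, with run-time DNS resolution enabled
    server datadome-spoe1 api.datadome.co:12346 check ssl verify none resolvers datadome-dns
```

With the "resolvers" keyword on the server line, HAProxy re-resolves api.datadome.co at run time instead of only once at startup.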

Note: The TCP connection to DataDome inherits the values set in the global and defaults sections.

Settings

API endpoint URL
  Description: URL of the closest endpoint. More info here.
  Default value: api.datadome.co

API endpoint port
  Default value: 12345 (plain TCP), 12346 (SSL)

Timeout hello (spoe-datadome.conf)
  Description: Timeout for the initial SPOE handshake. Should be at least 4 times the round-trip time (RTT) to DataDome (1 for TCP, 2 for TLS, 1 for SPOE), plus 10 ms.
  Default value: 100 ms

Timeout idle (spoe-datadome.conf)
  Description: Maximum time to wait for an agent to close an idle connection. The value must be smaller than the "timeout server" of the SPOE backend.
  Default value: 10 minutes

Timeout processing (spoe-datadome.conf)
  Description: Maximum time to wait for a stream to process an event. A hit is generated if the upper-bound limit of DataDome latency overhead is reached. You can find the number of timed-out connections by logging the txn.dd.error variable. On timeout, this variable is set to 1 (see below for other codes).
  Default value: 50 ms

ACL static_file url_reg
  Description: Uses an HAProxy ACL. By default, no calls are made to DataDome for static assets.
  Default value: .(js|css|jpg|jpeg|png|ico|gif|tiff|svg|woff|woff2|ttf|eot|mp4|otf)$
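
To illustrate the Timeout hello rule above: with a measured RTT of 40 ms to DataDome (an assumed figure), the minimum is 4 x 40 ms + 10 ms = 170 ms, so the 100 ms default would be too low. A sketch of the corresponding lines in spoe-datadome.conf follows. The agent name is hypothetical and the rest of the section is elided:

```
# Hypothetical agent name; keep the one already present in your spoe-datadome.conf
spoe-agent datadome-agent
    [...]
    # Assumed RTT of 40 ms: 4 x 40 ms + 10 ms = 170 ms
    timeout hello      170ms
    # Default values from the settings above
    timeout idle       10m
    timeout processing 50ms
```

If the RTT to your endpoint is 20 ms or less, the same rule gives at most 90 ms and the 100 ms default already satisfies it.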

FAQ

Can I have DataDome response status in the log?

The module sets the following HAProxy variables, which you can include in your logs:

  • When DataDome handles the request successfully, the variable txn.dd.x_datadome_response contains the HTTP response status returned by the API
  • When the call to DataDome fails, the variable txn.dd.error contains the SPOE error code:
    • The complete code list can be found in the link below: https://www.haproxy.org/download/1.8/doc/SPOE.txt
    • The main codes are as follows:
      • 1: A timeout occurred during the event processing
      • 2: An error was triggered during the resource allocation
      • 5: The frame processing has been interrupted by HAProxy
      • 255: An unknown error occurred during the event processing
      • Higher than 256: A SPOP error occurred during the event processing (Refer to SPOE documentation)
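
As a sketch, both variables can be appended to the access log with a custom log-format. The line below reuses the default HTTP log format of HAProxy 1.8 and adds two fields at the end. The field labels dd_response and dd_error are arbitrary names chosen for this example:

```
frontend http
    [...]
    # Default HTTP log format, followed by the two DataDome variables
    log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r dd_response=%[var(txn.dd.x_datadome_response)] dd_error=%[var(txn.dd.error)]"
```

Counting the log lines where dd_error is 1 then gives the number of requests that hit the processing timeout.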

Can I get Bot Name, Bot Type and Bot/Human flags in my application?

The DataDome module can inject headers into the HTTP request, which your application can then read.
You can find more information here.

Exclude pages from DataDome protection

In the spoe-datadome.conf file, the decision to call DataDome is controlled by an HAProxy ACL.
By default, we exclude requests with paths ending in js, css, jpg, jpeg, png, ico, gif, tiff, svg, woff, woff2, ttf, eot, mp4, or otf.
You can use the full HAProxy ACL syntax to choose which requests are sent to DataDome.
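
For example, to also skip DataDome for a health-check URL, an extra ACL could be combined with the default one. This is only a sketch: the message name, the health_check ACL, the /health path, and the exact form of the event line are assumptions, so adapt it to what your spoe-datadome.conf actually contains:

```
# Hypothetical message name; keep the one already present in your file
spoe-message datadome-request
    [...]
    # Default ACL shipped with the module (definition elided, leave unchanged)
    acl static_file url_reg [...]
    # Hypothetical extra exclusion: requests whose path starts with /health
    acl health_check path_beg /health
    # Send the request to DataDome only when no exclusion matches
    event on-frontend-http-request unless static_file || health_check
```

Requests matching either ACL bypass the SPOE call entirely and are handled by HAProxy as usual.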

