CloudFront

The DataDome CloudFront Lambda@Edge module detects bot activity and protects your traffic against it.

The module runs on your CloudFront distribution through the AWS Lambda@Edge service.

Compatibility

  • [Recommended] Node.js 14.x+
  • Python 3.9+

Installation

Requirements

Protect your traffic

  1. Connect to your AWS console and go to the Lambda@Edge homepage.

📘

The function must be created in the US-EAST-1 region.

AWS automatically selects the US-EAST-1 region when you access the Lambda@Edge portal.
Do not change the region.

  2. Click on the Create function button, then select Author from scratch.
Create a function from scratch.

📘

You can configure DataDome on an existing Lambda@Edge function

If you already have a Lambda@Edge function configured, refer to "How do I call and configure the module from another function file?" below.

  3. In the Basic information section:
  • Enter a name for your Lambda function, e.g. DataDomeModule-{YOUR WEBSITE NAME},
  • Select Node.js 20.x or Python 3.11 for the runtime,
  • Click on Create function.
Basic information section.

  4. In the Code source tab:
  • Choose Upload a file from Amazon S3 and paste the URL that matches the selected runtime:
Node.js: https://s3.amazonaws.com/dd-lambda-edge/datadome-lambda-edge-latest.zip
Python: https://s3.amazonaws.com/dd-lambda-edge/datadome-lambda-edge-py-latest.zip
Upload DataDome code from Amazon S3 location.

  5. Open the file datadome.js (or datadome.py).
  6. Replace YOUR_DATADOME_LICENSE_KEY with your own DataDome server-side key, available in your DataDome dashboard.
The DataDome code is uploaded as a code source.

  7. Scroll down to the Runtime settings tab and click on Edit.
Edit the runtime settings.

  8. In the Runtime settings page:
  • Enter datadome.handler (for the Node.js runtime) or datadome.lambda_handler (for the Python runtime) in the Handler field.
  • Click on Save.
Change the Handler setting to datadome.handler.

  9. In the Configuration tab, open the General configuration menu and click on Edit.
Set up the general configuration.

Go to the general configuration page.

  10. In the Edit basic settings page:
  • Set Timeout to 0 min 1 sec,
  • Select an existing role with the required permissions,
  • Click on Save.
Set the general configuration.

Set the basic settings.

  11. Click on Actions and select Publish new version. You can set a version description, then click on Publish.
  12. In the Configuration tab, click on Add trigger.
  13. In the Add trigger page:
  • Choose CloudFront as trigger and click on Deploy Lambda@Edge,
  • Select the CloudFront distribution that will send events to the Lambda function,
  • Select Viewer Request for CloudFront Event,
  • Do not check the Include body box,
  • Check the Confirm deploy to Lambda@Edge box,
  • Click on Deploy.

🚧

Lambda on viewer request

The Lambda function must be associated with the viewer request event. See "Can I associate the Lambda function to the CloudFront distribution on origin request?" below.

  14. Go to your CloudFront distribution general page.
  15. In the Error pages tab, click on Create custom error response.
Create custom error response.

  16. In the Create custom error response page:
  • Select HTTP code 403,
  • Set the minimum TTL to 0,
  • Select No for Customize error response,
  • Click on Create custom error response.

Congrats! You can now see your traffic in your DataDome dashboard.

Configuration

By default, the configuration is located in the first code block of the datadome.js (or datadome.py) file.

Refer to the next Settings section for the full list of possible configuration settings.

Settings

| Setting | Description | Required | Default |
|---|---|---|---|
| DATADOME_LICENSE_KEY | Your DataDome server-side key, found in your dashboard | Yes | |
| DATADOME_TIMEOUT | Timeout of requests to the DataDome API, in milliseconds | Optional | 300 |
| DATADOME_URI_REGEX | Regular expression to include URIs in the traffic analysed by DataDome | Optional | |
| DATADOME_URI_REGEX_EXCLUSION | Regular expression to exclude URIs from the DataDome analysis | Optional | List of excluded static assets below |
| DATADOME_LOG_BOT_INFO | Boolean to log the requests' bot information in CloudWatch (premium feature) | Optional | false |
| DATADOME_ENABLE_GRAPHQL_SUPPORT | Boolean to enable GraphQL support. See "How can I enable GraphQL support?" | Optional | false |

Default value of DATADOME_URI_REGEX_EXCLUSION:

/\.(avi|flv|mka|mkv|mov|mp4|mpeg|mpg|mp3|flac|ogg|ogm|opus|wav|webm|webp|bmp|gif|ico|jpeg|jpg|png|svg|svgz|swf|eot|otf|ttf|woff|woff2|css|less|js|map)$/i
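
The inclusion and exclusion expressions can be read as a filter applied to the URI of each request. The snippet below is a minimal sketch of such a filter, not the module's actual code. It assumes that the exclusion is checked first and that an empty inclusion setting means "analyse every URI". The helper name shouldAnalyse is hypothetical; the event shape is the standard CloudFront viewer request event.

```javascript
// Default exclusion pattern, copied from the settings above.
const URI_REGEX_EXCLUSION = /\.(avi|flv|mka|mkv|mov|mp4|mpeg|mpg|mp3|flac|ogg|ogm|opus|wav|webm|webp|bmp|gif|ico|jpeg|jpg|png|svg|svgz|swf|eot|otf|ttf|woff|woff2|css|less|js|map)$/i;

// No inclusion pattern by default (assumed here to mean "include every URI").
const URI_REGEX = null;

// Hypothetical helper: returns true when the request URI would be sent for analysis.
function shouldAnalyse(event, inclusion = URI_REGEX, exclusion = URI_REGEX_EXCLUSION) {
  // In a CloudFront event, request.uri holds the path without the query string.
  const uri = event.Records[0].cf.request.uri;
  if (exclusion && exclusion.test(uri)) {
    return false; // static asset or other excluded URI
  }
  if (inclusion && !inclusion.test(uri)) {
    return false; // an inclusion pattern is set and the URI does not match it
  }
  return true;
}
```

With the defaults, a request for /assets/app.css is skipped and a request for /api/login is analysed.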

FAQ

How do I get the logs of the Lambda@Edge?

All logs are stored in your CloudWatch dashboards, in the "Logs" section.

If the Lambda@Edge function does not write any logs in the regions where it runs, check your IAM role and add the following statement:

{
    "Effect": "Allow",
    "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
    ],
    "Resource": [
        "*"
    ]
}

Can I get Bot Name, Rule Type and Bot/Human flags in my application?

The DataDome module can inject headers in the HTTP Request that can be read by your application.
The list of all headers exposed is available in our Log Enrichment page.
These headers are recorded in your CloudWatch logs.
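
As an illustration, an application behind CloudFront could collect these headers as sketched below. The exact header names are listed in the Log Enrichment page and are not reproduced here, so this sketch only filters on an assumed x-datadome- prefix (inferred from the X-DataDome-ClientID header mentioned in this page). The helper name pickDataDomeHeaders is hypothetical.

```javascript
// Assumed prefix; verify the real header names in the Log Enrichment page.
const DATADOME_HEADER_PREFIX = "x-datadome-";

// Hypothetical helper: keeps only the headers whose name starts with the prefix.
// Accepts a plain headers object such as req.headers in a Node.js HTTP server.
function pickDataDomeHeaders(headers) {
  const picked = {};
  for (const [name, value] of Object.entries(headers)) {
    const lowerName = name.toLowerCase();
    if (lowerName.startsWith(DATADOME_HEADER_PREFIX)) {
      picked[lowerName] = value;
    }
  }
  return picked;
}
```

In a Node.js HTTP server, calling pickDataDomeHeaders(req.headers) inside the request listener returns the subset of headers added by the module, if any.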

How can I disable CloudFront caching for requests protected by DataDome?

If you cache dynamic requests (not JavaScript, CSS, or images) at the CloudFront level and these requests are protected by DataDome, you need to change your backend origin so that it asks CloudFront not to cache the Set-Cookie header of these responses.

By default, CloudFront caches HTTP responses even if the backend returned a cookie, which can lead to unexpected bot detection issues. Your backend/origin needs to return this header: Cache-Control: no-cache="Set-Cookie"

You can find more information about this CloudFront behavior in AWS Documentation, in the Disable caching of Set-Cookie headers section.

If you were caching files that were also protected by DataDome, you may want to invalidate the cache by following this CloudFront documentation.
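
The header change on the origin side can be sketched as follows. This is a minimal example for a Node.js origin, under the assumption that response headers are assembled as a plain object before being sent. The helper name protectCookiesFromCache is hypothetical.

```javascript
// Header value given in the text above.
const NO_CACHE_SET_COOKIE = 'no-cache="Set-Cookie"';

// Hypothetical helper: returns a copy of the response headers in which
// Cache-Control is set to no-cache="Set-Cookie" whenever a cookie is set.
// Note: this overwrites any existing Cache-Control value; adapt it if you
// need to keep other directives.
function protectCookiesFromCache(headers) {
  const names = Object.keys(headers);
  const setsCookie = names.some((name) => name.toLowerCase() === "set-cookie");
  if (!setsCookie) {
    return { ...headers };
  }
  const result = {};
  for (const name of names) {
    if (name.toLowerCase() !== "cache-control") {
      result[name] = headers[name];
    }
  }
  result["Cache-Control"] = NO_CACHE_SET_COOKIE;
  return result;
}
```

With the built-in http module, the result can be passed directly to res.writeHead(200, protectCookiesFromCache(headers)).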

How can I protect only a part of a CloudFront Distribution?

In order to protect only a part of a CloudFront Distribution, select one of the possibilities below:

  • First option: set an exclusion based on file extension. Modify the value of DATADOME_URI_REGEX_EXCLUSION inside datadome.js (or datadome.py) to exclude the matching hits from the DataDome API. In this case, the Lambda@Edge function is still executed (and billed) at the Amazon infrastructure level.
  • Second option: set an exclusion based on path. Define a behavior in your CloudFront distribution and attach the Lambda@Edge function only to the needed paths. In this case, there is no Lambda execution on the Amazon infrastructure and no call to the DataDome API for the other paths.

Can I associate the Lambda function to the CloudFront distribution on origin request?

No. Protection is not operational when the Lambda function is associated with the origin request event.
CloudFront will cache requests, even if caching is completely disabled.
Most requests will be handled by the AWS caching mechanism with modified headers (such as user-agent: Amazon CloudFront) and will not be intercepted by the DataDome module.

If another function is already associated with the viewer request event, see how to merge functions in "How do I call and configure the module from another function file?" below.

How can I configure the role?

The needed permissions are listed in AWS documentation.

In the Role section:

  • Click on the Permissions tab and select Add inline policy.

  • Select the JSON view and paste the following permissions:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "iam:CreateServiceLinkedRole",
                "lambda:GetFunction",
                "cloudfront:UpdateDistribution",
                "lambda:EnableReplication"
            ],
            "Resource": "*"
        }
    ]
}
  • Input a name for the permissions and save.
  • Click on the Trust relationships tab and Edit the trust relationship.

  • Paste the following trusted entities:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "lambda.amazonaws.com",
          "edgelambda.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

How can I integrate DataDome on a multi-account architecture?

If you have several CloudFront distributions deployed on different AWS accounts, one Lambda@Edge function per account is required. Repeat steps 1 to 16 of the installation from the Protect your traffic section.

How do I call and configure the module from another function file?

From version 1.18.0 of the Node.js Lambda@Edge, calling and configuring the module can be done inside another file.
The following example explains how to update a handler in the file index.js.

  1. Import the DataDome code as mentioned in the installation steps.
  2. Import the DataDome module inside your index.js file:
const datadome = require("./datadome");
  3. Configure the DataDome module inside your index.js file:
// Configure DataDome module
const configuration = {
  serverSideKey:        'serverSideKeyValue',
  timeout:              300,
  maxSockets:           100,
  debug:                false,
  urlPatternInclusion:  null,
  urlPatternExclusion:  /\.(avi|flv|mka|mkv|mov|mp4|mpeg|mpg|mp3|flac|ogg|ogm|opus|wav|webm|webp|bmp|gif|ico|jpeg|jpg|png|svg|svgz|swf|eot|otf|ttf|woff|woff2|css|less|js|map)$/i
};
datadome.configure(configuration);

📘

Update the configuration values (only serverSideKey is mandatory).

Other keys are shown with their default values.

  4. Update the handler in your index.js file so that it runs the DataDome protection:
exports.handler = (event, context, callback) => {
  // Call DataDome handler
  datadome.handler(event, context, callback);
  // [...] 
}
  5. Make sure the handler configured for this Lambda@Edge function is index.handler in the Runtime settings section.
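
One possible way to combine the DataDome handler with existing viewer request logic is sketched below. It rests on an assumption that is not confirmed by this page: that datadome.handler reports its outcome through the callback it receives, passing either the request (to let it continue) or a generated response. The names createHandler and customLogic are hypothetical.

```javascript
// Hypothetical wrapper: runs DataDome first, then your own logic on allowed requests.
// Assumption: datadome.handler calls the callback with (error, requestOrResponse).
function createHandler(datadome, customLogic) {
  return (event, context, callback) => {
    datadome.handler(event, context, (error, result) => {
      if (error) {
        return callback(error);
      }
      // In Lambda@Edge, a generated response carries a status field.
      // Return it unchanged so that a blocking response is not altered.
      if (result && result.status) {
        return callback(null, result);
      }
      // Otherwise the result is the request: apply the existing logic to it.
      callback(null, customLogic(result));
    });
  };
}
```

In index.js this could be used as exports.handler = createHandler(datadome, myExistingLogic), where myExistingLogic takes the CloudFront request object and returns it after modification.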

How can I dynamically configure the module?

It is not possible to use environment variables in Lambda@Edge due to an AWS limitation. See Restrictions on edge functions.
However, you can still follow the steps to configure the module from another function file and load configuration values from an external source, such as AWS Secrets Manager.
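
The approach can be sketched as follows, assuming the AWS SDK for JavaScript v3 is available in the function's runtime and that the execution role is allowed to call secretsmanager:GetSecretValue. The secret name, the region, and the helper names fetchSecretFromAws and createConfigurator are hypothetical. Only datadome.configure and the serverSideKey option come from this page.

```javascript
// Hypothetical secret name; replace it with your own.
const SECRET_ID = "datadome/server-side-key";

// Reads the secret string with the AWS SDK v3 Secrets Manager client.
async function fetchSecretFromAws(secretId) {
  const { SecretsManagerClient, GetSecretValueCommand } = require("@aws-sdk/client-secrets-manager");
  const client = new SecretsManagerClient({ region: "us-east-1" }); // assumed region
  const result = await client.send(new GetSecretValueCommand({ SecretId: secretId }));
  return result.SecretString;
}

// Returns a function that configures the module once per execution environment
// and reuses the same promise on later invocations.
function createConfigurator(datadome, fetchSecret, secretId = SECRET_ID) {
  let ready = null;
  return function ensureConfigured() {
    if (ready === null) {
      ready = fetchSecret(secretId)
        .then((serverSideKey) => {
          datadome.configure({ serverSideKey });
        })
        .catch((error) => {
          ready = null; // allow a retry on the next invocation
          throw error;
        });
    }
    return ready;
  };
}

// Possible use in index.js:
// const datadome = require("./datadome");
// const ensureConfigured = createConfigurator(datadome, fetchSecretFromAws);
// exports.handler = (event, context, callback) => {
//   ensureConfigured()
//     .then(() => datadome.handler(event, context, callback))
//     .catch(callback);
// };
```

The secret is fetched only on the first invocation of each execution environment. That first call adds latency, which is worth checking against the 1 second function timeout set during installation.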

Can I use CloudFront Functions?

It is not possible to set up DataDome inside CloudFront Functions as they do not provide network access to call third-party APIs - see Restrictions on CloudFront Functions.

How can I enable GraphQL support?

Since version 1.19.0 of the Node.js Lambda@Edge module, it is possible to enable GraphQL support:

  1. Change the value of DATADOME_ENABLE_GRAPHQL_SUPPORT to true inside the datadome.js file:
const DATADOME_ENABLE_GRAPHQL_SUPPORT = true;
  2. Configure the CloudFront distribution behavior:
  • Allow POST methods
  • Include the body of the request in the DataDome Lambda@Edge function on the viewer request by checking the Include body box.

Some headers are missing on viewer requests. How can I get them?

Some custom headers, such as X-DataDome-ClientID for the session by header feature, might not be visible on viewer requests.
To make them available, we recommend changing the origin request policy of your CloudFront distribution to include all headers.
You can achieve this by using the managed policy called AllViewer, as described in the AWS documentation.

How can I enable HTTP Client hints?

HTTP Client hints can be collected by the DataDome module to enhance the detection.

For these values to be defined by the browser, the Accept-CH header must be sent by the origin. You can achieve this by using a response header policy.

For the header Accept-CH set the value:

Sec-CH-UA,Sec-CH-UA-Mobile,Sec-CH-UA-Platform,Sec-CH-UA-Arch,Sec-CH-UA-Full-Version-List,Sec-CH-UA-Model,Sec-CH-Device-Memory
Create a response header policy for Accept-CH
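
If you prefer to send the header from the origin itself rather than from a response header policy, the value can be built as in the sketch below. The list of hints is the one given above; the helper name withAcceptCH is hypothetical, and the sketch assumes a Node.js origin that assembles its response headers as a plain object.

```javascript
// Client hints listed above.
const CLIENT_HINTS = [
  "Sec-CH-UA",
  "Sec-CH-UA-Mobile",
  "Sec-CH-UA-Platform",
  "Sec-CH-UA-Arch",
  "Sec-CH-UA-Full-Version-List",
  "Sec-CH-UA-Model",
  "Sec-CH-Device-Memory",
];

// Comma-separated value for the Accept-CH header.
const ACCEPT_CH_VALUE = CLIENT_HINTS.join(",");

// Hypothetical helper: returns a copy of the response headers with Accept-CH set.
function withAcceptCH(headers) {
  return { ...headers, "Accept-CH": ACCEPT_CH_VALUE };
}
```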

How to upgrade the module

Update the code of the lambda and publish a new version:

  1. Select the Lambda function to update.
  2. Save the active configuration of the module (see the list of possible settings in the Settings section).
  3. Click on Upload from, then Amazon S3 location.
  4. Both the Node.js and Python runtimes are supported. Paste the S3 location that matches your runtime and click on Save:
Node.js: https://s3.amazonaws.com/dd-lambda-edge/datadome-lambda-edge-latest.zip
Python: https://s3.amazonaws.com/dd-lambda-edge/datadome-lambda-edge-py-latest.zip
  5. Replace YOUR_DATADOME_LICENSE_KEY with your own DataDome server-side key, available in your DataDome dashboard.
  6. Restore the other configuration values saved in step 2.
  7. Deploy this up-to-date version to Lambda@Edge: click on Actions, then Deploy to Lambda@Edge.
  8. Select the CloudFront distribution and click on Deploy.