
Protect S3 Data Endpoints with Cyral

You can protect your S3 buckets using Cyral. Once you've associated a Cyral sidecar with your S3 storage, data users can connect to that storage using SSO authentication, and Cyral will monitor activity there.

Track the S3 endpoint in Cyral

  • In the Cyral management console, navigate to the Data Repos tab and click the plus button.

  • In the pop-up dialog, enter Amazon S3 as the Repository Type, and click Track.


Install the sidecar

  • If you don't already have a sidecar deployed that can serve this repository, add it now as shown in Install a sidecar.


Associate the S3 endpoint with your sidecar

To protect a repository and allow users to connect to it, you must associate the tracked repository with its sidecar.

  • In the Sidecars tab, select the sidecar to which you'd like to assign the repository and click the plus sign.

  • In the Assign a Repository window, choose the name of the S3 repository you created above, and specify the hostname and port where data users will connect to this repository. The TLS toggle is always set to ON for S3 repositories.

  • In the pop-up dialog, select your repository and click Track.

Your S3 data endpoint is now accessible through the Cyral sidecar. See the next section for connection instructions.

Create a data map and policy to protect S3 endpoints

You protect an S3 endpoint in Cyral in the same way that you protect any data repository: 

  • Create a data map listing the S3 endpoints to be protected, applying a label or labels to each endpoint. Policies refer to endpoints using only labels. Note that many endpoints can be grouped under one label, so you can write an easy-to-read policy that treats your endpoints consistently, even when you have many S3 buckets to protect. 

  • Write your policy as you would for a data repository, specifying the users and groups who can access each label, and which actions they can perform. 


See the sections below to learn how to create data maps and policies that protect S3 endpoints. (For an introduction to the structure of data maps and policies, see the policy guide.)


Data map for S3 endpoints

Data maps follow the structure shown below. Within a label declaration you can add as many repo entries as you like. Each repo entry represents one or more locations in a database or data collection.



Data map:

FINANCE_2019:
  - repo: S3
    attributes:
      - finance-data-role-a.2019
FINANCE_2020:
  - repo: S3
    attributes:
      - finance-data-role-a.2020
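
To make the label lookup concrete, here is a minimal Python sketch of how an S3 object path could be resolved to a data map label. It assumes that each attribute is a dotted bucket.prefix path and that a location matches when the attribute's components are a prefix of the object's path components. The matching rules Cyral actually applies are not described in this article, and the names DATA_MAP, to_components, and labels_for are hypothetical.

```python
# Hypothetical in-memory copy of the data map shown above.
DATA_MAP = {
    "FINANCE_2019": [{"repo": "S3", "attributes": ["finance-data-role-a.2019"]}],
    "FINANCE_2020": [{"repo": "S3", "attributes": ["finance-data-role-a.2020"]}],
}


def to_components(bucket, key):
    """Split a bucket name and object key into path components."""
    return [bucket] + [part for part in key.split("/") if part]


def labels_for(repo, bucket, key, data_map=DATA_MAP):
    """Return the labels whose attributes are a component-wise prefix of the object path.

    Assumption: simple prefix matching on path components. Attributes that
    contain quoted components with dots are not handled by this sketch.
    """
    location = to_components(bucket, key)
    matches = set()
    for label, entries in data_map.items():
        for entry in entries:
            if entry["repo"] != repo:
                continue
            for attribute in entry["attributes"]:
                prefix = attribute.split(".")
                if location[: len(prefix)] == prefix:
                    matches.add(label)
    return sorted(matches)


print(labels_for("S3", "finance-data-role-a", "2020/december"))
print(labels_for("S3", "finance-data-role-a", "2019/summary.csv"))
```

Because many attributes can sit under one label, a lookup like this returns the same label for every object under a listed prefix, which is what lets a policy stay short.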



Policy for S3 endpoints

Your policy applies access rules to your data map labels:


Policy:

data:
  - FINANCE_2019
  - FINANCE_2020
rules:
  - identities:
      users:
        - frank.hardy@hhiu.us
    reads:
      - data:
          - FINANCE_2020
        rows: any
        severity: medium
    updates:
      - data:
          - FINANCE_2020
        rows: any
        severity: medium
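
One way to read the policy above is as a set of grants per identity: a request is allowed only when a rule for the user lists the label under the matching access type. The Python sketch below illustrates that reading. It is not Cyral's policy engine; it assumes default-deny for labels with no matching grant, and the names POLICY and is_allowed are hypothetical.

```python
# Hypothetical dictionary form of the policy shown above.
POLICY = {
    "data": ["FINANCE_2019", "FINANCE_2020"],
    "rules": [
        {
            "identities": {"users": ["frank.hardy@hhiu.us"]},
            "reads": [{"data": ["FINANCE_2020"], "rows": "any", "severity": "medium"}],
            "updates": [{"data": ["FINANCE_2020"], "rows": "any", "severity": "medium"}],
        }
    ],
}


def is_allowed(policy, user, access_type, label):
    """Return True if some rule grants `user` the access type on `label`.

    access_type is the rule key, for example "reads" or "updates".
    Anything without a matching grant is denied (an assumption of this sketch).
    """
    for rule in policy["rules"]:
        if user not in rule.get("identities", {}).get("users", []):
            continue
        for grant in rule.get(access_type, []):
            if label in grant["data"]:
                return True
    return False


print(is_allowed(POLICY, "frank.hardy@hhiu.us", "reads", "FINANCE_2020"))
print(is_allowed(POLICY, "frank.hardy@hhiu.us", "reads", "FINANCE_2019"))
```

Under these assumptions the first call is allowed and the second is denied, since no rule grants any access to FINANCE_2019.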



Once you've saved your policy and data map, you're ready to start letting users connect. 

  • To see sample queries and the log entries they generate, see the next two sections on activity logs.

  • To learn how to connect through the sidecar, see Connect to an S3 data endpoint through Cyral, later in this document.


Activity log for an allowed S3 endpoint access event

When a user submits the following request:


>  aws s3 cp s3://finance-data-role-a/2020/december ./


Cyral allows access and generates a log entry like the following example:


Activity Log 1:

{

    "activityId": "3.138.37.216:39752:1628109069169872320:1",

    "activityTime": "2021-08-04 20:31:09.183801957 +0000 UTC",

    "activityTimeNanos": 1628109069183801957,

    "activityTypes": [

        "query"

    ],

    "identity": {

        "endUser": "frank.hardy@hhiu.us",

        "endUserEmail": "frank.hardy@hhiu.us",

        "repoUser": "728433162697:role/roleA"

    },

    "repo": {

        "id": "1uZpiLQs1SrgcXqxA8x0lNaPrHu",

        "name": "S3",

        "type": "s3",

        "host": "s3.amazonaws.com",

        "port": 443

    },

    "client": {

        "connectionId": "3.138.37.216:39752:1628109069169872320",

        "connectionTime": "2021-08-04 20:31:09.16987232 +0000 UTC",

        "connectionTimeNanos": 1628109069169872320,

        "host": "3.138.37.216",

        "port": 39752,

        "applicationName": "aws-cli/1.18.147 Python/2.7.18 Linux/4.14.232-177.418.amzn2.x86_64 botocore/1.18.6"

    },

    "sidecar": {

        "id": "1vmhFBIeGriWeZbgLy7aiyFxv9i",

        "name": "jc-mongo-demo",

        "autoScalingGroupInstance": "i-0cbcca8a2423024c7"

    },

    "request": {

        "statement": "get: finance-data-role-a/2020/december",

        "statementType": "GET OBJECT",

        "isSensitive": true,

        "datasetsAccessed": [

            {

                "dataset": "finance-data-role-a",

                "accessType": "read"

            }

        ],

        "fieldsAccessed": [

            {

                "field": "finance-data-role-a.2020.december",

                "label": "FINANCE_2020",

                "accessType": "read"

            }

        ]

    },

    "response": {

        "message": "OK",

        "isError": false,

        "records": 1,

        "bytes": 1014654,

        "executionTime": "61.237082ms",

        "executionTimeNanos": 61237082

    },

    "policyViolated": false

}


Activity log for a blocked S3 endpoint access event

When a user submits the following request:


>  aws s3 cp s3://finance-data-role-a/2019/summary.csv ./


Cyral blocks the request and generates a log entry like the following example:


Activity Log 2:

{

    "activityId": "3.138.37.216:42970:1628109149859249807:1",

    "activityTime": "2021-08-04 20:32:30.019326479 +0000 UTC",

    "activityTimeNanos": 1628109150019326479,

    "activityTypes": [

        "query"

    ],

    "identity": {

        "endUser": "frank.hardy@hhiu.us",

        "endUserEmail": "frank.hardy@hhiu.us",

        "repoUser": "728433162697:role/roleA"

    },

    "repo": {

        "id": "1uZpiLQs1SrgcXqxA8x0lNaPrHu",

        "name": "S3",

        "type": "s3",

        "host": "s3.amazonaws.com",

        "port": 443

    },

    "client": {

        "connectionId": "3.138.37.216:42970:1628109149859249807",

        "connectionTime": "2021-08-04 20:32:29.859249807 +0000 UTC",

        "connectionTimeNanos": 1628109149859249807,

        "host": "3.138.37.216",

        "port": 42970,

        "applicationName": "aws-cli/1.18.147 Python/2.7.18 Linux/4.14.232-177.418.amzn2.x86_64 botocore/1.18.6"

    },

    "sidecar": {

        "id": "1vmhFBIeGriWeZbgLy7aiyFxv9i",

        "name": "jc-mongo-demo",

        "autoScalingGroupInstance": "i-0cbcca8a2423024c7"

    },

    "request": {

        "statement": "head: finance-data-role-a/2019/summary.csv",

        "statementType": "HEAD OBJECT",

        "isSensitive": true,

        "datasetsAccessed": [

            {

                "dataset": "finance-data-role-a",

                "accessType": "read"

            }

        ],

        "fieldsAccessed": [

            {

                "field": "finance-data-role-a.2019.\"summary.csv\"",

                "label": "FINANCE_2019",

                "accessType": "read"

            }

        ]

    },

    "policyViolated": true,

    "policyViolations": [

        {

            "label": "FINANCE_2019",

            "policyName": "S3_Policy",

            "policyId": "1wHGf3niEerzFObdtKm7qn38qo5",

            "accessType": "read",

            "selectedIdentity": "user:frank.hardy@hhiu.us",

            "reasons": [

                "read is disallowed"

            ],

            "severity": "high"

        }

    ],

    "blockedQuery": "requestBlocked"

}
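
If you forward these logs to another system, a small script can pull out the violations. The Python sketch below uses only field names that appear in the sample entries above; the embedded entry is a trimmed copy of the blocked-access example, and the helper name summarize is hypothetical.

```python
import json

# Trimmed copy of the blocked-access entry shown above (only the fields used here).
BLOCKED_ENTRY = json.loads("""
{
    "identity": {"endUser": "frank.hardy@hhiu.us"},
    "request": {"statement": "head: finance-data-role-a/2019/summary.csv"},
    "policyViolated": true,
    "policyViolations": [
        {
            "label": "FINANCE_2019",
            "policyName": "S3_Policy",
            "accessType": "read",
            "reasons": ["read is disallowed"],
            "severity": "high"
        }
    ],
    "blockedQuery": "requestBlocked"
}
""")


def summarize(entry):
    """Return one human-readable line per policy violation in a log entry."""
    if not entry.get("policyViolated"):
        return []
    user = entry["identity"]["endUser"]
    statement = entry["request"]["statement"]
    lines = []
    for violation in entry.get("policyViolations", []):
        reasons = "; ".join(violation["reasons"])
        lines.append(
            f"{user} violated {violation['policyName']} "
            f"({violation['severity']}): {violation['accessType']} on "
            f"{violation['label']} via '{statement}' ({reasons})"
        )
    return lines


for line in summarize(BLOCKED_ENTRY):
    print(line)
```

Entries for allowed requests, where policyViolated is false, produce no lines.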




Connect to an S3 data endpoint through Cyral


These instructions assume you have already performed the following tasks:

  • Tracked the S3 endpoint in Cyral

  • Deployed a sidecar

  • Associated the S3 endpoint with the sidecar in the Cyral Control Plane


To connect to your S3 endpoint through the sidecar, take the following steps:


  1. Get the endpoint address

Go to the Cyral Control Plane and find the sidecar endpoint and port associated with the S3 data repository: click Sidecars, click the name of the sidecar, find the name of your S3 data endpoint, and note the sidecar endpoint address.

In this example, the endpoint is www.sidecar-endpoint.com and the port is 453.


  2. Store the endpoint address as the proxy address

In a shell session, export the proxy environment variables that the AWS CLI reads, setting them to the sidecar endpoint address and port from the previous step.

If the host machine is an AWS EC2 instance, also export the following variable:


export NO_PROXY=169.254.169.254


Detailed information and settings for configuring the proxy endpoint on different systems are available at: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-proxy.html
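
As a sketch of this step, the Python snippet below builds the environment that a wrapper script could pass to the AWS CLI. It assumes the standard HTTPS_PROXY variable described on the AWS page linked above, the example endpoint and port from the first step, and the http:// scheme that this guide uses for the sidecar proxy address in its profile-based configuration. The helper name proxy_env is hypothetical.

```python
import os


def proxy_env(sidecar_host, sidecar_port, on_ec2=False, base_env=None):
    """Return a copy of the environment with proxy settings for the AWS CLI.

    HTTPS_PROXY is an assumption based on the AWS CLI proxy documentation.
    NO_PROXY keeps EC2 instance metadata requests (169.254.169.254) off the proxy.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HTTPS_PROXY"] = f"http://{sidecar_host}:{sidecar_port}"
    if on_ec2:
        env["NO_PROXY"] = "169.254.169.254"
    return env


env = proxy_env("www.sidecar-endpoint.com", 453, on_ec2=True)
print(env["HTTPS_PROXY"])
print(env["NO_PROXY"])
# The resulting mapping could be passed as env=env to subprocess.run when invoking the CLI.
```

Building a copy of the environment, rather than changing os.environ in place, keeps the proxy settings scoped to the commands that should go through the sidecar.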


  3. Get an SSO token and add it to your credentials file

Append the following profile information to the AWS credentials file. This file is usually found at ~/.aws/credentials.


[sidecar]
aws_access_key_id=userA01@company.com:c17c7b65c159544f75bd98e91d584
aws_secret_access_key=none


  • userA01@company.com is the email address used to log in to the Control Plane.

  • c17c7b65c159544f75bd98e91d584 is the access token copied from the Control Plane. You can also get the token from the Cyral CLI token retriever for SSO, gimme_db_token. For instructions, see Connect to S3 data using the CLI token retriever.
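
If you script this step, the profile can be written with Python's standard configparser module. The sketch below is illustrative only: the helper name add_sidecar_profile is hypothetical, and the demo writes to a temporary file rather than the real credentials file. Note that rewriting an existing file this way discards any comments it contains.

```python
import configparser
import os
import tempfile


def add_sidecar_profile(credentials_path, email, token, profile="sidecar"):
    """Add or update a profile whose access key ID is '<email>:<token>'."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(credentials_path)  # a missing file is silently ignored
    config[profile] = {
        "aws_access_key_id": f"{email}:{token}",
        "aws_secret_access_key": "none",
    }
    with open(credentials_path, "w") as handle:
        config.write(handle)


# Demonstrate against a temporary file instead of ~/.aws/credentials.
demo_path = os.path.join(tempfile.mkdtemp(), "credentials")
add_sidecar_profile(demo_path, "userA01@company.com", "c17c7b65c159544f75bd98e91d584")
with open(demo_path) as handle:
    print(handle.read())
```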


  4. Access S3 using the profile you created

Run S3 commands specifying the previously created sidecar profile:


> aws s3 ls --profile sidecar


This uses the credentials collected from the Control Plane to validate the user, and sends all traffic through the sidecar. Logs should be available at the log location configured in your Cyral installation, such as CloudWatch, ELK, or Splunk.



Alternative and extra configurations

Adding a certificate authority bundle to the CLI tool

The sidecar needs to intercept the TLS communication between the user and the AWS servers. To keep the connection secure, the sidecar signs the messages using its own certificate. Some CLI clients and tools may reject the certificate used by the sidecar, so an extra configuration step is needed:


First, download the certificate bundle from the control plane using the following command:

curl https://<CONTROLPLANE_ENDPOINT>:8000/v1/templates/ca_bundle -o cyral_ca_bundle.pem


Next, provide the CA bundle to the AWS CLI using one of the following options.

  a. Using an environment variable:

export AWS_CA_BUNDLE=/path/to/cyral_ca_bundle.pem


  • cyral_ca_bundle.pem: The AWS CLI tool uses this file to validate the certificate sent by the sidecar. This file will be shared by Cyral's Support team.

  b. Using profile configuration: Append the following information to the AWS CLI configuration file (its location is either the default path or the path defined by the corresponding environment variable):

[profile sidecar]
ca_bundle = /path/to/cyral_ca_bundle.pem


Options (a) and (b) above are equivalent; only one needs to be performed.


Using a profile configuration instead of environment variables for proxy settings


Throughout this guide, we've used environment variables to configure the proxy settings used by the AWS CLI tool. An alternative approach is to add the proxy settings to the configuration profile instead. To do this, we need to add a third-party plugin, "awscli-plugin-s3-proxy", to the AWS CLI tool.


For the steps below, we are assuming the profile sidecar_sample_profile will be used. 


Note: If your system doesn't recognize the python -m pip command, you can try one of the following:
  • Replace python -m pip with just pip in the CLI commands
  • Replace python -m pip with either python3 -m pip or just pip3 (for environments with pip support for both Python 2 and Python 3)
  • Install pip following: https://pip.pypa.io/en/stable/installing/


Procedure


1. Install the plugin, "awscli-plugin-s3-proxy", using python pip:

python -m pip install awscli-plugin-s3-proxy --user


2. If you are using the AWS CLI v2 (check with aws --version), you will need to define the appropriate location for the plugin:

a. Get the plugin location:

python -m pip show awscli-plugin-s3-proxy

The output should include a Location key, for example:

  • Linux: Location: /home/<username>/.local/lib/python3.8/site-packages
  • Mac: Location: /Users/<username>/.local/lib/python3.8/site-packages
  • Windows: Location: c:\Users\<username>\Application Data\python\python38\site-packages

b. Set the plugin location with the command:

aws configure set plugins.cli_legacy_plugin_path <LOCATION>

For the Mac output above, the command would be:

aws configure set plugins.cli_legacy_plugin_path /Users/<username>/.local/lib/python3.8/site-packages


3. Configure using the AWS CLI:

aws configure set plugins.s3-proxy awscli_plugin_s3_proxy


4. Add the sidecar as a proxy for S3:

aws configure --profile sidecar_sample_profile set s3.proxy http://<YOUR_SIDECAR_ADDRESS>:453

replacing <YOUR_SIDECAR_ADDRESS> with the actual address.


5. Add the Cyral CA bundle to the same profile:

aws configure --profile sidecar_sample_profile set ca_bundle /path/to/cyral_ca_bundle.pem
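
Step 2 of the procedure above asks you to copy the Location value out of the pip output by hand. If you automate the setup, the value can be extracted with a few lines of Python. This is a minimal sketch: the helper name plugin_location is hypothetical, and the sample text is a trimmed stand-in for real pip show output that reuses the Mac path from the example above.

```python
def plugin_location(pip_show_output):
    """Return the value of the 'Location:' line from `pip show` output, or None."""
    for line in pip_show_output.splitlines():
        if line.startswith("Location:"):
            # Split only on the first colon so Windows drive letters survive.
            return line.split(":", 1)[1].strip()
    return None


# Trimmed, illustrative output; real output contains more fields.
SAMPLE_OUTPUT = """Name: awscli-plugin-s3-proxy
Location: /Users/<username>/.local/lib/python3.8/site-packages
"""

location = plugin_location(SAMPLE_OUTPUT)
print(location)
# The value would then be passed to:
#   aws configure set plugins.cli_legacy_plugin_path <LOCATION>
```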


Enable SSO login on your S3 data endpoint

To enable SSO login on your S3 data repository, select the S3 repository and go to the Advanced tab. Here, make the following settings.

  1. Choose your Identity Provider from the drop-down box.

  2. Unselect the checkbox Allow native authentication.

  3. Under Enforcement, enable Block on violations.




Provide the IAM roles needed for accessing S3

Under the “Local Accounts” tab for the S3 repo, specify the IAM roles that provide various levels of access to S3. Depending on which end user is connecting, the sidecar will assume one of these roles and make the S3 request using the local account assigned to that user. For more information on IAM role settings, see “Make AWS IAM role settings” at the end of this section.


  1. In the Cyral control plane UI, go to Repositories, click Local Accounts, and click Track Account.

  2. In the Track Account form, enter the IAM role ARN.

  3. Click Track.


The screenshot below shows, as an example, a role that provides full S3 access.


Map an SSO user or group to the IAM Role

When a user authenticates, they can be mapped to use a specific IAM role to access S3 based on their username, or based on their membership in an SSO group. Set up the mapping as follows.


  1. In the Repositories page, click Identity to Account Map and click the plus sign.

  2. Choose User or Group as the identity type.

  3. In the Identity field, specify the SSO user name or group name as it's written in your identity service.

  4. In the Local Account field, choose the name of the IAM role.

  5. In the Duration field, set a length of validity for the access, or click Unlimited to grant access that will not expire automatically.

  6. Click Create.


In the screenshot below, users belonging to the “Data Analysts” SSO group are mapped to the “SidecarS3FullAccess” role.



Make AWS IAM role settings

It is important to make sure that the IAM role associated with the sidecar is trusted by the IAM roles used for managing S3. 


Find your sidecar host role

After you deploy the sidecar, the IAM role associated with it is created with the name <sidecar-cft-stack-name>-SidecarHostRole-*.

Below is an example sidecar host role created for a sidecar with the stack name “jc-t01”:


> aws iam list-roles



// Sidecar Role

{

    "Path": "/",

    "RoleName": "jc-t01-SidecarHostRole-MOVF2C5ORWCY",

    "RoleId": "AROA2TGP77HETTOXZZB46",

    "Arn": "arn:aws:iam::<accountId>:role/jc-t01-SidecarHostRole-MOVF2C5ORWCY",

    "CreateDate": "2020-12-22T23:58:42Z",

    "AssumeRolePolicyDocument": {

        "Version": "2012-10-17",

        "Statement": [

            {

                "Effect": "Allow",

                "Principal": {

                    "Service": "ec2.amazonaws.com"

                },

                "Action": "sts:AssumeRole"

            }

        ]

    },

    "Description": "",

    "MaxSessionDuration": 3600

}


Create suitable IAM roles for S3 access and establish trust with sidecar role

The roles you create, and the permissions each role grants, are up to your organization. The example below shows two roles, “SidecarReadOnlyRole” and “SidecarS3FullAccess”. There is a trust relationship between these roles and the SidecarHostRole, which allows the sidecar to assume these roles.


// S3 Access Read Only

{

    "Path": "/",

    "RoleName": "SidecarReadOnlyRole",

    "RoleId": "AROA2TGP77HEZGE2FX3G4",

    "Arn": "arn:aws:iam::<accountId>:role/SidecarReadOnlyRole",

    "CreateDate": "2020-12-14T18:48:08Z",

    "AssumeRolePolicyDocument": {

        "Version": "2012-10-17",

        "Statement": [

            {

                "Effect": "Allow",

                "Principal": {

                    "Service": "ec2.amazonaws.com"

                },

                "Action": "sts:AssumeRole"

            },

            {

                "Effect": "Allow",

                "Principal": {

                    // ARN of the sidecar IAM role

                    "AWS": "arn:aws:iam::<accountId>:role/jc-t01-SidecarHostRole-MOVF2C5ORWCY"

                },

                "Action": "sts:AssumeRole"

            }

        ]

    },

    "Description": "Allows EC2 instances to call AWS services on your behalf.",

    "MaxSessionDuration": 3600

}




// S3 Role Full Access

{

    "Path": "/",

    "RoleName": "SidecarS3FullAccess",

    "RoleId": "AROA2TGP77HEYZQT3DH4A",

    "Arn": "arn:aws:iam::<accountId>:role/SidecarS3FullAccess",

    "CreateDate": "2020-12-14T18:48:32Z",

    "AssumeRolePolicyDocument": {

        "Version": "2012-10-17",

        "Statement": [

            {

                "Effect": "Allow",

                "Principal": {

                    "Service": "ec2.amazonaws.com"

                },

                "Action": "sts:AssumeRole"

            },

            {

                "Effect": "Allow",

                "Principal": {

                    // ARN of the sidecar IAM role

                    "AWS": "arn:aws:iam::<accountId>:role/jc-t01-SidecarHostRole-MOVF2C5ORWCY"

                },

                "Action": "sts:AssumeRole"

            }

        ]

    },

    "Description": "Allows EC2 instances to call AWS services on your behalf.",

    "MaxSessionDuration": 3600

}



By having the sidecar role as a trusted entity for your S3 management roles, you are allowing the sidecar to assume these custom roles, when required, for connecting to the S3 servers when handling SSO connections.
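
If you manage many S3 roles, the trust policy can be generated rather than edited by hand. The Python sketch below builds a document with the same two statements as the AssumeRolePolicyDocument examples above. The helper name trust_policy is hypothetical, and the account ID 123456789012 is a placeholder rather than a real account.

```python
import json


def trust_policy(sidecar_role_arn):
    """Build an assume-role policy document that trusts EC2 and the sidecar host role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            },
            {
                "Effect": "Allow",
                # ARN of the sidecar IAM role
                "Principal": {"AWS": sidecar_role_arn},
                "Action": "sts:AssumeRole",
            },
        ],
    }


# Placeholder account ID; substitute the ARN of your own sidecar host role.
arn = "arn:aws:iam::123456789012:role/jc-t01-SidecarHostRole-MOVF2C5ORWCY"
print(json.dumps(trust_policy(arn), indent=4))
```

The printed JSON contains no comments, so it can be pasted directly as a role's trust relationship.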


This information can also be edited in the AWS Console in the IAM → Roles section.



Users can connect to S3 using the CLI token retriever for SSO

Data users can use the Cyral CLI token retriever for SSO, gimme_db_token, to quickly authenticate and connect to their S3 data endpoints. For instructions, see Connect to S3 data using the CLI token retriever.



