You can configure your Apache NiFi instance to load data from a Cyral-protected S3 bucket. This article assumes you’ve already configured a sidecar. If you haven’t, see Protect S3 Data Endpoints with Cyral.
Set the proxy address
Each S3 processor in NiFi has a Proxy Host field and a Proxy Host Port field. Set these to the address and port of the Cyral sidecar that protects your S3 instance. For details, see the Apache NiFi ListS3 documentation.
For example, if your pipeline uses the ListObjects and GetObject operations (to retrieve the data from bucket A) and the PutObject operation (to insert the data into bucket B), then you must set the Cyral sidecar as the proxy for all three NiFi processors.
To set the proxy:
In the NiFi dashboard, double-click on your processor to open the Configuration panel. Then click Properties.
Enter the sidecar endpoint in the Proxy Host field (no http:// or https:// scheme is required). Enter the sidecar listening port in the Proxy Host Port field. You can leave Proxy Username and Proxy Password blank.
Scroll up in the properties page and add your SSO credentials:
For Access Key ID, use ssoUserName:CyralAccessToken
For Secret Access Key, use “none”
Enter any other required configuration values, such as the bucket name and object key.
Click Apply.
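The values entered in the steps above can be sketched as a small shell helper. This is only an illustration: the function names proxy_host_value and access_key_id_value, and the sample host name in the usage note, are hypothetical and not part of NiFi or Cyral. The helper strips any URL scheme from the sidecar endpoint so it suits the Proxy Host field, and joins the SSO user name and Cyral access token in the ssoUserName:CyralAccessToken form used for the Access Key ID.

```shell
#!/bin/sh
# Hypothetical helpers for preparing NiFi property values by hand.
# They are illustrative only and are not provided by NiFi or Cyral.

# Print the sidecar endpoint without a leading scheme (http://, https://, ...)
# and without trailing slashes, as expected by the Proxy Host field.
proxy_host_value() {
  printf '%s\n' "$1" | sed -e 's#^[A-Za-z][A-Za-z0-9+.-]*://##' -e 's#/*$##'
}

# Print the Access Key ID value in the form ssoUserName:CyralAccessToken.
# $1 is the SSO user name, $2 is the Cyral access token.
access_key_id_value() {
  printf '%s:%s\n' "$1" "$2"
}
```

For instance, proxy_host_value "https://sidecar.example.com" prints sidecar.example.com, which is the form to paste into Proxy Host. The Secret Access Key stays “none”, as described above.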
Add the Cyral CA to the trusted store
Include the Cyral CA in the default trusted store used by the Java virtual machines that run your NiFi processors. (For an example of adding a trusted CA to a JVM, see these example instructions.)
To get the Cyral CA cert, use the API call:
curl https://<CONTROLPLANE_ENDPOINT>:8000/v1/templates/ca_bundle -o cyral_ca_bundle.pem
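Once the bundle is downloaded, importing it into a JVM trust store might look like the sketch below. It assumes the JDK's keytool is on the PATH and that the bundle may contain more than one certificate; because keytool generally imports a single certificate per trusted entry, the sketch first splits the bundle into one file per certificate. The function names, the cyral-ca-N aliases, and the temporary file layout are hypothetical choices, and the trust store path and password depend on your Java installation.

```shell
#!/bin/sh
# Hypothetical helpers; names, aliases, and file layout are illustrative.

# Split a PEM bundle ($1) into one file per certificate under directory $2.
# Files are named cyral-ca-1.pem, cyral-ca-2.pem, ...
# Prints the number of certificates found.
split_pem_bundle() {
  bundle=$1
  outdir=$2
  mkdir -p "$outdir"
  awk -v dir="$outdir" '
    /-----BEGIN CERTIFICATE-----/ { n++; out = sprintf("%s/cyral-ca-%d.pem", dir, n) }
    out != "" { print > out }
    /-----END CERTIFICATE-----/ { close(out); out = "" }
    END { print n + 0 }
  ' "$bundle"
}

# Import every certificate from bundle $1 into trust store $2 using
# store password $3. Each certificate gets its own alias.
import_cyral_ca() {
  bundle=$1
  keystore=$2
  storepass=$3
  workdir=$(mktemp -d)
  count=$(split_pem_bundle "$bundle" "$workdir")
  i=1
  while [ "$i" -le "$count" ]; do
    keytool -importcert -noprompt -trustcacerts \
      -alias "cyral-ca-$i" \
      -file "$workdir/cyral-ca-$i.pem" \
      -keystore "$keystore" -storepass "$storepass"
    i=$((i + 1))
  done
  rm -rf "$workdir"
}
```

A typical call could be import_cyral_ca cyral_ca_bundle.pem "$JAVA_HOME/lib/security/cacerts" changeit, where the path and the changeit password are the usual defaults on recent JDKs and should be checked against your own installation. Restart NiFi afterwards so its JVM reloads the trust store.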
Once you’ve made the above configurations, the Cyral sidecar monitors requests to S3 from your NiFi instance. These requests produce corresponding activity logs and can be governed through data access policies.
Additional resources
See Protect S3 Data Endpoints with Cyral to understand how to configure Cyral to protect S3 resources, and how to write a policy that grants selective access to data in S3.