Trino + Apache Ranger on Kubernetes (Helm): A Practical, Audited Data Query Platform
Accelerate secure analytics on Kubernetes by combining Trino's fast, federated SQL engine with Apache Ranger's fine‑grained authorization and auditing. This guide gives you a reproducible Helm-based deployment with MySQL-backed Ranger policies, Solr audits, and an optional TPCH catalog for quick validation. It also includes production hardening tips, troubleshooting, and common pitfalls so you can run with confidence—not just "get it to start."
What you'll build
You'll deploy:
- Ranger namespace with three core services: ranger-admin, ranger-mysql, and ranger-solr, packaged via Helm.
- A Trino coordinator StatefulSet configured to enforce authorization through Apache Ranger.
- Optional TPCH connector for zero‑dependency test queries.
- End‑to‑end auditing to Solr, visible from the Ranger Admin UI.
Repo with Helm charts and Dockerfile: https://github.com/karthigaiselvanm/trino-ranger-k8s/tree/main
Reference: Building a custom Ranger image with MySQL + Solr baked in: https://www.k2ddna.com/2025/08/building-apache-ranger-from-source.html
Why Trino + Ranger?
- Centralized policy management: Define catalog/schema/table permissions once, enforce everywhere.
- Attribute- and action-level control: Allow query execution, schema visibility, impersonation, or updates—per user or group.
- Full audit trail: Who queried what, when, and whether it was allowed or blocked.
Prerequisites
- A working Kubernetes cluster (kind, Minikube, K3s, or managed like EKS/GKE/AKS).
- kubectl configured against your cluster.
- Helm 3.x installed.
- A custom Apache Ranger Docker image that includes Ranger Admin, MySQL, and Solr, built per this guide: https://www.k2ddna.com/2025/08/building-apache-ranger-from-source.html
Optional but recommended: 2 vCPUs, 4–8 GB RAM for a smooth local experience.
Quick start
1) Clone the repo
git clone https://github.com/karthigaiselvanm/trino-ranger-k8s.git
cd trino-ranger-k8s
2) Install Apache Ranger stack (Admin, MySQL, Solr)
helm install ranger ./ranger --namespace ranger --create-namespace
kubectl get pods -n ranger
Wait for all pods to be Running. Check Ranger Admin logs if needed:
kubectl logs -n ranger deploy/ranger-admin
If you're local, port-forward the UI:
kubectl port-forward -n ranger svc/ranger-admin 6080:6080
Open http://localhost:6080 and log in to Ranger Admin.

3) Install Trino (coordinator)
helm install trino ./trino --namespace ranger
kubectl get pods -n ranger -l app.kubernetes.io/name=trino
kubectl logs -n ranger statefulset/trino-coordinator

When ready, you should see the pod trino-coordinator-0 in the Running state.


How the integration works
- Trino is configured to delegate access decisions to Apache Ranger.
- The deployment includes three key files in the Trino container: access-control.properties, ranger-trino-security.xml, and ranger-trino-audit.xml.
- Audits are shipped to Solr at:
<property>
  <name>xasecure.audit.solr.solr_url</name>
  <value>http://ranger-solr:8983/solr/ranger_audits</value>
</property>
Enable the TPCH connector (optional)
Add the following under catalogs: in trino/values.yaml to enable a test catalog with sample data:
catalogs:
  tpch.properties: |
    connector.name=tpch
After deploying, you can validate with any SQL client by connecting to the Trino service.
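Once connected, a couple of sanity queries against the TPCH catalog confirm the connector is live (the schema and table names below are the standard ones the TPCH connector always exposes):

```sql
SHOW SCHEMAS FROM tpch;
SELECT count(*) FROM tpch.tiny.nation;  -- the standard TPCH nation table has 25 rows
```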
Create the Trino service in Ranger and set policies
1) In Ranger Admin, create a new Trino service.
2) Add policies such as:
| Resource | Permission / Action | Notes |
|---|---|---|
| queryid | ExecuteQuery | Allow users to run queries. |
| catalog, schema, table | select, update, etc. | Grant data access at the desired granularity. |
| trinouser | impersonate | Enable user impersonation where required. |
Start strict: deny by default, then add allow rules for specific users/groups and resources. Validate by attempting allowed and disallowed queries and reviewing audit entries.
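Policies can also be created programmatically through Ranger's public REST API instead of the UI. The sketch below builds a minimal allow policy for trinouser on the tpch catalog and validates the JSON locally before publishing; the service name, resource keys, endpoint, and credentials are assumptions — match them to your Ranger service definition:

```shell
#!/bin/sh
# Build a minimal Ranger policy document. "trino" is assumed to be the
# service name you created in Ranger Admin.
cat > policy.json <<'EOF'
{
  "service": "trino",
  "name": "tpch-read-for-trinouser",
  "resources": {
    "catalog": {"values": ["tpch"]},
    "schema":  {"values": ["*"]},
    "table":   {"values": ["*"]}
  },
  "policyItems": [
    {
      "users": ["trinouser"],
      "accesses": [{"type": "select", "isAllowed": true}]
    }
  ]
}
EOF

# Sanity-check the JSON before sending it anywhere
python3 -c 'import json; json.load(open("policy.json")); print("policy.json OK")'

# Publish to Ranger Admin (run manually; credentials are placeholders):
# curl -u admin:PASSWORD -H "Content-Type: application/json" \
#   -X POST -d @policy.json http://localhost:6080/service/public/v2/api/policy
```

After publishing, the plugin picks the policy up on its next sync; check the audit page to confirm subsequent queries are allowed by the new policy rather than a catch-all rule.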



Verifying end-to-end
- Connectivity: use DBeaver or trino-cli to connect to the Trino service endpoint.
- Authorization: run a query as trinouser on a catalog/schema/table that is not permitted—expect a block.
- Audit: in Ranger Admin → Audits, confirm both allow and deny events are recorded from the Trino plugin.
Example trino-cli commands (adjust host/port):
trino \
--server http://TRINO_SERVICE_HOST:8080 \
--user trinouser \
--execute "SHOW CATALOGS"


Troubleshooting and common pitfalls
- Ranger Admin up, but Trino denies everything: ensure you created the Trino service in Ranger and that policies are published. Check the "Last Policy Update Time" in Ranger plugin status if available.
- No audit events in UI: verify Solr is Running; check the ranger-trino-audit.xml URL and Trino logs for audit delivery errors.
- DNS or service resolution issues: confirm Kubernetes service names match the values in your configs (e.g., ranger-solr) and that you're using the correct namespace.
- Plugin not pulling policies: make sure the Ranger Admin URL is reachable from Trino, and credentials/API user are correct if applicable.
- Local clusters and resources: Solr and MySQL can be resource-hungry. Increase memory or use node selectors/limits to prevent OOM restarts.
Quick checks:
kubectl get pods -n ranger
kubectl describe pod -n ranger trino-coordinator-0
kubectl logs -n ranger trino-coordinator-0
kubectl logs -n ranger deploy/ranger-admin
kubectl logs -n ranger deploy/ranger-solr
Production hardening checklist
- Externalize state: use PersistentVolumes for MySQL and Solr.
- Backups: schedule MySQL backups and Solr index snapshots.
- HA & scaling: deploy multiple Trino workers (this chart starts with a coordinator only). Add workers and test failover.
- TLS everywhere: enable HTTPS for Trino and Ranger; secure Solr and MySQL connections.
- Authentication: integrate with your IdP/LDAP; avoid shared local users in production.
- Network policy: restrict egress/ingress to only what Trino and Ranger require.
- Observability: add Prometheus/Grafana dashboards; set alerts on plugin policy sync failures and audit ingestion lag.
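As one concrete example of the network-policy item above, here is a sketch of a Kubernetes NetworkPolicy that only admits traffic to Ranger Admin from Trino pods. The pod labels and port are assumptions — match them to the labels your charts actually apply:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ranger-admin-ingress
  namespace: ranger
spec:
  podSelector:
    matchLabels:
      app: ranger-admin          # assumed label; check your chart
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: trino
      ports:
        - protocol: TCP
          port: 6080             # Ranger Admin HTTP port
```

Note that NetworkPolicy only takes effect if your cluster runs a CNI plugin that enforces it (e.g., Calico or Cilium).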
Performance tips
- Catalog tuning: only enable connectors you need; unused connectors slow discovery.
- Memory: increase Trino query and heap memory for large datasets.
- Pushdown: choose connectors that support predicate pushdown to reduce scanned data.
- Concurrency: tune query.max-memory, query.max-memory-per-node, and worker counts as you add concurrency.
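The memory settings mentioned above live in Trino's config.properties. The values below are illustrative starting points for a small cluster, not recommendations for your workload:

```properties
# Total distributed memory a single query may use across the cluster
query.max-memory=4GB
# Per-node cap on user memory for a single query
query.max-memory-per-node=1GB
```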
FAQ
Q: Do I need the custom Ranger image?
A: For this setup, yes—the custom image simplifies bootstrapping MySQL and Solr alongside Ranger Admin. You can swap in your own managed MySQL/Solr if preferred—update values and secrets accordingly.
Q: Can I use an external MySQL/Solr?
A: Absolutely. Point the Helm values to your endpoints and credentials; disable the in-cluster deployments.
Q: Where do I see who was blocked or allowed?
A: Ranger Admin → Audits. You'll see user, resource, action, and result (allow/deny) with timestamps and policies applied.
Credits
- Trino: https://trino.io/
- Apache Ranger: https://ranger.apache.org/
- Helm: https://helm.sh/
- Guide that inspired this approach: https://www.k2ddna.com/2025/08/building-apache-ranger-from-source.html
License: Apache 2.0