Building Apache Ranger from source
As of this writing (Aug 03, 2025), Apache Ranger still does not provide a ready-to-deploy tarball, so we need to build it ourselves. This guide walks you through exactly that.
Prerequisites:
- Java 8
- Maven
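Before starting, it can help to confirm that the required tools are actually on your PATH. The following is a minimal sketch and not part of the Ranger project: check_tool is a hypothetical helper name, and git is included only because step 1 relies on it.

```shell
#!/bin/sh
# Report whether a required tool is on PATH.
# Returns 0 if the tool is found, 1 otherwise.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
        return 0
    fi
    echo "missing: $1" >&2
    return 1
}

# Check every tool the build needs and remember whether any is missing.
missing=0
for tool in git java mvn; do
    check_tool "$tool" || missing=1
done
```

If missing ends up as 1, install the reported tools before moving on to the build.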
1. Check out the code from the Git repository
git clone https://gitbox.apache.org/repos/asf/ranger.git
cd ranger
Alternatively, you can check out the code from GitHub:
git clone https://github.com/apache/ranger
cd ranger
2. Ensure you have set JAVA_HOME and execute the following Maven commands
export JAVA_HOME=/path/to/jdk8
mvn -Pall clean
mvn -Pall -DskipTests=false clean compile package install
If the build succeeds, Maven prints a BUILD SUCCESS message at the end of its output.
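A common reason for the Maven build to fail early is a JAVA_HOME that is unset or points at a JRE rather than a JDK. The sketch below is one way to check this before building; is_jdk_home is a hypothetical helper, and the test it applies (bin/java and bin/javac both executable) is an assumption about what a usable JDK directory looks like.

```shell
#!/bin/sh
# Return 0 if the given directory looks like a JDK home,
# i.e. it contains executable bin/java and bin/javac.
is_jdk_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ] && [ -x "$1/bin/javac" ]
}

# Warn (without aborting) if JAVA_HOME is not usable for the build.
if ! is_jdk_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME is unset or does not point to a JDK" >&2
fi
```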
3. Apache Ranger Admin tarball and other plugins
Once the build succeeds, all the required tarballs can be found under the target/ directory.
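To see which tarballs the build actually produced, you can list them from the root of the ranger checkout. This is only a sketch: list_tarballs is a hypothetical helper, and the exact file names depend on the version you built.

```shell
#!/bin/sh
# Print every *.tar.gz file directly inside the given directory, one per line.
# Prints nothing if the directory is missing or holds no tarballs.
list_tarballs() {
    for f in "$1"/*.tar.gz; do
        [ -f "$f" ] && echo "$f"
    done
    return 0
}

# Run from the root of the ranger checkout.
list_tarballs target
```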
There are two server-side components:
1. 🛡️ Apache Ranger Admin
Purpose: UI-based central security policy administration
✅ What it does:
- Policy Management: Provides a web UI and REST APIs for administrators to define fine-grained access control policies for various data sources (like Hive, Trino, HDFS, Kafka, etc.).
- Audit Logs: Stores and displays audit trails (who accessed what, when, and whether it was allowed or denied).
- Plugins Communication: Serves policies to the Ranger plugins running within data source services (like Trino or Hive). The plugins periodically poll Ranger Admin and cache the policies locally.
- Database Backing: Stores all policies and metadata in a backend database (like MySQL/PostgreSQL).
🔗 Example:
You log into the Ranger Admin UI at http://<ranger-host>:6080 and define:
- A policy allowing user alice to run SELECT queries on the sales schema in Trino.
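The same policy can also be created through the REST API instead of the UI. The sketch below builds a policy document and posts it to Ranger Admin's public v2 policy endpoint. Treat every concrete value as an assumption: the service name, the catalog name, the RANGER_URL, RANGER_USER and RANGER_PASS variables, and the helper names build_select_policy and create_policy are all hypothetical, and the resource keys should be checked against the Trino service definition in your Ranger version.

```shell
#!/bin/sh
# Build a Ranger policy document granting SELECT on one Trino schema to one user.
# Arguments: service name, catalog, schema, user (all example values).
build_select_policy() {
    cat <<EOF
{
  "service": "$1",
  "name": "select-$3-$4",
  "resources": {
    "catalog": { "values": ["$2"] },
    "schema":  { "values": ["$3"] },
    "table":   { "values": ["*"] },
    "column":  { "values": ["*"] }
  },
  "policyItems": [
    {
      "users": ["$4"],
      "accesses": [ { "type": "select", "isAllowed": true } ]
    }
  ]
}
EOF
}

# Post the policy to Ranger Admin. Defined here but not called.
# RANGER_URL, RANGER_USER and RANGER_PASS are placeholders you must set.
create_policy() {
    build_select_policy "$@" | curl -s -u "$RANGER_USER:$RANGER_PASS" \
        -H 'Content-Type: application/json' \
        -X POST "$RANGER_URL/service/public/v2/api/policy" -d @-
}
```

For the example above, the call would look like create_policy trino hive sales alice, where trino and hive stand in for your own service and catalog names.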
2. 👥 Apache Ranger UserSync
Purpose: Keeps Ranger Admin in sync with your organization’s identity system
✅ What it does:
- Fetches users and groups from external identity providers like LDAP/Active Directory, or from the local Unix system.
- Syncs identities into the Ranger Admin database so you can build access control policies based on real users and groups.
- Runs as a background daemon process.
🔗 Why it matters:
Without UserSync, you would have to manually create users and groups in Ranger Admin, which is neither scalable nor secure.
🔄 How They Work Together
+---------------------+        REST/API        +-----------------------+
|  External LDAP/AD   |----------------------->|    Ranger UserSync    |
|                     |                        | (syncs users/groups)  |
+---------------------+                        +-----------------------+
                                                           |
                                                           v
                                                   +--------------+
                                                   | Ranger Admin |
                                                   |   UI + DB    |
                                                   +--------------+
                                                           |
                                    plugins pull policies  |
                                                           v
                                                 +-------------------+
                                                 |   Ranger Plugin   |
                                                 | (Trino, Hive etc) |
                                                 +-------------------+
⚙️ Optional but Recommended
- Ranger Admin is mandatory to manage policies.
- Ranger UserSync is optional but recommended for production deployments using LDAP or Active Directory.
4. Creating a Docker image that integrates Apache Ranger with MySQL and Solr
Extract the built tarball into the ranger-docker/ranger-packages directory:
tar -xvzf ranger/target/ranger-3.0.0-SNAPSHOT-admin.tar.gz -C trino-ranger-k8s/ranger-docker/ranger-packages/
Once done, your directory will look like:
.
├── Dockerfile
├── install.properties
├── ranger-entrypoint.sh
├── ranger-packages
│   └── ranger-3.0.0-SNAPSHOT-admin
└── README.md
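Note that tar -C fails if the destination directory does not exist yet. A small wrapper like the one below creates it first; extract_to is a hypothetical helper name, and the paths in the usage comment are the ones used in this post.

```shell
#!/bin/sh
# Extract a gzip-compressed tarball into a destination directory,
# creating the directory first if it does not exist.
# Returns 1 if the tarball itself is missing.
extract_to() {
    tarball=$1
    dest=$2
    [ -f "$tarball" ] || { echo "no such tarball: $tarball" >&2; return 1; }
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}

# Usage with the paths from this post:
# extract_to ranger/target/ranger-3.0.0-SNAPSHOT-admin.tar.gz \
#     trino-ranger-k8s/ranger-docker/ranger-packages
```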
1. Dockerfile
Defines how the Apache Ranger Admin container is built:
- Uses Ubuntu 20.04
- Installs Java 11 and dependencies
- Copies the built Ranger Admin and MySQL connector
- Prepares for setup using install.properties and the entrypoint script
2. install.properties
Used by setup.sh. Contains:
- DB settings (MySQL host, credentials)
- Solr endpoint
- Service and repo names
- JDBC and Java paths
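If you want to adjust these settings from a script rather than by hand, a small key=value updater is enough. The sketch below is an assumption-laden example: set_prop is a hypothetical helper, and the key names in the usage comments (db_host, audit_store, audit_solr_urls) and the host names are the ones I would expect in Ranger Admin's install.properties, so verify them against the file shipped with your build.

```shell
#!/bin/sh
# Set key=value in a properties file: replace the line if the key
# already exists, otherwise append it. Commented-out keys are left alone.
# Note: awk -v interprets backslash escapes in the value.
set_prop() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp)
    awk -v k="$key" -v v="$value" -F= '
        $1 == k { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage (example values only):
# set_prop install.properties db_host mysql:3306
# set_prop install.properties audit_store solr
# set_prop install.properties audit_solr_urls http://solr:8983/solr/ranger_audits
```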
3. ranger-entrypoint.sh
Container startup script:
- Waits for MySQL
- Runs setup.sh to initialize the DB and configs
- Starts Ranger Admin
- Keeps the container alive
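The "waits for MySQL" step is typically a retry loop around a connectivity check. The sketch below shows one way to write such a loop; it is not the actual ranger-entrypoint.sh, retry is a hypothetical helper, and the nc -z check and the DB_HOST variable in the usage comment are assumptions about how the database is reached.

```shell
#!/bin/sh
# Run a command until it succeeds, up to a maximum number of attempts,
# sleeping a fixed number of seconds between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $i/$attempts failed: $*" >&2
        i=$((i + 1))
        # Only sleep if another attempt is coming.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Usage in an entrypoint (example values only):
# retry 30 2 nc -z "$DB_HOST" 3306 || exit 1
# ./setup.sh
```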
4. ranger-packages/
Where the Ranger Admin tarball is extracted and copied into the container.
Build the Docker Image
Use the below command to build the image:
docker build -t apache-ranger-admin:3.0.0 .
We now have a custom Docker image for Apache Ranger Admin, set up to work with its backend components, MySQL and Solr. In the next post, we'll walk through deploying both Apache Ranger and Trino on a Kubernetes cluster and integrating them seamlessly. Stay tuned!