Introduction
Analytical (also called analytic or window) functions are one of the most powerful features of the Oracle database. They allow you to perform complex calculations across a set of rows related to the current row without collapsing rows, which is a major limitation of traditional aggregate functions.
In Oracle 19c, analytical functions are heavily used in Banking, FinTech, Risk, Fraud Detection, Reporting, and Data Analytics systems where ranking, running totals, moving averages, and comparative analysis are everyday requirements.
This blog is industry-oriented, includes real-time use cases, and follows Oracle’s official documentation standards.
1. What is an Analytical Function?
An Analytical Function performs a calculation across a group of rows called a window, while still returning one result per row.
Key Characteristics
- Operates on a set of rows related to the current row
- Does not reduce the number of rows in the result set
- Uses the
OVER()clause - Frequently used for ranking, trend analysis, and comparisons
Simple Example
SELECT employee_id, department_id, salary,
AVG(salary) OVER (PARTITION BY department_id) AS dept_avg_salary
FROM employees;
# Each employee row is preserved, but the department-wise average is calculated.
2. Difference Between Analytical Functions and Aggregate Functions
| Feature | Aggregate Function | Analytical Function |
|---|---|---|
| Number of rows returned | Reduced | Same as input |
| OVER() clause | ❌ Not used | ✅ Mandatory |
| GROUP BY required | ✅ Yes | ❌ No |
| Example | SUM(salary) | SUM(salary) OVER() |
| Use case | Summary reports | Detailed analytics |
Diagram – Row Preservation
Aggregate Function:
-------------------
Input Rows ---> GROUP BY ---> Reduced Rows
Analytical Function:
--------------------
Input Rows ---> OVER() ---> Same Rows + Calculations
Syntax of Analytical Functions
analytic_function (expression)
OVER (
[PARTITION BY partition_expression]
[ORDER BY sort_expression]
[ROWS | RANGE windowing_clause]
)
Syntax Components
- PARTITION BY – Divides data into logical groups
- ORDER BY – Defines row order within the partition
- ROWS / RANGE – Defines the window frame
Using Analytical Functions on a Table (Practical Demo)
Table Design (Oracle 19c Compatible)
CREATE TABLE account_transactions (
txn_id NUMBER PRIMARY KEY,
account_id NUMBER NOT NULL,
txn_date DATE NOT NULL,
txn_type VARCHAR2(10) CHECK (txn_type IN ('CREDIT', 'DEBIT')),
amount NUMBER(12,2) CHECK (amount > 0)
);
INSERT INTO account_transactions VALUES (1, 1001, DATE '2024-01-01', 'CREDIT', 5000);
INSERT INTO account_transactions VALUES (2, 1001, DATE '2024-01-05', 'DEBIT', 1500);
INSERT INTO account_transactions VALUES (3, 1001, DATE '2024-01-10', 'DEBIT', 1000);
INSERT INTO account_transactions VALUES (4, 1002, DATE '2024-01-02', 'CREDIT', 8000);
INSERT INTO account_transactions VALUES (5, 1002, DATE '2024-01-08', 'DEBIT', 2000);
(a) RANK(), DENSE_RANK(), ROW_NUMBER(), FIRST, LAST
These are ranking and positional analytical functions used to order data within partitions.
1. Function name: ROW_NUMBER()
Description:
Assigns a unique sequential number to each row within a partition, based on the specified order. Even if values are the same, numbers will not repeat.
Syntax:
ROW_NUMBER() OVER (
PARTITION BY partition_expression
ORDER BY sort_expression
)
Query example:
SELECT account_id,
txn_date,
amount,
ROW_NUMBER() OVER (
PARTITION BY account_id
ORDER BY txn_date
) AS row_num
FROM account_transactions;
Query explanation:
- Data is partitioned by
account_id - Transactions are ordered by
txn_date - Each transaction gets a unique sequence number
Practical use cases:
- Fetch latest transaction per account
- Remove duplicates
- Pagination logic in reports
2. Function name: RANK()
Description:
Assigns ranking to rows with gaps in rank values when ties occur.
Syntax:
RANK() OVER (
ORDER BY expression
)
Query example:
SELECT account_id,
amount,
RANK() OVER (ORDER BY amount DESC) AS rank_amt
FROM account_transactions;
Query explanation:
- Transactions ranked by amount (highest first)
- If two transactions have same amount → same rank
- Next rank is skipped
Practical use cases:
- Customer ranking by transaction value
- Leaderboards
- Performance comparison reports
3. Function name: DENSE_RANK()
Description:
Similar to RANK() but does not skip ranking numbers when duplicates exist.
Syntax:
DENSE_RANK() OVER (
ORDER BY expression
)
Query example:
SELECT account_id,
amount,
DENSE_RANK() OVER (ORDER BY amount DESC) AS dense_rank_amt
FROM account_transactions;
Query explanation:
- Continuous ranking without gaps
- Useful when rank continuity matters
Practical use cases:
- Tier-based customer segmentation
- Risk classification
- Reward slabs
4. Function name: FIRST and LAST
Description:
Used with KEEP (DENSE_RANK ...) to fetch the first or last value based on ordering.
Syntax:
FIRST_VALUE(expression) OVER (...)
LAST_VALUE(expression) OVER (...)
(Traditional aggregate alternative)
MIN(expr) KEEP (DENSE_RANK FIRST ORDER BY expr)
Practical use cases:
- Identify first/last transaction
- Lifecycle analysis
- Entry and exit point tracking
(b) FIRST_VALUE() and LAST_VALUE()
Function name: FIRST_VALUE()
Description:
Returns the first value in the ordered window frame.
Syntax:
FIRST_VALUE(expression) OVER (
PARTITION BY partition_expression
ORDER BY sort_expression
)
Query example:
SELECT account_id,
txn_date,
amount,
FIRST_VALUE(amount) OVER (
PARTITION BY account_id
ORDER BY txn_date
) AS first_txn_amt
FROM account_transactions;
Query explanation:
- For each account, fetches the first transaction amount
- Value is repeated across rows in the partition
Practical use cases:
- Account opening balance
- First purchase analysis
- Customer onboarding metrics
Function name: LAST_VALUE()
Description:
Returns the last value in the window frame (requires proper framing).
Syntax:
LAST_VALUE(expression) OVER (
PARTITION BY partition_expression
ORDER BY sort_expression
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
Query example:
SELECT account_id,
txn_date,
amount,
LAST_VALUE(amount) OVER (
PARTITION BY account_id
ORDER BY txn_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS last_txn_amt
FROM account_transactions;
Query explanation:
- Explicit frame ensures Oracle looks at the entire partition
- Avoids common
LAST_VALUEmistake
Practical use cases:
- Latest transaction value
- Account closure balance
- End-of-period analysis
(c) LAG() and LEAD()
Function name: LAG()
Description:
Accesses data from a previous row in the same partition.
Syntax:
LAG(expression, offset, default) OVER (
PARTITION BY partition_expression
ORDER BY sort_expression
)
Query example:
SELECT account_id,
txn_date,
amount,
LAG(amount, 1, 0) OVER (
PARTITION BY account_id
ORDER BY txn_date
) AS prev_amount
FROM account_transactions;
Query explanation:
- Retrieves previous transaction amount
- Default value is
0for first row
Practical use cases:
- Month-over-month comparison
- Fraud spike detection
- Trend analysis
Function name: LEAD()
Description:
Accesses data from a future row.
Syntax:
LEAD(expression, offset, default) OVER (...)
Query example:
SELECT account_id,
txn_date,
amount,
LEAD(amount) OVER (
PARTITION BY account_id
ORDER BY txn_date
) AS next_amount
FROM account_transactions;
Query explanation:
- Compares current and next transaction
- Useful for gap or sequence checks
Practical use cases:
- Predictive modeling
- Transaction sequencing
- SLA breach detection
(d) LISTAGG()
Function name: LISTAGG()
Description:
Concatenates values from multiple rows into a single delimited string.
Syntax:
LISTAGG(expression, delimiter)
WITHIN GROUP (ORDER BY expression)
OVER (PARTITION BY partition_expression)
Query example:
SELECT account_id,
LISTAGG(txn_type, ', ')
WITHIN GROUP (ORDER BY txn_date)
OVER (PARTITION BY account_id) AS txn_history
FROM account_transactions;
Query explanation:
- Builds transaction history per account
- Order preserved by transaction date
Practical use cases:
- Audit trails
- Customer summaries
- Reporting & dashboards
(e) Top-N Queries (Analytical Approach)
Function name: Top-N using ROW_NUMBER()
Description:
Fetches top records per group without subqueries.
Syntax:
ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...)
Query example:
SELECT *
FROM (
SELECT account_id,
amount,
ROW_NUMBER() OVER (
PARTITION BY account_id
ORDER BY amount DESC
) AS rn
FROM account_transactions
)
WHERE rn <= 2;
Query explanation:
- Ranks transactions per account
- Filters top 2 transactions
Practical use cases:
- Top spenders
- Highest risk transactions
- Best-performing products
(f) QUERY_PARTITION_CLAUSE (PARTITION BY)
Concept name: PARTITION BY Clause
Description:
Defines logical subsets of data for analytical processing.
Syntax:
OVER (PARTITION BY expression)
Query example:
SELECT account_id,
amount,
SUM(amount) OVER (PARTITION BY account_id) AS total_amt
FROM account_transactions;
Query explanation:
- Each account treated independently
- Aggregation without row loss
Practical use cases:
- Customer-wise metrics
- Department-wise KPIs
- Segmented analytics
Other example on HR Schema
1. ROW_NUMBER()
- The ROW_NUMBER() function assigns a unique number to each row based on the specified order.
- ROW_NUMBER function ignores ties and always gives a unique number to each row.
SELECT employee_id, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS ranking
FROM employees;
2. RANK()
- The RANK() function assigns a rank to each row based on the specified order.
- Rows with equal values receive the same rank, and the next rank is skipped.
SELECT department_id, salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS department_rank
FROM employees;
3. DENSE_RANK()
The DENSE_RANK() function is similar to RANK(), but it does not skip rank values for rows with equal values.
SELECT department_id, salary,
DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS department_dense_rank
FROM employees;
4. LAG() and LEAD()
The LAG() function allows you to access the value of a previous row, and LEAD() allows you to access the value of the next row.
SELECT employee_id, salary,
LAG(salary, 1, 0) OVER (ORDER BY employee_id) AS prev_salary,
LEAD(salary, 1, 0) OVER (ORDER BY employee_id) AS next_salary
FROM employees;
5. SUM() and AVG() Over a Window
You can use SUM() and AVG() over a specified window of rows.
SELECT department_id, salary,
SUM(salary) OVER (PARTITION BY department_id) AS department_salary_sum,
AVG(salary) OVER (PARTITION BY department_id) AS department_salary_avg
FROM employees;
6. FIRST_VALUE() and LAST_VALUE()
FIRST_VALUE() returns the value of the specified expression for the first row in the window, and LAST_VALUE() returns the value for the last row.
SELECT department_id, salary,
FIRST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY salary) AS first_salary,
LAST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY salary) AS last_salary
FROM employees;
The PARTITION BY clause divides the result set into partitions to which the function is applied, and the ORDER BY clause specifies the order of rows within each partition.
PERCENT_RANK()
The PERCENT_RANK() function in Oracle SQL is an analytical function that calculates the relative rank of a row within a partition of a result set. It returns a decimal value between 0 and 1 representing the percentile ranking of a row.
Formula:
PERCENT_RANK= Rank − 1 / Total Rows−1
- The lowest rank (first row in the partition) gets 0.
- The highest rank gets a value close to 1 (but never exactly 1).
- If there is only one row in the partition, PERCENT_RANK returns 0.
Syntax:
PERCENT_RANK() OVER (PARTITION BY column_name ORDER BY column_name)
- PARTITION BY: (Optional) Divides the result set into partitions before applying the function.
- ORDER BY: Defines the ranking order.
Example Usage
Example 1: PERCENT_RANK Without Partitioning
SELECT emp_id, emp_name, salary,
PERCENT_RANK() OVER (ORDER BY salary) AS percent_rank
FROM employees;
Output:
| emp_id | emp_name | salary | percent_rank |
| 101 | Alice | 30,000 | 0.0000 |
| 102 | Bob | 35,000 | 0.2500 |
| 103 | Charlie | 40,000 | 0.5000 |
| 104 | David | 50,000 | 1.0000 |
Here, Alice gets 0.0000 because she has the lowest salary, and David gets 1.0000 since he has the highest salary.
Example 2: PERCENT_RANK With Partitioning
SELECT department_id, emp_id, emp_name, salary,
PERCENT_RANK() OVER (PARTITION BY department_id ORDER BY salary) AS percent_rank
FROM employees;
Here, the function calculates the percentile rank separately for each department_id.
Interview Questions on Analytical Functions (MNC Level)
- Difference between
RANK()andROW_NUMBER()? - Can analytical functions be used in
WHEREclause? - Explain
ROWSvsRANGE - How do you calculate running totals?
- Use
LAG()to compare month-over-month data - Performance impact of analytical functions
- Analytical vs Aggregate function use cases
Practice Questions on Analytical Functions
- Write a query using the ROW_NUMBER() function to assign unique row numbers to each record in a table.
- Explain the difference between the RANK() and DENSE_RANK() functions. Provide examples to illustrate their usage.
- Create a query that uses the LAG() function to display the previous value of a column alongside the current value.
- Use the LEAD() function to find the next highest salary for each employee in a table.
- Write a query to calculate the running total of a numeric column using the SUM() window function.
- Implement the FIRST_VALUE() function to find the first order date for each customer in a sales table.
- Create a query that uses the LAST_VALUE() function to find the last login date for each user in a user activity log.
- Explain the concept of a window in the context of Analytical Functions. Provide examples to demonstrate window specifications.
- Use the PERCENTILE_CONT() function to find the median salary in a salary table.
- Write a query to find the cumulative percentage of total sales using the PERCENT_RANK() function.
- Explain the purpose of the CUME_DIST() function and provide an example of its usage.
- Write a query to find the difference between the current row’s value and the next row’s value using the LAG() function.
- Implement a query that utilizes the NVL() function along with Analytical Functions to handle NULL values appropriately.
- Use the LISTAGG() function to concatenate the names of employees in each department, separated by commas.
- Write a query to find the top 5 highest-paid employees using the DENSE_RANK() function.
- Create a query that utilizes the LEAD() function to find the time difference between consecutive log entries in a log table.
