> For the complete documentation index, see [llms.txt](https://www.pranaypourkar.co.in/the-programmers-guide/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://www.pranaypourkar.co.in/the-programmers-guide/database/sql-databases/sql-fundamentals/sql-clauses/connect-by.md).

# CONNECT BY

{% hint style="info" %}
The `CONNECT BY` clause is specific to Oracle and is a feature unique to Oracle Database for handling hierarchical queries. However, other relational database management systems (RDBMS) have their own ways of dealing with hierarchical data, often using Common Table Expressions (CTEs) to achieve similar results.

For hierarchical queries, the pseudocolumn LEVEL starts with 1 (for queries not using CONNECT BY, LEVEL is 0, unless we are on release 10g and later when LEVEL is available only when using CONNECT BY) and increments by one after each evaluation (for each level of depth in the hierarchy).
{% endhint %}

## Description

The `CONNECT BY` clause in Oracle is used for hierarchical queries, which are queries that deal with data that has a parent-child relationship, such as organizational structures, bill of materials, or tree-structured data. It allows to traverse the hierarchy and generate a result set that reflects the hierarchical structure of the data.

## Components of `CONNECT BY`

1. **PRIOR**: This operator specifies the parent-child relationship in the hierarchy. It is used to indicate the column that holds the reference to the parent row.
2. **LEVEL**: A pseudo-column that returns the level number of a node in a tree structure. The root level is 1, the next level is 2, and so on.
3. **START WITH**: This clause specifies the root of the hierarchy, i.e., the starting point of the query.
4. **SYS\_CONNECT\_BY\_PATH**: A function that returns the path of a node from the root.
5. **CONNECT\_BY\_ROOT** and **CONNECT\_BY\_ISLEAF:** `CONNECT_BY_ROOT` and `CONNECT_BY_ISLEAF` are hierarchical query pseudocolumns used with the `CONNECT BY` clause to navigate and manipulate hierarchical data structures, such as organizational charts or parts explosions. These pseudocolumns are especially useful when working with self-referential tables where a row refers to another row in the same table.

## Syntax

```sql
SELECT column_list
FROM table_name
START WITH condition
CONNECT BY PRIOR parent_column = child_column;
```

### CONNECT\_BY\_ROOT

The `CONNECT_BY_ROOT` pseudocolumn is used to return the root row of the hierarchy from the current row. This means it retrieves the starting point (root) of the hierarchy for each row in the result set.

```sql
CONNECT_BY_ROOT column_name
```

### CONNECT\_BY\_ISLEAF

The `CONNECT_BY_ISLEAF` pseudocolumn is used to indicate whether a row is a leaf node in the hierarchy. A leaf node is a node that does not have any children

```sql
CONNECT_BY_ISLEAF
```

## Examples

### Employee and Manager

Consider a table `employees` with the following structure:

| EMPLOYEE\_ID | NAME | MANAGER\_ID |
| ------------ | ---- | ----------- |
| 1            | John | NULL        |
| 2            | Mike | 1           |
| 3            | Sara | 1           |
| 4            | Kate | 2           |
| 5            | Paul | 2           |

In this table:

* `EMPLOYEE_ID` is the unique identifier for each employee (child).
* `MANAGER_ID` is the unique identifier for the manager of each employee (parent).

#### Hierarchical Query Using `CONNECT BY`

To retrieve the hierarchy starting from the top-level manager:

```sql
SELECT EMPLOYEE_ID, NAME, LEVEL,
       SYS_CONNECT_BY_PATH(NAME, '->') AS PATH
FROM employees
START WITH MANAGER_ID IS NULL
CONNECT BY PRIOR EMPLOYEE_ID = MANAGER_ID
```

#### Explanation:

* **SELECT EMPLOYEE\_ID, NAME, LEVEL, SYS\_CONNECT\_BY\_PATH(NAME, '->') AS PATH**: Selects the employee ID, name, hierarchical level, and the path from the root.
* **FROM employees**: Specifies the source table.
* **START WITH MANAGER\_ID IS NULL**: Specifies the root of the hierarchy (employees with no manager).
* **CONNECT BY PRIOR EMPLOYEE\_ID = MANAGER\_ID**: Establishes the parent-child relationship. Each employee's manager is specified by `MANAGER_ID`.

{% hint style="info" %}
**Step-by-step breakdown of the hierarchy**

1. Start with the root node(s) where `MANAGER_ID IS NULL` (John).
2. For each row, find rows where `MANAGER_ID` matches the `EMPLOYEE_ID` of the current row.
3. Continue this process recursively to build the hierarchy.

**Setting Up the Hierarchical Query**

To establish the relationship that each employee reports to their manager, we want to specify that:

* **The parent row’s `EMPLOYEE_ID` should match the child row’s `MANAGER_ID`**.

This translates to:

* **PRIOR EMPLOYEE\_ID = MANAGER\_ID**
  {% endhint %}

#### Result:

| EMPLOYEE\_ID | NAME | LEVEL | PATH               |
| ------------ | ---- | ----- | ------------------ |
| 1            | John | 1     | ->John             |
| 2            | Mike | 2     | ->John->Mike       |
| 3            | Sara | 2     | ->John->Sara       |
| 4            | Kate | 3     | ->John->Mike->Kate |
| 5            | Paul | 3     | ->John->Mike->Paul |

**LEVEL**: Indicates the depth of the current row in the hierarchy. The root node has a `LEVEL` of 1, its children have a `LEVEL` of 2, and so on.

**PRIOR**: Used to refer to the parent row in the hierarchy. In `PRIOR parent_column = child_column`, `PRIOR` refers to the value in the parent row.

**START WITH**: Specifies the root row(s) of the hierarchy. If omitted, all rows are considered roots, which is usually not desired.

**SYS\_CONNECT\_BY\_PATH**: Returns the path from the root to the current row, which is useful for displaying the hierarchy as a string.

### Print difference of 2 dates into rows

Given 2 dates, print all the dates within those 2 dates including itself.

```
SELECT 
    TO_DATE(level - 1 + TO_CHAR(TO_DATE('2024-05-28', 'YYYY-MM-DD'), 'YYYYDDD'), 'YYYYDDD') AS calendar_date
  FROM DUAL
  CONNECT BY LEVEL <= TRUNC(TO_DATE('2024-05-31', 'YYYY-MM-DD')) - TRUNC(TO_DATE('2024-05-28', 'YYYY-MM-DD')) + 1
```

<figure><img src="/files/v944s53NK1tMxureliYK" alt="" width="131"><figcaption></figcaption></figure>

### Display each employee along with their root manager

```
SELECT
    employee_id,
    employee_name,
    manager_id,
    CONNECT_BY_ROOT employee_name AS root_manager
FROM
    employees
CONNECT BY
    PRIOR employee_id = manager_id
START WITH
    manager_id IS NULL;
```

### Display whether each employee is a leaf node in the hierarchy

```
SELECT
    employee_id,
    employee_name,
    manager_id,
    CONNECT_BY_ISLEAF AS is_leaf
FROM
    employees
CONNECT BY
    PRIOR employee_id = manager_id
START WITH
    manager_id IS NULL;
```


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://www.pranaypourkar.co.in/the-programmers-guide/database/sql-databases/sql-fundamentals/sql-clauses/connect-by.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
