# Data Collector

Let's design a flexible and extensible solution to extract various data points from different sources or formats.

{% hint style="info" %}
**Strategy Pattern**: Encapsulate data collection algorithms in separate classes (Data Collectors).

**Pipeline Pattern**: Break down the data extraction process into smaller, independent steps (each Data Collector). Process data sequentially for clarity and maintainability.
{% endhint %}

Create a `DataCollector` interface with a single method, `collect`, that takes the input object and the output object to be updated.

***DataCollector.java***

```java
public interface DataCollector {
    void collect(InputData inputData, OutputData outputData);
}
```

Create several implementations of the interface, for example a header collector, a token collector, and a request collector.

***HeaderDataCollector.java*** - Collects headers and sets them on the output

```java
import org.springframework.stereotype.Component;

@Component
public class HeaderDataCollector implements DataCollector {
    @Override
    public void collect(InputData inputData, OutputData outputData) {
        var headers = inputData.getHeaders();
        outputData.setHeaderHost(headers.get("x-host"));
        outputData.setHeaderTraceId(headers.get("x-trace-id"));
        outputData.setHeaderSpanId(headers.get("x-span-id"));
    }
}
```
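HTTP header names are case-insensitive, so the lookups above depend on how the map returned by `getHeaders()` stores its keys. The original does not say which `Map` implementation is used; the sketch below assumes a plain `Map<String, String>` and introduces a hypothetical helper, `HeaderMaps.caseInsensitive`, that copies the headers into a map whose lookups ignore case.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the original example.
final class HeaderMaps {

    private HeaderMaps() {
    }

    // Copies the headers into a TreeMap that compares keys case-insensitively,
    // so get("x-trace-id") also finds a header stored as "X-Trace-Id".
    static Map<String, String> caseInsensitive(Map<String, String> headers) {
        Map<String, String> copy = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        copy.putAll(headers);
        return copy;
    }
}
```

If `InputData` already normalizes header names, or the web framework hands over a case-insensitive structure, this step is unnecessary.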

***TokenDataCollector.java*** - Collects token metadata and sets it on the output

```java
import org.springframework.stereotype.Component;

@Component
public class TokenDataCollector implements DataCollector {
    @Override
    public void collect(InputData inputData, OutputData outputData) {
        // Add logic here to extract the claims from the JWT and set the values accordingly.

        // Placeholder values are used below.
        outputData.setUserId("claim.userid");
        outputData.setUserIp("claim.ip");
        outputData.setUserName("claim.subject");
        outputData.setUserEmailId("claim.emailid");
        outputData.setUserPreferredLanguage("claim.language");
    }
}
```
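The claim extraction left as a comment above could look roughly like the following. This is only a sketch using the JDK: the class `JwtClaimReader`, its method names, and the claim name `sub` are assumptions, not part of the original example. It decodes the payload segment of a JWT and reads string claims with a naive regular expression. It does not verify the signature, so production code would normally rely on a JWT library and a JSON parser instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original example.
final class JwtClaimReader {

    private JwtClaimReader() {
    }

    // Decodes the payload (second dot-separated segment) of a JWT into JSON text.
    // NOTE: the signature is NOT verified here.
    static String decodePayload(String jwt) {
        String[] parts = jwt.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected header.payload.signature");
        }
        byte[] json = Base64.getUrlDecoder().decode(parts[1]);
        return new String(json, StandardCharsets.UTF_8);
    }

    // Naive lookup of a string-valued claim by name.
    // It ignores nesting and escaped quotes; a JSON parser is the safer choice.
    static Optional<String> stringClaim(String payloadJson, String name) {
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher matcher = pattern.matcher(payloadJson);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

Inside `TokenDataCollector`, a call such as `JwtClaimReader.stringClaim(payload, "sub").orElse(null)` could then replace the placeholder passed to `setUserName`, assuming the token itself is available from `InputData`.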

***HttpRequestDataCollector.java*** - Collects HTTP request parameters and sets them on the output

```java
import org.springframework.stereotype.Component;

@Component
public class HttpRequestDataCollector implements DataCollector {
    @Override
    public void collect(InputData inputData, OutputData outputData) {
        outputData.setCookie(inputData.getCookie());
        outputData.setHttpType(inputData.getHttpType());
        outputData.setRequest(inputData.getRequest());
        outputData.setResponse(inputData.getResponse());
        outputData.setUrl(inputData.getUrl());
    }
}
```

Next, create a pipeline class that runs each of the implementations above in turn to extract data and set it on the output.

***DataCollectorPipeline.java***

```java
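import java.util.List;

import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Component;

@RequiredArgsConstructor
@Component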
public class DataCollectorPipeline {

    private final List<DataCollector> dataCollectors;

    public void collect(InputData inputData, OutputData outputData) {
        dataCollectors.forEach(collector -> collector.collect(inputData, outputData));
    }
}
```

{% hint style="info" %}
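Lombok's `@RequiredArgsConstructor` generates a constructor for the `final` field, and Spring uses that constructor to inject every `DataCollector` bean into the `dataCollectors` list. The collectors run in list order; if the sequence matters, it can be controlled with Spring's `@Order` annotation on each collector.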
{% endhint %}

The pipeline can then be called from a service class, as shown below.

***DataService.java***

```java
import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Service;

@RequiredArgsConstructor
@Service
public class DataService {

    private final DataCollectorPipeline dataCollectorPipeline;

    public OutputData extractData(InputData inputData) {
        var outputData = new OutputData();
        dataCollectorPipeline.collect(inputData, outputData);
        return outputData;
    }
}
```
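To see the pieces work together without a Spring context, the following self-contained sketch wires the pipeline by hand. `InputData` and `OutputData` are not defined on this page, so the trimmed versions here (two fields each) are assumptions made purely for illustration, and the collectors are cut down to match.

```java
import java.util.List;
import java.util.Map;

// Trimmed, hypothetical stand-ins for the real InputData and OutputData classes.
final class InputData {
    private final Map<String, String> headers;
    private final String url;

    InputData(Map<String, String> headers, String url) {
        this.headers = headers;
        this.url = url;
    }

    Map<String, String> getHeaders() { return headers; }
    String getUrl() { return url; }
}

final class OutputData {
    private String headerTraceId;
    private String url;

    String getHeaderTraceId() { return headerTraceId; }
    void setHeaderTraceId(String headerTraceId) { this.headerTraceId = headerTraceId; }
    String getUrl() { return url; }
    void setUrl(String url) { this.url = url; }
}

interface DataCollector {
    void collect(InputData inputData, OutputData outputData);
}

// Strategy 1: copies a header value to the output.
final class HeaderDataCollector implements DataCollector {
    @Override
    public void collect(InputData inputData, OutputData outputData) {
        outputData.setHeaderTraceId(inputData.getHeaders().get("x-trace-id"));
    }
}

// Strategy 2: copies a request parameter to the output.
final class HttpRequestDataCollector implements DataCollector {
    @Override
    public void collect(InputData inputData, OutputData outputData) {
        outputData.setUrl(inputData.getUrl());
    }
}

// Pipeline: runs every collector, in list order, against the same output object.
final class DataCollectorPipeline {
    private final List<DataCollector> dataCollectors;

    DataCollectorPipeline(List<DataCollector> dataCollectors) {
        this.dataCollectors = List.copyOf(dataCollectors);
    }

    void collect(InputData inputData, OutputData outputData) {
        dataCollectors.forEach(collector -> collector.collect(inputData, outputData));
    }
}
```

Passing `List.of(new HeaderDataCollector(), new HttpRequestDataCollector())` to the pipeline constructor and calling `collect` fills both fields of the same `OutputData` instance. Adding a new data point then means writing one more `DataCollector` and adding it to the list, without touching the pipeline or the existing collectors.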

