Graph Representations and Traversals in Java: A Refresher for Data Engineers 

As data engineers, we often deal with complex datasets that can be modeled using graphs. Graphs are powerful data structures that allow us to represent and analyze relationships between entities, making them invaluable tools in various domains, such as social networks, recommendation systems, and data lineage tracking.

In this post, we'll revisit graph representations and traversals in Java, focusing on libraries and functionalities available in Java 8 or later. We'll also explore some real-world scenarios where graphs can be particularly useful for data engineers.

Graph Representations in Java 

There are two primary ways to represent graphs in Java:

// Example adjacency matrix for a simple graph

int[][] adjMatrix = {

    {0, 1, 0, 0},

    {1, 0, 1, 1},

    {0, 1, 0, 0},

    {0, 1, 0, 0}

};

// Example adjacency list for the same graph

List<List<Integer>> adjList = new ArrayList<>();

adjList.add(Arrays.asList(1));

adjList.add(Arrays.asList(0, 2, 3));

adjList.add(Arrays.asList(1));

adjList.add(Arrays.asList(1));

Java Libraries for Graphs (Java 8+):

Graph Traversals 

In Java Traversals involve systematically visiting all vertices (or edges) in a graph. Here are two common approaches:

// Recursive DFS implementation using JCF

public void dfs(int vertex, boolean[] visited, List<List<Integer>> adjList) {

    visited[vertex] = true;

    System.out.print(vertex + " ");

    for (int neighbor : adjList.get(vertex)) {

        if (!visited[neighbor]) {

            dfs(neighbor, visited, adjList);

        }

    }

}

// BFS implementation using a Queue (java.util.Queue)

public void bfs(int vertex, boolean[] visited, List<List<Integer>> adjList) {

    Queue<Integer> queue = new LinkedList<>();

    queue.add(vertex);

    visited[vertex] = true;

    while (!queue.isEmpty()) {

        vertex = queue.poll();

        System.out.print(vertex + " ");

        for (int neighbor : adjList.get(vertex)) {

            if (!visited[neighbor]) {

                queue.add(neighbor);

                visited[neighbor] = true;

            }

        }

    }

}

Java 8 Streams (Optional): Java 8 streams can be used for concise graph traversal implementations, especially for simple operations.

LeetCode Exercises for Practice

Here are some LeetCode problems to solidify your understanding of graphs in Java:

Easy:

Medium:

Applications of Graphs for Data Engineers:

Conclusion

 Graphs are powerful tools for data engineers, offering a structured way to represent relationships within complex datasets. By understanding graph representations like adjacency matrices and adjacency lists, along with traversal techniques like DFS and BFS, you'll be well-equipped to tackle various graph-based problems in big data applications. Remember, Java 8 and later offer several libraries and functionalities to simplify graph manipulation.

The provided LeetCode exercises serve as a great starting point to solidify your understanding and practice your graph traversal skills. Don't hesitate to explore further resources and delve into more complex graph algorithms like Dijkstra's algorithm for shortest paths or topological sorting for dependency management. As you continue practicing, you'll unlock the full potential of graphs in your data engineering endeavors, from data lineage tracking to social network analysis and recommendation systems.