|Genre:||Health and Food|
|Published (Last):||4 July 2009|
|PDF File Size:||4.25 Mb|
|ePub File Size:||10.43 Mb|
|Price:||Free* [*Free Regsitration Required]|
APAC: An exact algorithm for retrieving cycles and paths in all kinds of graphs. This paper presents an alorithm for retrieving all paths and all cycles between two vertices in random directed or undirected connected graphs.
This algorithm can be easily implemented and is highly modular; with minor changes it can be adapted to obtain different parameters from the graphs. It is also demonstrated that the complexity of the algorithm increases linearly with the number of paths.
The algorithm can be used in a myriad of applications. Aside from calculating all the paths and cycles in a graph, it can be used to calculate all the paths with length l between two vertices in the graph, as well as a solution to the clique decision problem. Thus, it has applications in computer networks, material science and electric networks, as well as in any problem where it is necessary to know the number of paths not the optimal paths in a directed or undirected connected graph or in multigraphs.
The algorithms currently available in the literature, such as the depth-first search DFS , are unable to solve this type of problems in a straightforward way. Keywords: graph theory, connected graphs, random graphs; network paths, nanofiber networks. This article focuses on the problem of finding all the paths and all the cycles between two vertices in random directed or undirected connected graphs. This information can be used, for example, as a metric in several situations that arise in random network problems.
Finding all paths on a graph implies finding the longest simple path in the graph. With this in mind, a simple and highly modular algorithm was developed, which was termed APAC All Paths And Cycles , and which can be used in all types of relatively dense graphs. A typical solution for finding the shortest paths between all vertices in an unweighted graph is running a breadth-first search for all vertices in the graph.
This optimization problem focuses on finding the k best solutions between two vertices. Aside from that, they rely on sophisticated data structures, which makes them far from trivial to implement. Yet another constrain is that the definition of paths varies from algorithm to algorithm; whereas some find the k- shortest paths with repeated vertices, others define the paths without repeated vertices. A priori, the number of paths is known to be small.
Therefore, an algorithm that scales linearly with the number of the paths in a graph is appropriate for the type of problems describe above. Aside from the obvious application of identifying the paths and cycles in a graph, this paper also indicates two other possible applications of this algorithm:.
Throughout this article, the book of Cormen is used for the terminology and notation not defined here. Definition 1. The set of all edges e ij is E. Definition 2. Definition 3. Definition 4. A path P between vertices is a sequence of edges with no repeated vertices that allows one to proceed in a continuous manner from one vertex to another. The length of a path is its number of edges. Definition 5. A cycle C is a closed path P.
Definition 6. Definition 7. Let G, G 1 , and G 2 be graphs. Definition 8. Definition 9. Theorem 1. In a pioneer paper, Lars-Erik developed an algorithm that solves the problem of finding all paths, as a reachability matrix, in a directed graph.
That algorithm is suitable if the number of edges is lower than the number of nodes. It uses an incidence matrix A as an input and outputs the reachability matrix B.
The first steps of our algorithm are based in this initial work; however, the present work is not looking for the reachability matrix but for all simple paths between any two given vertices. Horst and Kim introduced a new distance metric based on the topology of the graph and formally proved it. One needs two less dense graphs to use the metric in an efficient way. To circumvent this problem one can construct from the initial graph G a subgraph G 1 using the k -shortest paths of G. Thus, using the previously described procedure, the MCS based metric can be applied to two graphs in a relatively fast way.
This is possible because one can control the density of G 1 by varying the k -shortest paths and it is known that the induced subgraph G 2 has a number of edges higher than the vertices but lower than a complete graph. These were chosen by their expected speed of operation in undirected graphs, with no parallel edges, no self-loops and no negative edges.
The numerical study shows that the Lawler algorithm, with O kn 3 for the computational complexity, had the best result. Other references for the same problem can be found in Eppstein, ; Hershberger et al, For the problem of finding the maximum common subgraph, the work by Bunke et al provides a comparative study between a space search based algorithm, and one based on clique detection performed on randomly connected graphs.
As a result of this study, the authors conclude that for low density graphs the best algorithm is the space based one. Thus, let us set a limit to less dense connected graphs, meaning, graphs where the number of edges is larger than the number of vertices but lower than in a complete graph.
Another important aspect is the types of graphs where this algorithm is planned to be implemented. The graphs can be undirected, directed, or even graphs with parallel edges multigraphs. Finally the algorithm must be sufficiently modular; with simple modifications, one can explore other attributes of the paths without having to rewrite large parts of code or the need to employ a different algorithm. The depth-first search DFS approach resembles this algorithm, but as can be seen in the following sections, the APAC algorithm does not need to keep track of all visited vertices, and only stores the feasible paths.
Another important difference is the range of application of the two algorithms, which is quite different. The APAC algorithm retrieves all paths in a graph and the respective cycles , while the DFS is used to transverse directed and undirected graphs but does not retrieve all paths in a graph. Due to that, the DFS is typically used as a subroutine in other algorithms Cormen, In this section are presented some definitions required to understand the APAC algorithm pseudo-code shown in Figure 1.
Lemma 1. From line 9 it is imposed that there are no repeated vertices in the set P. If the condition in line 9 and 10 is fulfilled by the definition of a path definition 4 , u,v is a path. The set P is also unique because, at each step of the algorithm, either the the s,t path is found or the set P is incremented with another edge line As the added edges of the graph are unique there are no repeated vertices , then P is unique.
From Lemma 1, it is known that the set of edges paths retrieved in line 11 is unique. Suppose that for some undirected connected graph G and for any given iteration of the algorithm there exists a set P sequence of vertices such that all vertices that belong to P are unique.
Finally suppose that the condition in line 9 is always fulfilled. If at this point the stack is also empty, the algorithm terminates by the instruction in line 6 , proving c.
For each set P retrieved from the stack in line 7 , the instruction in line 10 will be tested for w vertices associated to that specific set. Continuing with the last argument, and supposing that parameters are imposed such that the condition in line 9 is not fulfilled, then the Out-Cycle is called producing a cycle with length l , which proves b. Here, m is the maximum number of iterations of the while cycle of line 6 , C init is the computational execution cost associated to the first 5 lines, and C 9 is the overall cost in the execution of line 9.
On average, n paths is the average path length multiplied by the number of paths. Based on these assumptions, Equation 1 shows the derivation of the execution time of the APAC algorithm. As can be seen, the algorithm scales linearly with the number of paths, assuming that the number of paths is very large with respect to the average degree, proving d.
The algorithm was tested using the model proposed by Erdos and Renyi for generating a random graph. In their pioneer work, they study the distribution of the minimum and maximum degree in a random graph.
They generate a random graph by initially defining a maximum number of edges and, from an initial set of disconnected vertices, randomly picking one pair of vertices and forming an edge. A similar procedure was followed for generating the graphs undirected. In this method, the number of vertices was fixed to 10, and connected graphs were generated with a random number of edges. The maximum number of edges was set to that of a complete graph.
For each generated graph, the APAC algorithm was applied and it was found that the time complexity agrees with the expected values; see Figure 2. With this simple modification, one can change the way paths appear in the output. Let us turn our attention to the graph represented in Figure 3. Observing Figure 3, the maximum length of a path in the graph can be easily seen to be three and the minimum length to be two.
Another modification that can be used using a queue is adding another restriction in line 6. If the length of the set is higher than some threshold, then stop the while cycle. With the previous modification the algorithm only retrieves the paths of length k, which enables easy finding of k-shortest paths with no costs associated in the edges.
Using the stack procedure ensures that every possible path is exhausted from an initial branch of the graph. In practical terms, it should be pointed out that the initial graph is stored in an adjacency list type data structure and the paths and the cycles are stored as sequences of edges in some user defined data structure. The algorithm presented in Figure 4 is the one implemented. One can easily see that it is the same algorithm presented in section 4.
Note that in section 4. The time bounds are not affected by this small change. For this purpose, the version of APAC with the queue instead of the stack will be employed, and a prescribed path length of 3 edges is imposed the minimum value in this graph. Figure 6. Subgraph G1. Solid lines and nodes represent edges of the subgraph; dashed lines and nodes represent edges from the initial graph shown only for visualization purposes.
The previous subgraph was built with paths of length k retrieved from the graph represented in Figure 5. Note that this graph is also a subgraph of G.
Isomorfismo de grafos