Ciprian Zavoianu - weblog: Tutorial: Bandwidth reduction

1. The problem

I've recently come across an interesting NP-complete problem :D. It's the bandwidth reduction problem, also known in the field of sparse matrix applications as the bandwidth minimization problem (or BMP in short):

For a given simetric sparse matrix, M(nxn), the problem is to reduce its bandwidth B by permuting rows and columns such as to move all the nonzero elements of M in a band as close as possible to the diagonal.

In other words, the problem consists of transforming through successive row and column permutations this (8x8 input matrix):

into, possibly, this (8x8 output matrix):

Now here's an interesting thing regarding this problem. If you consider the given sparse matrix (the input matrix) and the resulting matrix (the output matrix) as adjacency matrices for some given graphs, this is what you end up with:

The graph associated with the input matrix.

The graph associated with the output matrix.

As you can easily see, the two graph structures are identical, the only thing that is different is the node (vertex) labeling. In other words the bandwidth reduction problem can also be viewed as a graph labeling problem:

Find the node labeling that minimizes the bandwidth B of the adjacency matrix of the graph G(n) , where we can formally define: B=max|Li-Lj|, i,j=1..n and Li is the label of node i, Lj is the label of node j and nodes i and j are adjacent or neighbours (i.e., there is an edge in graph G(n) connecting nodes i and j).

2. The solution

Back in the 60's and 70's many graph algorithms were proposed for "solving" the bandwidth reduction problem. I wrote "solving" beacause the problem is after all NP-complete and none of the algorithms described claimed to find the exact solution(s) regardless of the input matrix, but they were more or less succesfull in finding a relatively good solution in a resonable amount of time (the so called heuristic approach to NP-complete problem solving).
.
Initially the most widely used of these heuristics was the Reverse Cuthill-McKee algorithm (RCM), a modification by Alan George of the original algorithm developed by CutHill and McKee in 1969. In 1976 the GPS algorithm (named after its developers Gibbs, Poole and Stockmeyer) was proposed. On average GPS and RCM find solutions of relative equal quality, but GPS is up to ten times as fast.

One last thing I should remind you before I continue with the description of the RCM algorithm: the degree of a node in a graph is the number of nodes adjacent to it. So here is the RCM algorithm for a given graph G(n).

Step 0: Prepare an empty queue Q and an empty result array R.

Step 1: Select the node in G(n) with the lowest degree (ties are broken arbitrarily) that hasn't previously been inserted in the result array. Let us name it P (for Parent).

Step 2: Add P in the first free position of R.

Step 3: Add to the queue all the nodes adjacent with P in the increasing order of their degree.

Step 4.1: Extract the first node from the queue and examine it. Let us name it C (for Child).

Step 4.2: If C hasn't previously been inserted in R, add it in the first free position and add to Q all the neighbours of C that are not in R in the increasing order of their degree.

Step 5: If Q is not empty repeat from Step 4.1 .

Step 6: If there are unexplored nodes (the graph is not connected) repeat from Step 1 .
Step 7: Reverse the order of the elements in R. Element R[i] is swaped with element R[n+1-i].

The result array will be interpreted like this: R[L] = i means that the new label of node i (the one that had the initial label of i) will be L.

As you can see steps 1-6 describe nothing but a classic Breadth First Search (BFS) with the minor modification that the nodes are explored in the increasing order of their degree. Step 7 is not mandatory, it is the modification introduced by George to the initial algorithm (it has the purpose of further reducing the profile of a matrix, but that is another story). If you don't reverse the result array you end up with the original Cuthill-McKee algorithm.

Here is how the RCM algorithm should perform on the 8x8 input matrix presented at the start of this post:
- At first we must choose a starting node. Nodes initially labeled with 1, 4 and 7 are equally good choices as all of them have the degree of 1.
- Let us start with node 1. We'll add 1 in R and its neighbours in Q in the increasing order of their degree: R=[1]; Q=[5];
- We extract the first element from Q. As 5 isn't in R we add it. We also add all the neighbours of 5, that are not in R, in Q in the increasing order of their degree: R[1,5]; Q=[3];
- We extract the first element from Q. 3 is not in R so we add 3 to R and all the neighbours of 3, that are not in R, to Q in the increasing order of their degree: R=[1,5,3]; Q=[2];
- We extract the first element from Q. 2 is not in R so we add 2 to R and all the neighbours of 2, that are not in R, to Q in the increasing order of their degree: R=[1,5,3,2]; Q=[6,8];

- We extract the first element from Q. 6 is not in R so we add 6 to R and all the neighbours of 6, that are not in R, to Q in the increasing order of their degree: R=[1,5,3,2,6]; Q=[8,8];

- We extract the first element from Q. 8 is not in R so we add 8 to R and all the neighbours of 8, that are not in R, to Q in the increasing order of their degree: R=[1,5,3,2,6,8]; Q=[8];
- We extract the first element from Q. 8 is in R so we do nothing: R=[1,5,3,2,6,8]; Q=[];
- Q is empty so we check the size of R. If the size of R equals 8 then we are done, otherwise there are still unexplored nodes (the graph is not connected). As the size of R is 6 we must restart from Step 1.
- We must again choose a starting node. Nodes labeled 4 and 7 are equally good start solutions as they both have a degree of 1.
- We choose node 4. We'll add 4 in R and its neighbours in Q in the increasing order of their degree: R=[1,5,3,2,6,8,4]; Q=[7];

- We extract the first element from Q. 7 is not in R so we add 7 to R and all the neighbours of 7, that are not in R, to Q in the increasing order of their degree: R=[1,5,3,2,6,8,4,7]; Q=[];

- Q is empty so we check the size of R. The size of R is 8 so all the nodes have been relabeled.
- We reverse R and obtain the solution R=[7,4,8,6,2,3,5,1] with the following interpretation:

If you look closely at the two graph images at the start of the post you can see how the node which had the initial label of 7 will be relabeled with 1, the node which had the initial label of 4 will be relabeled with 2, the node which had the initial label of 8 will be relabeled with 3, and so on.

.
The only thing left now is to build the adjacency matrix of the relabeled graph. You should obtain the 8x8 output matrix presented at the start of this tutorial.

P.S. For a Delphi implementation (source code included) of the Cuthill-McKee and RCM algorithms, as well as a comparison of these methods with a more modern aproach based on a genetic algorithm and simulated annealing please see this post.

P.P.S. Help required / Collaboration opportunity

I have recently (February 2013) been contacted by Dr. Lubomir ONDRIS (lubomir.ondris(at)gmail.com) who is a scientist from Slovakia that has dedicated a lot of effort over the years into studying the bandwidth reduction problem and improving the available algorithms that attempt to solve it.

He created a personal hybrid algorithm that is quite competitive on the Everstine and other mesh collections (i.e., common benchmarks for testing the performance of bandwidth reduction routines) and he would need access to non-artificial, fairly big (>500 nodes) and complicated (inter-connected) meshes (matrices) in order to further test and improve his algorithm. Typically, such matrices are encountered when solving very large equation systems that arise in finite element analysis.

If you can help Dr. Ondris in any way with regard to his data requirements, please contact him directly via the e-mail address provided above. Here is a short description, in Dr. Ondris' own words, of what he requires:

"Again, what I need are parts of standard input data for finite element programs: Finite elements with their node numbers and nodes with their degrees of freedoms. And, of course, the bandwidth and profile achieved by the renumbering routine used."

Hope that you can help & thank you very much!

33 comments:

Anonymous said...: Excelent post! Congratulations! I admit it's been a while since I haven't seen such a clear, objective and thorough explanation of the RCM algorithm.; February 13, 2009 at 5:52 PM
Anonymous said...: Excelent!!! Crystal Clear!! Congratulations.; March 19, 2009 at 9:17 PM
Anonymous said...: this is just perfect!!!!!!!; May 12, 2009 at 4:48 AM
Anonymous said...: I don't know how to create the graph based on the input matrix. If I understand correctly, an incidence matrix maps the vertices to rows and edges to columns, thus each column should have two 1s. How to proceed with such input matrix?; May 22, 2009 at 5:56 AM
Anonymous said...: Excellent, superb work. Thanks a lot; May 22, 2009 at 7:11 PM
Anonymous said...: Incidence matrices map vertices to rows and edges to coloumns. However, the matrix used in this article is the "adjacency-matrix" of the graph.; July 19, 2009 at 4:37 PM
Ciprian said...: @ last anonymous
Yes you are right. My mistake ... Sorry for this one. I've made the corrections.; July 22, 2009 at 5:11 PM
Unknown said...: The steps are very clear! Superub!

We obtain a band matrix

But it is not clear how it helps to find submatrix or cluster...as far i know it helps to find submatrices. Can someone help me understand. Do we have to use any other algorithm to isolate submatrices..; February 15, 2010 at 3:08 PM
Ahmed Samieh said...: Excellent post, clear and simple

thanks Ciprian; June 8, 2010 at 10:02 AM
Razvan Popovici said...: Felicitari, apari in Google pe al doilea loc dupa Wikipedia cand cauti dupa "Cuthill-McKee".

Caut un algoritm asemanator, am o matrice m x n, evident asimetrica, sparse, pe care vreau sa o aduc cat mai aproape de diagonala.; June 25, 2012 at 9:39 PM
Ciprian said...: @Razvan
Merci! Apar atat de sus deoarece nu sunt prea multe informatii online, detaliate, despre algoritmul acesta ...

Daca ai o matrice rara, asimetrica si/sau dreptunghiulara, ai putea incerca o abordare cu metaeuristici (algoritmi genetici, particle swarm optimization, etc.).

Ceva hinturi despre cum ai putea face asta gasesti in postul la care fac referinta in ultimul link al tutorialului. Tot in postul acela, indic un articol stiintific cu mai multe informatii despre astfel de metode de rezolvare a problemei.; June 26, 2012 at 9:02 AM
Anonymous said...: Excellent explanation. Thanks a lot really best stuff online ti understand how RCM works

Absaar; September 19, 2012 at 1:11 PM
Vivek Parekh said...: anybdy with C program for this algo?; October 26, 2012 at 6:09 AM
Ciprian said...: I'm only familiar with the Boost Graph Library for a C++ implementation of the algorithm.; October 26, 2012 at 1:12 PM
Dr. B. K. Thakkar said...: Wonderful. Thanks a lot. I tried the algorithm and it works very well with 2D structures. But fails flat on 3D (space) structures. Could you please shed some light on how to modify the algorithm for 3D?

Thanks again :); December 3, 2012 at 5:18 AM
Ciprian said...: @ Dr. Thakkar
I don't know what modification can make the RCM work with 3D matrices.
However, it's my guess that it shouldn't be very hard to modify the metaheuristic algorithm referenced in the P.S. section of the post such as to cope with 3D structures. Of course, you need to redefine what are the legal "moves" (e.g., interchanging only vectors or only 2D matrices or both) and what is the objective function (i.e., what is the "bandwidth" in the case of a 3D matrix).; December 4, 2012 at 2:03 PM
yasin motcu said...: Good Work,

Thank you; January 7, 2013 at 9:18 PM
SJ said...: Normally you don't really care about the dimension of the matrix. As long as you work directly with the adjacency graph.

Good luck!; January 15, 2013 at 1:54 PM
Anonymous said...: Hi,
Thanks for your explanations on this algorithm.
Awesome work.

Thanks again from France,
Killian.; April 29, 2013 at 10:28 AM
Kintaar said...: Great tutorial! I used this tutorial for software in my most recent blog post.
http://mathformeremortals.wordpress.com/2013/09/02/regularizedata3d-the-excel-spreadsheet-function-to-regularize-3d-data-to-a-smooth-surface/; September 3, 2013 at 3:10 AM
Anonymous said...: Excellent, superb work; September 24, 2013 at 12:13 PM
Anonymous said...: Very good...
easy and clear
thank you; October 4, 2013 at 1:09 AM
Anonymous said...: Great and clear article.
I am implementing It in .net as an educational project.; June 7, 2014 at 10:28 AM
Anonymous said...: Very clear article. Thank you.
If you have time, try to explain us the GPS algorithm.; January 28, 2015 at 10:41 AM
Anonymous said...: Awesome tutorial! Well done!; November 10, 2015 at 1:37 PM
Anonymous said...: Excellent , this is one of the best tuto of RCM algorithm !
Never felt such satisfied; April 5, 2016 at 11:36 PM
moon said...: Thank you very much.It's very clear and easy understanding.; May 30, 2016 at 5:41 AM
Anonymous said...: Nice and helpful.; June 8, 2016 at 6:38 AM
marek said...: Hello and thank you for the explanation. Do you please know about a good implementation of these algorithms? Best would be MATLAB or R? For me it is particularly important that the resulting matrix is "approximately" block diagonal. The function in matlab that does the reverse C-M algorithm just reduces the bandwidth but does no block-diagonalisation (see here). Thank you!; June 9, 2016 at 10:48 PM
laidawoa said...: hello,what's the difference between CM and RCM? And what's the advantage of RCM; November 21, 2016 at 3:26 AM
Ciprian said...: As I mention in the post, as far as I know, CM does not contain Step 7 of the described algorithm (it only consists of the first 6 steps). Step 7 is responsible for the R part of RCM and it's purpose is to further reduce the profile of the obtained matrix.; November 21, 2016 at 10:58 AM
Anonymous said...: Beautiful explanation!; September 16, 2018 at 11:08 PM
Dreamer said...: I got to know the RCM algorithm working procedure but still, i have one doubt regarding choosing a start node to apply the algorithm. Suppose, in graph 4 is the minimum graph degree of 3 nodes. Now among these three, which can start be node? Please help me if anyone knows about this. My research is similar kind of.

Thank you,; October 7, 2018 at 9:27 PM

Ciprian Zavoianu - weblog

January 15, 2009

Tutorial: Bandwidth reduction - The CutHill-McKee Algorithm

33 comments: