Write a parallel algorithm for matrix multiplication

Parallel matrix multiplication in Java

This chapter discusses matrix multiplication implemented on communication networks such as the mesh and the hypercube. A p-dimensional mesh network with k^p nodes has a diameter of p(k-1). An n-dimensional hypercube is also known as an n-cube or an n-dimensional cube.

An important cost measure is the amount of data moved. On a single machine this is the amount of data transferred between RAM and cache, while on a distributed-memory multi-node machine it is the amount transferred between nodes; in either case it is called the communication bandwidth. When partial results are spread across processors, the result submatrices are generated by performing a reduction over each row.

The matrices A and B, with elements a_ij and b_ij respectively, are arranged so that every processor holds a pair of elements to multiply. The elements of matrix A move left and the elements of matrix B move upward. These changes in position present each processing element (PE) with a new pair of values to multiply.
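The shifting scheme described above (A moving left, B moving upward) closely resembles what is usually called Cannon's algorithm. The following is a minimal sequential simulation of it in Java, not a distributed implementation. It assumes square n x n matrices, one element of A and one of B per processing element, wraparound links at the mesh edges, and an initial alignment step; the class and method names (CannonSketch, multiply) are made up for this illustration.

```java
import java.util.Arrays;

class CannonSketch {
    // Simulates an n x n wraparound mesh where PE (i, j) holds one element of A and one of B.
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        double[][] pa = new double[n][n]; // element of A currently held by each PE
        double[][] pb = new double[n][n]; // element of B currently held by each PE
        double[][] c = new double[n][n];  // running sum kept by each PE

        // Assumed initial alignment: row i of A is shifted left by i positions,
        // column j of B is shifted up by j positions.
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                pa[i][j] = a[i][(j + i) % n];
                pb[i][j] = b[(i + j) % n][j];
            }
        }

        for (int step = 0; step < n; step++) {
            // Every PE multiplies the pair it currently holds and accumulates.
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    c[i][j] += pa[i][j] * pb[i][j];
                }
            }
            // A moves one position left, B moves one position up (with wraparound).
            double[][] nextA = new double[n][n];
            double[][] nextB = new double[n][n];
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    nextA[i][j] = pa[i][(j + 1) % n];
                    nextB[i][j] = pb[(i + 1) % n][j];
                }
            }
            pa = nextA;
            pb = nextB;
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // prints [[19.0, 22.0], [43.0, 50.0]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

After the alignment and t shifts, PE (i, j) holds a_ik and b_kj with k = (i + j + t) mod n, so n steps visit every k exactly once.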

In fork-join style parallel code, fork is a keyword that signals that a computation may be run in parallel with the rest of the function call, while join waits for all previously "forked" computations to complete.
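The fork and join operations map fairly directly onto Java's java.util.concurrent fork/join framework. The sketch below is one possible shared-memory version, under the assumption that the work is split by rows of the result; the class name ForkJoinMatMul and the THRESHOLD value are choices made for this example, not part of any standard algorithm.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class ForkJoinMatMul extends RecursiveAction {
    // Row ranges at or below this size are computed sequentially (arbitrary illustrative value).
    private static final int THRESHOLD = 16;

    private final double[][] a;
    private final double[][] b;
    private final double[][] c;
    private final int rowStart;
    private final int rowEnd;

    ForkJoinMatMul(double[][] a, double[][] b, double[][] c, int rowStart, int rowEnd) {
        this.a = a;
        this.b = b;
        this.c = c;
        this.rowStart = rowStart;
        this.rowEnd = rowEnd;
    }

    @Override
    protected void compute() {
        if (rowEnd - rowStart <= THRESHOLD) {
            // Small enough: compute rows rowStart..rowEnd-1 of C directly.
            for (int i = rowStart; i < rowEnd; i++) {
                for (int k = 0; k < b.length; k++) {
                    double aik = a[i][k];
                    for (int j = 0; j < b[0].length; j++) {
                        c[i][j] += aik * b[k][j];
                    }
                }
            }
            return;
        }
        int mid = (rowStart + rowEnd) >>> 1;
        ForkJoinMatMul top = new ForkJoinMatMul(a, b, c, rowStart, mid);
        ForkJoinMatMul bottom = new ForkJoinMatMul(a, b, c, mid, rowEnd);
        top.fork();       // fork: the top half may run in parallel
        bottom.compute(); // the bottom half runs in the current thread
        top.join();       // join: wait for the forked half to complete
    }

    static double[][] multiply(double[][] a, double[][] b) {
        double[][] c = new double[a.length][b[0].length];
        ForkJoinPool.commonPool().invoke(new ForkJoinMatMul(a, b, c, 0, a.length));
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // This input is below THRESHOLD, so it runs sequentially; larger inputs are split.
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

The two halves write to disjoint rows of C, so no locking is needed between the forked tasks.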

Mesh Network

A topology in which a set of nodes forms a p-dimensional grid is called a mesh topology. For multiplication on a mesh, arrange the matrices A and B so that every processor has a pair of elements to multiply; each change in the position of the elements of A and B then gives every processor a new pair of values.

Mesh and hypercube networks have higher connectivity than networks such as the ring, so they allow faster algorithms. In the 2D algorithm, each processor is responsible for one submatrix of C.
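The idea that each processor owns one submatrix of C can be illustrated on a single machine by letting each parallel task stand in for one processor of a q x q grid. This is only a sketch of the ownership pattern, assuming square matrices whose size n is divisible by q; it does not model the communication a real 2D algorithm would perform, and the names (BlockOwnerMatMul, multiply, q) are invented for the example.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class BlockOwnerMatMul {
    // Each of the q * q parallel tasks plays the role of one processor and
    // computes exactly one (n/q) x (n/q) block of C.
    static double[][] multiply(double[][] a, double[][] b, int q) {
        int n = a.length;
        if (n % q != 0) {
            throw new IllegalArgumentException("n must be divisible by q");
        }
        int blockSize = n / q;
        double[][] c = new double[n][n];

        IntStream.range(0, q * q).parallel().forEach(p -> {
            int blockRow = p / q; // position of this "processor" in the grid
            int blockCol = p % q;
            for (int i = blockRow * blockSize; i < (blockRow + 1) * blockSize; i++) {
                for (int j = blockCol * blockSize; j < (blockCol + 1) * blockSize; j++) {
                    double sum = 0;
                    for (int k = 0; k < n; k++) {
                        sum += a[i][k] * b[k][j];
                    }
                    c[i][j] = sum; // only this task ever writes to this block
                }
            }
        });
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // With q = 2, each of the four tasks owns a single element of C.
        System.out.println(Arrays.deepToString(multiply(a, b, 2)));
    }
}
```

Because the blocks of C are disjoint, the tasks need no synchronization on the output; in a distributed setting the cost would instead come from delivering the required rows of A and columns of B to each processor.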

A matrix is a set of numerical and non-numerical data arranged in a fixed number of rows and columns.

Bisection width: the minimum number of edges that must be removed to divide a mesh network into two halves.
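For a small 2D mesh, both the diameter and one balanced cut can be checked by brute force. The sketch below assumes a k x k mesh without wraparound links and an even k; it measures the diameter with breadth-first search and counts the edges crossing a straight cut down the middle. That middle cut is only one candidate bisection, although for this kind of mesh its size matches the commonly quoted bisection width of k. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

class MeshMetrics {
    // Longest shortest path in a k x k mesh without wraparound, via BFS from every node.
    static int diameter(int k) {
        int best = 0;
        for (int start = 0; start < k * k; start++) {
            int[] dist = new int[k * k];
            Arrays.fill(dist, -1);
            dist[start] = 0;
            ArrayDeque<Integer> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                int u = queue.poll();
                best = Math.max(best, dist[u]);
                int r = u / k;
                int col = u % k;
                int[][] neighbours = {{r - 1, col}, {r + 1, col}, {r, col - 1}, {r, col + 1}};
                for (int[] nb : neighbours) {
                    if (nb[0] < 0 || nb[0] >= k || nb[1] < 0 || nb[1] >= k) {
                        continue; // outside the mesh: no wraparound assumed
                    }
                    int v = nb[0] * k + nb[1];
                    if (dist[v] == -1) {
                        dist[v] = dist[u] + 1;
                        queue.add(v);
                    }
                }
            }
        }
        return best;
    }

    // Number of edges crossing a vertical cut between columns k/2 - 1 and k/2 (k assumed even).
    static int middleCutEdges(int k) {
        int cut = 0;
        for (int r = 0; r < k; r++) {
            for (int col = 0; col + 1 < k; col++) {
                boolean leftSide = col < k / 2;
                boolean neighbourLeftSide = (col + 1) < k / 2;
                if (leftSide != neighbourLeftSide) {
                    cut++; // horizontal edge (r, col)-(r, col + 1) crosses the cut
                }
            }
        }
        // Vertical edges stay within one column, so they never cross this cut.
        return cut;
    }

    public static void main(String[] args) {
        int k = 4;
        System.out.println("diameter = " + diameter(k));          // prints diameter = 6
        System.out.println("middle cut = " + middleCutEdges(k));  // prints middle cut = 4
    }
}
```

For k = 4 the measured diameter agrees with the formula p(k-1) with p = 2.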

This algorithm can be combined with Strassen's algorithm to further reduce runtime.
