Axially Symmetric Steady Flow

by Norm Beekwilder




Bottom Line

Application Overview

Data Decomposition

Parallel Algorithm

Result Measurements

Results

Bottom Line

Application Overview

Purpose of this code: to see how and where particles will collide with a surface at the end of the gas jet. This simulates part of the chip manufacturing process in an attempt to increase the yield of a production line. The sequential code algorithm:

  1. All the particles are moved; the only checks made at this stage are whether a particle has collided with a surface or has left the simulation area.
  2. Particles are indexed by cell
  3. Pairs of particles in each cell are selected to collide.
Data structure used: Two one-dimensional arrays track the position and velocity of the particles. A separate index array lets each cell find the particles it contains.
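The three steps above can be sketched roughly as follows. This is a minimal illustration, not the original code: it assumes a one-dimensional domain of hypothetical length `length`, simply drops particles that leave it, and uses a velocity swap (an elastic collision between equal masses) as a placeholder collision model. The function names and the `start`/`index` layout are assumptions made for the example.

```python
import random

def move(pos, vel, dt, length):
    """Step 1: advance every particle; keep only those still inside [0, length)."""
    new_pos, new_vel = [], []
    for x, v in zip(pos, vel):
        x += v * dt
        if 0.0 <= x < length:  # assumed boundary rule: leaving the area removes the particle
            new_pos.append(x)
            new_vel.append(v)
    return new_pos, new_vel

def index_by_cell(pos, n_cells, length):
    """Step 2: counting sort of particle ids by cell.

    Returns (start, index); the particles in cell c are index[start[c]:start[c + 1]].
    """
    cell_of = [min(int(x / length * n_cells), n_cells - 1) for x in pos]
    start = [0] * (n_cells + 1)
    for c in cell_of:
        start[c + 1] += 1
    for c in range(n_cells):
        start[c + 1] += start[c]
    fill = start[:-1].copy()  # next free slot for each cell
    index = [0] * len(pos)
    for i, c in enumerate(cell_of):
        index[fill[c]] = i
        fill[c] += 1
    return start, index

def collide(vel, start, index, rng):
    """Step 3: pick one random pair per cell and swap their velocities (placeholder model)."""
    for c in range(len(start) - 1):
        members = index[start[c]:start[c + 1]]
        if len(members) >= 2:
            i, j = rng.sample(members, 2)
            vel[i], vel[j] = vel[j], vel[i]

pos, vel = move([0.1, 0.15, 0.9, 0.55], [1.0, -1.0, 2.0, 0.0], dt=0.1, length=1.0)
start, index = index_by_cell(pos, n_cells=4, length=1.0)
collide(vel, start, index, random.Random(0))
```

Keeping the particle data in flat arrays and sorting only an index array means the position and velocity arrays never have to be reordered when particles change cells.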

Data Decomposition

Decomposition Approach: The data structures were decomposed into a two-dimensional grid of patches with several cells per patch. Since most operations are local to a cell, the regular grid of cells can be divided into several sub-grids, and the particles are distributed along with the cells that contain them.

In the example above, each 4x4 block is a patch that runs on its own CPU. All the squares in a patch are cells of the original simulation. Shaded cells are interior cells, whose collisions are calculated while particles are being transferred between patches. The arrows between patches represent communications that must occur each time step.
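The bookkeeping behind such a decomposition can be sketched as below. The helper names are hypothetical, and two assumptions go beyond the text: an interior cell is taken to be one that does not touch the edge of its patch, and a patch is taken to communicate only with the patches that share an edge with it.

```python
def owner(i, j, patch_size, patches_per_row):
    """Rank of the patch (CPU) that owns global cell (i, j)."""
    return (j // patch_size) * patches_per_row + (i // patch_size)

def is_interior(i, j, patch_size):
    """True if the cell does not touch the edge of its patch (assumed definition)."""
    li, lj = i % patch_size, j % patch_size
    return 0 < li < patch_size - 1 and 0 < lj < patch_size - 1

def neighbours(rank, patches_per_row, patches_per_col):
    """Ranks of the patches sharing an edge with `rank` (no wrap-around)."""
    px, py = rank % patches_per_row, rank // patches_per_row
    out = []
    for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        qx, qy = px + dx, py + dy
        if 0 <= qx < patches_per_row and 0 <= qy < patches_per_col:
            out.append(qy * patches_per_row + qx)
    return out

# An 8x8 grid of cells split into 2x2 patches of 4x4 cells each.
interior_in_patch_0 = sum(
    is_interior(i, j, 4) for i in range(4) for j in range(4)
)
```

With 4x4 patches only the inner 2x2 cells count as interior under this definition, so larger patches leave more work available to overlap with communication.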

Parallel Algorithm

Another Modification: Random number generator
The original linear congruential generator was replaced with a lagged Fibonacci random number generator, which was slightly faster and was able to produce an independent random sequence of numbers for each patch. The problem with the linear congruential generator is that seeding it differently may merely cause each process to start at a different point in the same sequence of numbers. With the lagged Fibonacci generator, on the other hand, each process starts with its own sequence of random numbers.
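For illustration, an additive lagged Fibonacci generator can be sketched as follows. The lags (24, 55), the 32-bit modulus, the class name, and the use of Python's `random.Random` to fill the initial lag table from a per-patch seed are all assumptions for this example; the page does not state which parameters or seeding scheme the actual code used, and distinct lag tables alone do not prove statistical independence between streams.

```python
import random

class LaggedFibonacci:
    """Additive lagged Fibonacci generator: x[n] = (x[n-24] + x[n-55]) mod 2**32."""

    SHORT, LONG, MOD = 24, 55, 2**32

    def __init__(self, seed):
        filler = random.Random(seed)  # assumed way of filling the lag table
        self.state = [filler.getrandbits(32) for _ in range(self.LONG)]
        self.state[0] |= 1  # at least one odd entry is needed for the full period
        self.pos = 0        # index of x[n-55], the oldest stored value

    def next_u32(self):
        k = self.pos
        j = (k + self.LONG - self.SHORT) % self.LONG  # index of x[n-24]
        value = (self.state[k] + self.state[j]) % self.MOD
        self.state[k] = value  # overwrite the oldest entry with x[n]
        self.pos = (k + 1) % self.LONG
        return value

    def uniform(self):
        """Float in [0, 1)."""
        return self.next_u32() / self.MOD

# One generator per patch, each seeded from its (hypothetical) patch rank.
generators = [LaggedFibonacci(seed=rank) for rank in range(4)]
samples = [g.uniform() for g in generators]
```

Because the whole lag table is part of the seed, each patch can be given a different table instead of a different offset into one shared sequence.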

Result Measurements

Results

The table column headers denote how the problem was decomposed
and the total number of processors is shown in parentheses.
The highlighted cell in each column is the best time and was used
to calculate the speed up graph.

