Grid applications - Computational problems
One way of categorizing a computational problem in computer science is by
its degree of "parallelism". Can your problem be split
into many smaller sub-problems that can be worked on by different processors
in parallel? If so, you can speed up your computation a lot by using many
computers.
Another category is the granularity of the problem. Is each sub-problem highly dependent on the results of other sub-problems? If so, you are dealing with fine-grained parallel calculations. This is the case, for example, in a calculation of the weather, which can be split into many smaller calculations of the weather in small volumes of the atmosphere. Each of these calculations is strongly affected by what is happening in neighbouring volumes. In fact, even changes in the weather very far away can have an impact - this is the origin of the famous saying that a butterfly beating its wings in China can cause a hurricane in America. In practice, fine-grained parallel calculations require very clever programming to make the most of their parallelism, so that the right information is available to the right processor at the right time.
At the opposite end of the granularity scale are coarse-grained or embarrassingly parallel calculations, where each sub-problem is independent of all the others. This is the case, for example, for so-called Monte Carlo simulations, where you vary the parameters in a complex model of a real system and study the results using statistical techniques - a sort of computer experiment. Each calculation can be done independently of the others in this case. Another embarrassingly parallel example is the analysis of a large databank of medical images, where each image is independent of the others.
As a rule of thumb, fine-grained calculations are better suited to big, monolithic supercomputers, or at least to very tightly coupled clusters of computers, which have lots of identical processors linked by an extremely fast, reliable network, to ensure there are no bottlenecks in communications. This type of computing is often referred to as high-performance computing.
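
The difference in granularity can be made concrete with a small sketch in Python. It is only an illustration under stated assumptions: `diffuse_step` is a hypothetical toy stand-in for a weather-like update, where every cell needs the values of its neighbours, and `estimate_pi` is a hypothetical Monte Carlo task that needs nothing from any other task. Neither function comes from a real Grid application.

```python
import random


def diffuse_step(cells, alpha=0.25):
    """Fine-grained update (toy assumption): each interior cell
    depends on the current values of both of its neighbours."""
    new = cells[:]
    for i in range(1, len(cells) - 1):
        new[i] = cells[i] + alpha * (cells[i - 1] - 2 * cells[i] + cells[i + 1])
    return new


def estimate_pi(seed, samples=10_000):
    """Coarse-grained task (toy assumption): a Monte Carlo estimate
    of pi that uses only its own seed and no data from other tasks."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * hits / samples


# Fine-grained: splitting `cells` across processors would force them
# to exchange boundary values after every single step.
cells = [0.0, 0.0, 1.0, 0.0, 0.0]
after_one = diffuse_step(cells)

# Coarse-grained: the eight estimates could run on eight unrelated
# machines, in any order, and simply be averaged at the end.
estimates = [estimate_pi(seed) for seed in range(8)]
mean_estimate = sum(estimates) / len(estimates)
```

In the first case the processors would have to talk to each other constantly, which is why the speed of the network matters so much. In the second case the only communication is handing out the seeds and collecting the results.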
On the other hand, embarrassingly parallel calculations are ideal for a more loosely coupled network of computers, since delays in getting results from one processor will not affect the work of the others. This type of computing is often referred to as high-throughput computing.
At first sight, this distinction seems to suggest that the Grid is only good for embarrassingly parallel calculations. On the contrary, many of the interesting problems in science require a combination of fine- and coarse-grained approaches. For example, when doing complex climate modelling of the Earth, researchers want to see how the calculations depend on different parameters in their models, so they want to launch many similar calculations. Each one is a fine-grained parallel calculation, because predicting climate is like predicting the weather on a longer time scale, so each calculation needs to be run on a single cluster connected to the Grid. The many independent calculations can then be distributed over many different clusters on the Grid, thus adding coarse-grained parallelism and saving a lot of time.
Even if the Grid is only used to ensure that a scientist gets access to a single available cluster, wherever this might be in the world, this can still save them a lot of time compared to waiting in a queue for access to a local cluster.
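
The combined pattern described above, many independent runs that are each tightly coupled inside, can be sketched as follows. This is a minimal illustration, not how any real Grid middleware works: `run_model` is a hypothetical toy model, the `alphas` values are made up, and a local thread pool merely stands in for a set of separate clusters.

```python
from concurrent.futures import ThreadPoolExecutor


def run_model(alpha, steps=50, size=21):
    """Hypothetical stand-in for one fine-grained run: a spike
    spreading along a line, where every step couples neighbours."""
    cells = [0.0] * size
    cells[size // 2] = 1.0
    for _ in range(steps):
        prev = cells
        cells = prev[:]
        for i in range(1, size - 1):
            cells[i] = prev[i] + alpha * (prev[i - 1] - 2 * prev[i] + prev[i + 1])
    return max(cells)


# Made-up parameter values for the sweep; each one defines a run
# that is completely independent of the others.
alphas = [0.05, 0.10, 0.20, 0.40]

# The thread pool is an assumption standing in for the Grid: each
# worker plays the role of one cluster that receives a whole run.
with ThreadPoolExecutor(max_workers=4) as pool:
    peaks = list(pool.map(run_model, alphas))

results = dict(zip(alphas, peaks))
```

The inner loops of `run_model` are the part that would need a fast network if split across processors, while the outer sweep over `alphas` needs almost no communication at all, which is what makes it a natural fit for loosely coupled resources.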