7-06-2008 - Cloud Computing and HPCC



There has been a huge amount of buzz about Cloud Computing and how it is changing everything in the computing world. We’re seeing signs that yes, the cloud is indeed changing computing. But as I mentioned in one of my previous blogs (insert link), when a new technology comes along, people sometimes think it is the cure-all for everything. As I have said in other blogs, the new technology becomes something of a hammer, and people run around looking for nails, assuming that almost everything looks like a nail. This is true of Cloud Computing. So let’s dig into what Cloud Computing is and how it affects HPCC.


What is Cloud Computing?

Let’s start by talking about what Cloud Computing is, what it looks like, and why people are excited about it. I’ll try to be brief in defining Cloud Computing, because there is so much that can be written about it. Actually, we can use Cloud Computing to define Cloud Computing. In this case, using Google, we find many definitions for Cloud Computing. My favorite definition is,



“Cloud computing is a computing paradigm in which tasks are assigned to a combination of connections, software and services accessed over a network. This network of servers and connections is collectively known as "the cloud." Computing at the scale of the cloud allows users to access supercomputer-level power.”



I think this definition captures what Cloud Computing is all about and what it means, but it is a bit more technical than I like. Moreover, you can see that definitions like this are already throwing around phrases such as “supercomputer-level power” without really understanding or qualifying what they mean. Perhaps a higher-level definition of Cloud Computing is something like,


“Cloud Computing allows applications to run somewhere on the ‘cloud,’ which can be your PC, the company or home network, or the internet itself, but fundamentally, the user doesn’t need to know where the application runs and doesn’t care where the application is run.”


Examples of Cloud Computing are all around us, even if you don’t really see them. Google is probably the prime example. Using Google as your search engine creates a search that is run over a number of machines that Google operates. You don’t know if the systems are nearby or thousands of miles away. You can also use Gmail or any of the other popular tools that Google develops and provides (such as Google Docs), and they function in a similar fashion. For applications such as Google Docs, your data is stored somewhere on the internet (you don’t care where and you actually don’t know where), and you can access it with any internet-connected system (pretty convenient).

Other examples of Cloud Computing include Amazon, Facebook, YouTube, MySpace, and other social sites. You can also include Massive Multi-Player On-Line Games (MMOG) in the Cloud Computing category. In an MMOG you play with or against teams or individuals from around the world on systems that are located somewhere (you don’t particularly care where). There are many examples of these, including EverQuest, World of Warcraft, Ultima Online, and Second Life.

So Cloud Computing is all around us, and people are using it to great advantage all of the time. You can see why many people are saying that Cloud Computing is the future of HPCC, and why not? It’s everywhere, it’s easy to use, it’s cheap (for the most part), and people say that it provides supercomputer-level performance. Let’s start examining that statement by looking at what are called “application profiles”: the set of resources an application requires to run successfully.


Cloud Computing Application Profiles

One of the keys to Cloud Computing is the application profile, that is, what resources the application needs and how it functions. In a general sense, the current generation of Cloud Computing applications has the following profile:
  • The applications are not parallel in any sense and are not threaded
  • Almost all of the applications don’t require much memory bandwidth or very high levels of CPU usage
  • The applications do very little IO relative to their computation. There are exceptions to this, such as databases, although databases aren’t really run on “the cloud” yet
  • The applications can self-heal. That is, if the application fails for whatever reason, it can easily be restarted or recovered with little interruption to the user


For most everyday activities such as creating presentations, writing documents and spreadsheets, playing games, etc., the applications are serial and, in most cases, many of the functions are pushed to the desktop (most likely using AJAX). In addition, companies such as Google have created storage systems for Cloud Computing, but the file systems are oriented toward applications that have the previously mentioned profile (see, for example, Hadoop). However, you have to use MapReduce to access and use the file system, which limits its applicability to traditional HPC applications.

The previous list of application characteristics is also true for many Enterprise applications. That is, they are serial applications with fairly small amounts of IO. Of course there are Enterprise applications that are IO intensive, such as a database, but they typically do not dominate the overall application mix.


HPCC Computing Application Profiles

HPCC applications have a wide range of application profiles. The typical application involves parallel computing and may or may not have quite a bit of IO. At the same time, there are HPCC applications that are serial (single node) that may or may not do a lot of IO. So the range of profiles for HPCC applications is quite large. Nevertheless the following is a brief list of some of the elements of HPCC applications.
  • Many of the applications are parallel where there is some data communication between the various processes.
    • Sometimes the inter-process data transfer is fairly small
      • Examples are some CFD codes
    • Sometimes the inter-process data transfer is very large
      • An example is WRF
  • Some applications are either serial or threaded and run on a single node
  • Some applications, either serial or parallel, do a large amount of IO
  • Some applications, either serial or parallel, don’t do a large amount of IO
  • Some applications have the ability to create a checkpoint, which is a snapshot of the progress of the computation. If the system fails for whatever reason, the application can be restarted from the last checkpoint so that it doesn’t have to be restarted from the beginning.
    • Not all applications can do this
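
To make the checkpoint idea concrete, here is a minimal sketch in Python. It is not taken from any real HPCC code: the function name, the JSON checkpoint file, and the `fail_at` knob that simulates a crash are all assumptions made for illustration.

```python
import json
import os


def run_with_checkpoints(total_steps, checkpoint_path, fail_at=None):
    """Sum the integers 0..total_steps-1, saving progress after every step.

    fail_at is a hypothetical knob that simulates a crash at a given step.
    """
    # Resume from the last checkpoint if one exists, otherwise start fresh.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"next_step": 0, "total": 0}

    for step in range(state["next_step"], total_steps):
        if fail_at is not None and step == fail_at:
            raise RuntimeError(f"simulated failure at step {step}")
        state["total"] += step
        state["next_step"] = step + 1
        # Write to a temporary file first, then rename it into place, so a
        # crash during the write never leaves a half-written checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return state["total"]
```

If the first call dies partway through, a second call with the same `checkpoint_path` picks up at the saved step instead of at step zero. A real code would checkpoint far less often than every step, since writing the snapshot is itself IO.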


It’s pretty easy to look at the application profile and see that HPCC applications vary widely. It’s also pretty obvious that HPCC applications have a different application profile than current Cloud Computing applications. It is this difference that explains why some HPCC applications could fit into the Cloud Computing model and some can’t. In the next section, I’ll talk about which ones work well and which ones won’t work and aren’t likely to ever work well.


HPCC Applications in the Cloud

We’re at the point where the Cloud applications and HPCC applications appear very different, but if you look carefully, there is some overlap. There are some HPCC applications that don’t require communication between nodes and don’t require much IO (one example is BLAST). The application profile of these particular applications matches Cloud Computing capability very well. What are the applications where this is true? It’s difficult to just list the applications because it’s possible to run parallel applications on a single node. As long as the data set fits onto the node without swapping, you can run the application in the Cloud. Here’s a quick list of the requirements for the HPCC application or, more appropriately, the combination of the application and the data set, to run well in the Cloud:
  1. The application must run on a single node
    1. The dataset must fit on a single node
  2. Not IO intensive
  3. Application runs quickly or creates a checkpoint (self-check-pointing)

So the application profile of HPCC applications that could work well in Cloud Computing is fairly limited but definitely not zero.
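
The requirements above can be written down as a simple check. This is only a sketch of my list, not a real tool: the `JobProfile` fields, the `fits_cloud` name, and the one-hour cutoff for "runs quickly" are all assumptions I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class JobProfile:
    """Hypothetical description of one application plus its data set."""
    nodes_required: int      # nodes the job needs in order to run
    dataset_gb: float        # size of the data set
    io_intensive: bool       # does the job do heavy IO?
    runtime_hours: float     # expected wall-clock time
    can_checkpoint: bool     # can it restart from a checkpoint?


def fits_cloud(job, node_memory_gb, short_run_hours=1.0):
    """Return (fits, reasons) for the requirement list above.

    short_run_hours is an assumed cutoff for "runs quickly".
    """
    reasons = []
    if job.nodes_required > 1:
        reasons.append("needs more than one node")
    if job.dataset_gb > node_memory_gb:
        reasons.append("data set does not fit on a single node")
    if job.io_intensive:
        reasons.append("IO intensive")
    if job.runtime_hours > short_run_hours and not job.can_checkpoint:
        reasons.append("long running with no checkpoint")
    return (not reasons, reasons)
```

For example, `fits_cloud(JobProfile(1, 4.0, False, 0.5, False), node_memory_gb=16)` passes every test, while a 64-node, IO-heavy job with no checkpointing fails all four.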

For example, any application that can be classed as a Monte Carlo simulation is a good candidate to be run in the Cloud. Monte Carlo simulations are simulations that are potentially run millions of times with slightly different data or slightly different input parameters. The term can also refer to applications that have a probabilistic nature or some uncertainty in the computations.
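
Here is a small sketch of why Monte Carlo fits so well. It uses the textbook estimate of pi from random points as a stand-in for a real simulation; the function names and sample counts are my own choices. The point is that every run is independent, so each one could go to a different node with no communication between them.

```python
import random


def estimate_pi(samples, seed):
    """One independent Monte Carlo run: estimate pi from random points."""
    rng = random.Random(seed)  # private generator, so runs never interact
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point landed inside the quarter circle
            inside += 1
    return 4.0 * inside / samples


def combine(estimates):
    """Merge the per-run results; this is the only step that sees them all."""
    return sum(estimates) / len(estimates)


# Each call could run on a different node in the Cloud; here they simply
# run one after another with different seeds.
estimates = [estimate_pi(20_000, seed) for seed in range(8)]
pi_estimate = combine(estimates)
```

Only the final `combine` step needs all of the results, and it moves a handful of numbers, which is exactly the low-communication, low-IO profile discussed earlier.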

Another set of applications that could run well in the Cloud is large searches. Google has already mastered Cloud Computing for search, but there are applications of search in other fields. For example, BLAST is used to search and compare biological sequence information. So to search a huge database of sequence information, a large number of searches is created and run. Each search can be made to run on a single node (although there is a parallel version of BLAST).
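
The pattern of breaking one huge search into many single-node searches can be sketched as follows. To be clear, this is not BLAST and does not call it: the exact substring match is a toy stand-in, and the function names and tiny sequence database are invented for illustration.

```python
def search_one(query, database):
    """Stand-in for a single-node search: exact substring match, not BLAST."""
    return [name for name, seq in database.items() if query in seq]


def split_into_tasks(queries, chunk_size):
    """Break the full query list into independent chunks, one per node."""
    return [queries[i:i + chunk_size] for i in range(0, len(queries), chunk_size)]


def run_task(chunk, database):
    """Work done by one node: it needs only its chunk and the database."""
    return {query: search_one(query, database) for query in chunk}


# Made-up sequences and queries, purely for illustration.
database = {"seqA": "ACGTACGTGA", "seqB": "TTGACCA", "seqC": "GGGACGTT"}
queries = ["ACGT", "TGA", "CCC"]

results = {}
for chunk in split_into_tasks(queries, chunk_size=2):
    # In the Cloud each chunk would go to a different node; no chunk ever
    # needs to talk to another one.
    results.update(run_task(chunk, database))
```

Note that every node needs its own access to the database, so the single-node data set requirement from the list above still applies.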

One last important aspect of running in the Cloud that must be mentioned is data security. HPCC applications are typically used by companies and labs to create new products or to perform new research. This gives the company or lab an advantage over others. Consequently, protecting this data is very important. If you are running your application with very proprietary data in the Cloud, you have to be concerned about the security of that data. So far, discussions of the Cloud don’t really address or make allowances for data security.


Doesn’t The Cloud Sound Like The Grid?

If you were in HPC a few years ago, the buzz was around something people called Grid Computing. The idea was very similar to Cloud Computing in that you can pool a set of disparate resources, possibly separated by a long distance, and aggregate them into a single system. The goal was to run your HPC codes on a Grid by submitting the job to a centralized job scheduler, which would then determine where to run the application. The application could run locally or it could run on systems thousands of miles away. This sure sounds like Cloud Computing to me. But I believe there is a fundamental difference between Cloud Computing and Grid Computing.

In Grid Computing, you would be able to run any of your HPC codes, even ones that have lots of interprocess communication or IO. Cloud Computing is not really designed to do this. In my opinion, the promise of Grid Computing was that you could run your MPI codes on the Grid, allowing you to combine physically distant systems into a single system. The reality is that applications are not run in this way because of the possibly limited bandwidth and very high latency between systems. The way Grids typically run today is something more like meta-scheduling. You submit your job to a central scheduling point and it decides which system to run it on. So the application is not run across multiple systems but rather on a single system within the Grid.
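
A toy version of the meta-scheduling decision might look like the sketch below. It is not how any particular Grid scheduler works: the `pick_system` name, the site names, and the "most free nodes wins" policy are assumptions for illustration only.

```python
def pick_system(job_nodes, systems):
    """Return the name of one system that can hold the whole job, or None.

    systems maps a (hypothetical) system name to its count of free nodes.
    """
    candidates = {name: free for name, free in systems.items() if free >= job_nodes}
    if not candidates:
        return None  # the job is never split across systems
    # Assumed policy: prefer the system with the most free nodes.
    return max(candidates, key=candidates.get)
```

With `{"siteA": 16, "siteB": 64, "siteC": 48}`, a 32-node job lands on `siteB`. A 100-node job gets `None` even though the three sites together have 128 free nodes, because the job stays within a single system.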

I think there is one important difference between Grid Computing and Cloud Computing. Cloud Computing wants applications with no interprocess communication across nodes and little IO. Grid Computing fundamentally allows applications with a great deal of interprocess communication across nodes and lots of IO as long as it can stay on a single system.


Summary

Let’s recap the discussion so far.
  • Cloud Computing uses applications that are serial (run on a single core or a single node), have data sets that fit within a single node, do very little IO, and don’t require data security.
  • HPCC applications have a wide variety of requirements, from running on a single node (e.g. BLAST) to running on many nodes in parallel. They can also have lots of IO, and data security is an important consideration.
  • Grid computing runs HPCC applications, but the user may not know where they run (much like the Cloud). Because the Grid runs HPCC applications, it inherits all of the requirements of HPCC applications.
Based on these observations of the three types of computing models it’s pretty easy to tell the difference between them and which one is appropriate for your computing needs.


Observations

So what is the Cloud and how will it affect HPCC? Great question, and I’m glad you asked. The Cloud, in many ways, looks like what people call Grid Computing. But it has some fundamental differences in application profiles that really restrict which HPCC applications could take advantage of Cloud Computing. In my opinion, Cloud Computing can influence only certain areas of HPCC: those with applications that do little or no interprocess communication and very little IO. In many ways I think Cloud Computing is the current “hammer” that is looking at every opportunity as though it were a “nail”. So don’t change your HPCC strategy because Cloud Computing is the “hammer du jour”.

However, let me also close by saying that the definition of HPCC is changing. Traditionally, HPCC has been focused on applications that perform scientific or engineering computations on something larger than a desktop, or applications with extreme amounts of computing, but still with a mathematical or scientific focus. But lately, I’m starting to see people who consider applications such as Business Intelligence (BI) to be HPCC applications. Even things such as databases are sometimes referred to in an HPCC context. Even more extreme are people who consider games or entertainment applications such as Second Life or Massive Multi-Player On-Line Games such as Ultima Online to be HPCC. Some even count Google as an HPCC application.

With the change in definition of HPCC to include these new fields, perhaps Cloud Computing does have a role to play in HPCC. But who knows? Maybe the world will decide to call these applications something else and leave poor HPCC alone.


Jeff


Latest page update: made by laytonjb, Jul 7 2008, 8:15 PM EDT