On Fri, Oct 06, 2006 at 03:35:30PM -0600, mgd at santafe.edu wrote:
> Quoting Douglas Roberts <doug at parrot-farm.net>:
>
> > If you go to any of the supercomputing centers such as NCSA, SDSC, or PSC,
> > you do not see parallel java apps running on any of their machines (with the
> > occasional exception of a parallel newbie trying, with great difficulty to
> > make something work). The reasons:
> >
> > 1. there are few supported message passing toolkits that support
> > parallel java apps,
> > 2. java runs 3-4 times slower than C, C++, Fortran, and machine time
> > is expensive, and finally
> > 3. there are well-designed and maintained languages, toolkits and APIs
> > for implementing HPC applications, and developers use them instead of
> > java.
>
> I expect in the next few years some supercomputing niches will start to use
> hypervisors like Xen. By paying the 20% cost or so this will allow queueing
> systems like LSF to bring jobs on and off line with uniform checkpointing. It
> will remove the need for ad-hoc checkpointing code in applications by allowing
> any executable to be stored (much as a laptop does when it goes to sleep) and/or
> migrated from one system to another.
>
> I would be very happy if I could submit a job and have it run indefinitely to
> completion...instead of having it kicked out every 6-12 hours for manual restart
> or a procedure where I have to write scripts to figure out where things stand
> and adaptively resubmit. 20% is nothing compared to that inefficiency!
Interesting comment, but checkpointing of single-image tasks was never a
real showstopper. It was a practical option on many systems, e.g. Irix,
and could have been solved for Linux at any time.
Also EcoLab (for agent-based modelling) provides trivial checkpointing
functionality for serial codes - but it does get more interesting when
using it in parallel.
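For concreteness, here is a minimal sketch of the kind of ad-hoc,
application-level checkpointing a serial code ends up carrying. This is
not EcoLab's API: SimState, save_checkpoint, load_checkpoint, run and
the file layout are hypothetical names made up for illustration, and
the rename step assumes POSIX semantics.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical simulation state; a real model would hold far more.
struct SimState {
    long step = 0;
    std::vector<double> values;
};

// Write to a temporary file, then rename it over the old checkpoint, so a
// job killed mid-write does not leave a truncated checkpoint behind.
// (Assumes POSIX rename, which replaces an existing destination.)
bool save_checkpoint(const SimState& s, const std::string& path) {
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (!out) return false;
        const std::size_t n = s.values.size();
        out.write(reinterpret_cast<const char*>(&s.step), sizeof s.step);
        out.write(reinterpret_cast<const char*>(&n), sizeof n);
        out.write(reinterpret_cast<const char*>(s.values.data()),
                  n * sizeof(double));
        if (!out) return false;
    }  // stream is closed here, before the rename
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}

// Restore state from path; leaves s untouched if the file is missing
// or too short.
bool load_checkpoint(SimState& s, const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    SimState t;
    std::size_t n = 0;
    in.read(reinterpret_cast<char*>(&t.step), sizeof t.step);
    in.read(reinterpret_cast<char*>(&n), sizeof n);
    if (!in) return false;
    t.values.resize(n);
    in.read(reinterpret_cast<char*>(t.values.data()), n * sizeof(double));
    if (!in) return false;
    s = t;
    return true;
}

// Run up to total_steps, resuming from path if a checkpoint exists and
// saving every `every` steps. Returns the step count reached.
long run(long total_steps, long every, const std::string& path) {
    assert(every > 0);
    SimState s;
    if (!load_checkpoint(s, path)) s.values.assign(4, 0.0);  // fresh start
    while (s.step < total_steps) {
        for (double& v : s.values) v += 1.0;  // stand-in for real work
        ++s.step;
        if (s.step % every == 0) save_checkpoint(s, path);
    }
    return s.step;
}
```

Calling run() again with the same path after the job has been kicked
out picks up from the last saved step rather than from zero. Note that
this only captures state held inside the process.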
However, Xen will not solve the showstoppers that occur for codes that
use sockets, i.e. all distributed-memory message-passing jobs, and jobs
using floating-licensed commercial software.
I would prefer that the use of Xen be simply a user-specifiable option
for doing checkpointing.
>
> While there are well-designed and maintained languages for HPC (MPI and OpenMP),
> they are only the most basic of infrastructure. MPI is a pain to use and OpenMP
> requires a big SMP system. Maybe when there are Hypertransport cables and
> implementations of languages like Fortress or Chapel life will be better. (e.g.
> Infiniband hasn't resulted in useful and used distributed shared memory systems.)
>
ClassdescMP takes away most of the pain of MPI. There are other options too,
for example the recently added Boost.MPI package. I'm planning on
taking a look at Boost.MPI to see how it compares with ClassdescMP...
(Note I'm being a one-eyed C++ person here, though...)
--
*PS: A number of people ask me about the attachment to my email, which
is of type "application/pgp-signature". Don't worry, it is not a
virus. It is an electronic signature, that may be used to verify this
email came from me if you have PGP or GPG installed. Otherwise, you
may safely ignore this attachment.
----------------------------------------------------------------------------
A/Prof Russell Standish Phone 0425 253119 (mobile)
Mathematics
UNSW SYDNEY 2052 R.Standish at unsw.edu.au
Australia http://parallel.hpc.unsw.edu.au/rks
International prefix +612, Interstate prefix 02
----------------------------------------------------------------------------