[FRIAM] FW: Distribution / Parallelization of ABM's
Douglas Roberts
doug at parrot-farm.net
Sat Oct 7 12:41:30 EDT 2006
I forgot to mention: you have to tinker the snot out of a NUMA application
to get optimal performance. NUMA means that you have to pay close attention
to what parts of your calculation are using which memory, location-wise.
Non-uniform means different latency/bandwidth for different memory locations
relative to any cpu in the system. IMO it actually takes longer to develop
an effective NUMA app than it does to field a distributed memory app.
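
To put a rough number on why placement matters, here is a toy cost model, a sketch only. The two latency figures are made-up assumptions for illustration, not measurements from an Altix or any other machine, and the function names are hypothetical.

```python
# Toy NUMA cost model. The latency figures below are illustrative
# assumptions, not measurements from any real machine.
LOCAL_NS = 100.0   # assumed latency to memory on the CPU's own node
REMOTE_NS = 300.0  # assumed latency to memory on some other node


def mean_latency_ns(local_fraction, local_ns=LOCAL_NS, remote_ns=REMOTE_NS):
    """Average latency per access when `local_fraction` of accesses are local."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


def slowdown_vs_all_local(local_fraction, local_ns=LOCAL_NS, remote_ns=REMOTE_NS):
    """How many times slower than a run whose accesses are all local."""
    return mean_latency_ns(local_fraction, local_ns, remote_ns) / local_ns


if __name__ == "__main__":
    # Compare a perfectly placed run with progressively worse placements.
    for frac in (1.0, 0.75, 0.5, 0.0):
        print(f"{frac:.2f} local: {mean_latency_ns(frac):.0f} ns per access, "
              f"{slowdown_vs_all_local(frac):.2f}x")
```

Under those assumed numbers, a calculation that finds only half of its data in local memory pays twice the memory latency of a well-placed one, which is the kind of gap all the tinkering is meant to close.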
--Doug
On 10/7/06, Douglas Roberts <doug at parrot-farm.net> wrote:
>
> On 10/6/06, mgd at santafe.edu <mgd at santafe.edu> wrote:
> >
> > Quoting Douglas Roberts <doug at parrot-farm.net>:
> >
> > > I disagree about the InfiniBand bit. Myrinet and, now, the newer
> > > InfiniBand technology are commonly used on distributed memory
> > > systems.
> >
> > Systems or applications? What systems? I know Intel sells a version of
> > Treadmarks that has a DSM server, but what hardware vendor uses
> > InfiniBand to make a unified NUMA memory, e.g. like an Altix?
>
>
> Lots of systems use InfiniBand interconnect technology, but for distributed
> memory machines, not NUMA.
>
> http://www.osc.edu/press/releases/2004/voltaire.shtml
> http://www.beowulf.org/archive/2001-October/005268.html
> http://www.linuxdevices.com/news/NS7459807643.html
> http://www.hpcwire.com/hpc/506904.html
>
> etc. There's nothing magic about InfiniBand; it's just a faster,
> lower-latency Myrinet. See below for a note regarding NUMA machines.
>
>
> > The 2+ GB/sec bandwidth of these interconnect fabrics is important for
> > message passing applications.
> >
> > For a NUMA system, where application parallelism isn't limited to
> > message passing, latency is more important than bandwidth. Message
> > passing as a way to write programs is what I find constraining!
> >
>
> Distributed applications will probably always scale better than shared
> memory applications because there are not very many shared memory or NUMA
> machines out there, and the distributed memory machines are much bigger than
> any of the shared memory or NUMA machines currently in production. The
> Altix 3000 is one of the few NUMA machines currently still running at a few
> places, and SGI no longer is in business, at least with respect to NUMA
> machines. NUMA machines, while fun to play on, are really better suited to
> the hobbyist, since you don't find them (with but a few exceptions) in the
> production world.
>
> Learning how to design effective message passing distributed applications
> is not easy, but it is worth it when you have an application that needs to
> scale.
>
> --Doug
> --
> Doug Roberts, RTI International
> droberts at rti.org
> doug at parrot-farm.net
> 505-455-7333 - Office
> 505-670-8195 - Cell
>