Fujitsu Laboratories Develops Technology to Reduce Network Switches in Cluster Supercomputers by 40%

Maintains network performance, lowers energy consumption

Tokyo, July 15, 2014 - (JCN Newswire) - Fujitsu Laboratories Ltd. today announced that it has developed a technology that reduces the number of network switches used in a cluster supercomputer(1) system consisting of several thousand units by 40% while maintaining the same level of network performance.

Existing cluster supercomputers typically use a "fat tree" network topology(2), in which, for example, 6,000 servers would require about 800 switches, or possibly more than 2,000 switches when the network must also provide redundancy and other features. Networks account for up to about 20% of the power consumed by a supercomputer system, which means there are high expectations for a new network technology that can maintain good network performance with fewer switches.

Fujitsu Laboratories has used a multi-layer full mesh topology in combination with a newly developed communications algorithm that controls transmission sequences to avoid data collisions. This means that, even in all-to-all communications, which are prone to bottlenecks during application execution, performance stays on par with existing technology while using roughly 40% fewer switches, saving energy without sacrificing performance.

Details of this technology are being presented at the Summer United Workshops on Parallel, Distributed and Cooperative Processing 2014 (SWoPP 2014), opening July 28 in Niigata City, Japan.

Background

Cluster supercomputers have been widely used in manufacturing, such as for the design of mobile phones, cars, and airplanes, as well as in scientific and technical computing. Increasingly, though, they are being used in new areas, such as in silico drug discovery and medicine and the analysis of earthquakes and weather phenomena, and these applications require even more powerful supercomputers.

To realize increased supercomputing performance, multiple servers are connected by networks. These servers are equipped with high-performance computation units, typically many-core processors containing multiple CPU cores, or accelerators such as GPGPUs(3).

Technological Issues

In order for the supercomputer's computing performance to be useful to a wide range of applications, the network joining the servers needs to have higher performance. In the fat-tree network topology, tiers are set based on the extent of the servers being connected, and the redundancy of paths in the tree-like network topology that connects the switches results in fast network performance. For example, a system with 6,000 servers would require 800 switches, each with 36 ports, to connect them.
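The figure of roughly 800 switches can be checked with a back-of-the-envelope estimate. The sketch below assumes a three-layer fat tree with full bisection bandwidth, in which each server accounts for five switch ports (a downlink and an uplink on each of the two lower layers, plus a downlink on the top layer). That port model and the function name are illustrative assumptions, not details taken from the announcement.

```python
import math


def fat_tree_switch_estimate(servers: int, ports_per_switch: int, layers: int = 3) -> int:
    """Estimate the switch count of a non-blocking fat tree.

    Assumption: every layer except the top spends one downlink and one
    uplink port per server; the top layer spends one downlink port only.
    """
    ports_per_server = 2 * (layers - 1) + 1
    return math.ceil(servers * ports_per_server / ports_per_switch)


# 6,000 servers and 36-port switches, as in the example above.
estimate = fat_tree_switch_estimate(6000, 36)

# Applying the announced reduction of roughly 40% to the quoted 800 switches.
reduced = 800 * (100 - 40) // 100

print(estimate, reduced)
```

Under these assumptions the estimate lands slightly above 800, which is consistent with the rounded figure quoted in the text.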

Thanks to the redundancy of routes in the fat-tree topology, when running a fast Fourier transform, for example, as part of an analysis on a cluster supercomputer, all-to-all communications among the servers show good network performance. Meanwhile, many-core processors in individual servers or accelerators such as GPGPUs produce dramatic jumps in performance. Network performance needs to be improved so that it stays balanced with computational performance, and this requires many more switches, but increasing the number of switches entails higher costs for materials, electric power, and installation space.

About the Technology

What Fujitsu Laboratories has done is to develop a technology that can accommodate a large number of servers with relatively few switches by considering what would be an optimized data-exchange process, then connecting the cluster in a new way. This reduces the number of switches needed to connect a given number of nodes by roughly 40% compared to a fat-tree network topology while maintaining equivalent performance levels under the maximum-load communication pattern of all-to-all communications.

Key features of the technology are as follows.

1. Multi-layer full-mesh network topology

Fujitsu Laboratories developed a structure where switches for indirect connections are arrayed around the periphery of a full-mesh framework that connects all switches directly, and multiple full-mesh structures are connected to each other. Compared to a three-layer fat-tree network topology, this eliminates an entire layer of switches, with switch ports being used more efficiently and a smaller number of switches in use.
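To illustrate how a full mesh spends its ports, the following sketch counts the inter-switch links and the ports left over for servers in a single flat full mesh of identical switches. It does not model the multi-layer structure or the peripheral indirect-connection switches described above. The function name and the example of seven switches are hypothetical.

```python
def full_mesh_budget(switches: int, ports_per_switch: int) -> tuple[int, int]:
    """Return (inter-switch links, ports left per switch for servers)."""
    if switches - 1 > ports_per_switch:
        raise ValueError("not enough ports to reach every other switch")
    # Each pair of switches shares exactly one direct link.
    links = switches * (switches - 1) // 2
    # Each switch spends one port on every other switch in the mesh.
    free_ports = ports_per_switch - (switches - 1)
    return links, free_ports


links, free_ports = full_mesh_budget(7, 36)
print(links, free_ports)
```

The trade-off is visible in the arithmetic: every switch added to the mesh costs one port on every existing switch, so a flat mesh alone cannot grow indefinitely.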

2. Data-exchange process avoids path contention

In all-to-all communications, where each server is exchanging data with every other server, reducing the number of switches also reduces the number of paths between servers, which is likely to result in collisions. Fujitsu Laboratories was able to achieve all-to-all communications performance on par with a fat-tree topology by taking advantage of the multi-layer full-mesh network topology in the process of transferring data between servers. Transfers are scheduled so that servers connected to the various apex switches (A through F) each send to a different apex, and so that collisions are also avoided within paths that traverse different layers (a1 through d3).
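The idea of ordering transfers so that senders never target the same destination at once can be illustrated with a classic shift schedule. This is a generic textbook technique, not the algorithm Fujitsu Laboratories developed, whose details are not given here. The sketch assumes six nodes standing in for apex switches A through F, and it checks contention only at the endpoints.

```python
def shift_schedule(n: int) -> list[list[tuple[int, int]]]:
    """Return n - 1 phases; in phase s, node i sends to node (i + s) % n."""
    return [[(src, (src + s) % n) for src in range(n)] for s in range(1, n)]


def is_contention_free(phase: list[tuple[int, int]]) -> bool:
    """True if no destination receives from more than one sender."""
    dests = [dst for _, dst in phase]
    return len(set(dests)) == len(dests)


phases = shift_schedule(6)  # six nodes, standing in for apexes A-F
all_ok = all(is_contention_free(phase) for phase in phases)
pairs = {pair for phase in phases for pair in phase}

print(len(phases), all_ok, len(pairs))
```

Each phase is a permutation of the nodes, so every ordered pair is served exactly once across the phases. In a full mesh with a direct link between every pair, each phase also uses a distinct link per transfer.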

Results

This technology makes it possible to maintain the performance of large-scale cluster supercomputers that are needed for such applications as drug discovery, medicine, and the analysis of earthquakes and weather phenomena, while lowering facility and power costs. This enables the provision of supercomputers that achieve high performance while conserving energy.

Future Plans

Fujitsu Laboratories plans to have a practical implementation of this technology during fiscal 2015. It also plans to continue research into topologies for large-scale computing systems that do not depend on increasing numbers of switches.

Notes:

(1) Cluster supercomputer

A supercomputer made up of numerous PC servers connected by a high-speed network.

(2) Fat tree topology

A network topology that follows a basic tree-like structure, with multiplexed higher layers. A key benefit of this topology is that it avoids network congestion.

(3) GPGPU

A "general-purpose graphics processing unit" is a specialized processor used not only for image processing but also for other tasks, because it can perform certain kinds of calculations very quickly. This has made GPGPUs increasingly popular in supercomputers.

About Fujitsu Limited

Fujitsu is the leading Japanese information and communication technology (ICT) company offering a full range of technology products, solutions and services. Approximately 170,000 Fujitsu people support customers in more than 100 countries. We use our experience and the power of ICT to shape the future of society with our customers. Fujitsu Limited (TSE: 6702) reported consolidated revenues of 4.4 trillion yen (US$47 billion) for the fiscal year ended March 31, 2013. For more information, please see www.fujitsu.com.



Source: Fujitsu Limited

Contact:
Fujitsu Limited
Public and Investor Relations
www.fujitsu.com/global/news/contacts/
+81-3-3215-5259


Copyright 2014 JCN Newswire. All rights reserved. www.japancorp.net
