All featured image credits: Stewart Oak, Imperial College London.
Imperial College London efficiently stores, manages and protects large volumes of world-leading research data throughout its life-cycle with a tailored, high-performance, future-proof software-defined storage and data-management platform from ArcaStream.
Imperial College London is home to 17,000 students and 8,000 staff, attracting undergraduates from more than 125 countries and awarding over 6,700 degrees every year. The University focuses on the four main disciplines of science, engineering, medicine and business and is one of the world’s leading university research centres – sharing ideas, expertise and technology to find answers to today’s big scientific questions and tackle global challenges.
As a centre for high-impact research, the University’s Research Computing Service (RCS) – part of the ICT department – plays a vital role in addressing the computing and storage needs of the research community. In 2018, the RCS team launched the Research Data Store (RDS) to provide new robust, reliable storage services to efficiently manage and protect large volumes of research data throughout its entire life-cycle. The innovative solution at the heart of these services was designed and delivered by ArcaStream.
- Eliminate 30 fragmented islands of storage with a single centrally-managed and supported system
- Guarantee consistent high-performance with an optimum user experience
- Ensure responsible data management compliant with regulations
- Charge users by consumption not capacity
- Deliver a continual service for decades
The requirement for guaranteed performance, scalability and integration with current and future compute systems substantially increased the complexity of the project, so a bespoke solution was sought. ArcaStream was selected to provide a high-performance, scalable research storage solution that would seamlessly integrate legacy infrastructure and support the University’s future storage strategies.
ArcaStream’s PixStor framework was deployed to deliver a protected, adaptable, scalable and collaborative central platform across the institution. A high-performance scalable storage platform based on IBM Spectrum Scale, it combines flash, disk, tape, and cloud storage into a single global namespace. With a software-defined architecture, it uses open standard commodity hardware to avoid vendor lock-in coupled with powerful data management tools – including tiering, cloud integration, monitoring, search and analytics – to drive workflow efficiencies and reduce costs.
PixStor guarantees consistent high performance with no degradation as the file system fills. The platform has been designed to be easily extended, upgraded and replaced for the foreseeable future, with no limits on how large it can grow or how long it can operate as a continuous service.
- 10 PB research storage repository on a single centrally-managed platform serving 2,500 existing HPC nodes.
- Simultaneously serving 3,000+ users seamlessly via the desktop.
- 20 GB/s throughput with no loss of interactive user performance.
- Disaster Recovery using Ngenea to tier replicated data to colder external storage without expanding the physical footprint as the system grows.
- 1 billion files replicated each night in less than 8 hours.
- Analytics to make informed decisions on expansion and the true cost of existing data.
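To put the figures in the list above in perspective, the short sketch below works out the sustained rates they imply. It is a back-of-the-envelope calculation only: it assumes decimal units (1 PB = 1,000,000 GB), treats the full 8-hour window as the worst case, and counts every one of the 1 billion files as a unit of replication work. The function names are illustrative and are not part of PixStor or Ngenea.

```python
# Figures quoted in the list above; decimal units are an assumption.
FILES_PER_NIGHT = 1_000_000_000
REPLICATION_WINDOW_HOURS = 8
CAPACITY_PB = 10
THROUGHPUT_GB_PER_S = 20


def min_replication_rate(files: int, window_hours: float) -> float:
    """Minimum sustained files per second needed to finish within the window."""
    return files / (window_hours * 3600)


def full_read_days(capacity_pb: float, throughput_gb_s: float) -> float:
    """Days needed to stream the whole repository once at the given throughput."""
    seconds = capacity_pb * 1_000_000 / throughput_gb_s
    return seconds / 86_400


print(f"{min_replication_rate(FILES_PER_NIGHT, REPLICATION_WINDOW_HOURS):,.0f} files/s")
print(f"{full_read_days(CAPACITY_PB, THROUGHPUT_GB_PER_S):.1f} days")
```

Under these assumptions, the nightly replication has to sustain roughly 35,000 files per second, and reading the full 10 PB once at 20 GB/s would take a little under six days.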
Imperial College’s Research Data Store (RDS) infrastructure is data-centre based, with geographically dispersed primary and secondary sites deploying PixStor with asynchronous replication and intelligent automated tiering to external storage targets.
ArcaStream’s PixStor combines best-of-breed technologies from Dell, Excelero and Mellanox into a single integrated platform.
Dell PowerEdge servers and Dell PowerVault storage deliver exceptional reliability and performance at a commodity price point. R740 servers run the filesystem and serve it to the HPC clients, as well as to the general research community via the ArcaStream NAS stack.
Excelero’s NVMesh® software provides a scalable NVMe tier for extreme metadata performance, running on Dell PowerEdge R740XD servers. NVMesh enables shared NVMe across any network and supercharges any file system – accelerating interactive performance.
Mellanox Spectrum and ConnectX® technologies provide a hybrid network infrastructure to deliver data via both InfiniBand and Ethernet. The 100GbE networking backbone allows the solution to scale seamlessly as the University’s requirements increase.
Researchers at Imperial College London are now using ArcaStream’s PixStor platform with confidence in its ability to support their research data storage needs. High performance, enterprise-level reliability and integrity, and robust continuous service through expansion, replacement, upgrades and refreshes mean that the University can plan for the long term, addressing continual multi-petabyte-per-year growth of its data holdings, safe in the knowledge that both the primary and DR solutions can keep up.
The PixStor platform has already been expanded with additional capacity, and further expansion into tape storage using Ngenea is underway to provide a massive capacity boost without compromising on the service provided. This ongoing investment in the service speaks volumes about Imperial College London’s confidence in the PixStor platform and ArcaStream’s ability to deliver the level of support and service they require to strategically meet their always-evolving research data requirements.
- Greater agility and insight to manage capacity and performance.
- Reduced complexity and silos.
- Improved security to meet stringent data regulations.
- Efficient control of expenditure and growth strategies.
- Significant ROI, with integration of legacy and new systems.
- Future-proof scalability and flexibility without hardware lock-in.
- Peace of mind and an improved user experience.