
BeeGFS

Parallel Cluster Storage File System


BeeGFS is a parallel file system developed and optimised for high-performance computing. Its distributed metadata architecture is designed for scalability and flexibility, with a focus on workloads that demand massive data throughput, such as AI training and inference. The configurable design gives you control over all aspects of the system, including a command-line interface to monitor and manage the cluster, and the file system was built from the ground up to be resilient, reliable and fast.
Availability: In stock

For details on how to purchase BeeGFS, please click the button below to send us your details and a brief outline of your requirements. We can then quote you accordingly.

Details

Parallel Cluster File System

BeeGFS is the leading parallel cluster file system. It has been developed with a strong focus on maximum performance and scalability, offers a high level of flexibility, and is designed for robustness and ease of use. If I/O-intensive workloads are your problem, BeeGFS is the solution.

BeeGFS is software-defined storage based on the POSIX file system interface, which means applications do not have to be rewritten or modified to take advantage of it. BeeGFS clients access the data inside the file system over the network, using any TCP/IP-based connection or RDMA-capable fabrics such as InfiniBand (IB), Omni-Path (OPA) and RDMA over Converged Ethernet (RoCE).
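Because BeeGFS presents a standard POSIX interface, ordinary file I/O works unchanged once the client has mounted the file system. The short Python sketch below assumes a hypothetical mount point of /mnt/beegfs and contains nothing BeeGFS-specific at all:

    # Ordinary POSIX I/O on a BeeGFS mount - no BeeGFS-specific API is needed.
    # /mnt/beegfs is a hypothetical mount point; substitute your own.
    from pathlib import Path

    data_dir = Path("/mnt/beegfs/projects/demo")
    data_dir.mkdir(parents=True, exist_ok=True)

    sample = data_dir / "sample.dat"
    sample.write_bytes(b"x" * (4 * 1024 * 1024))   # a 4 MiB write, transparently striped by BeeGFS

    print(sample.stat().st_size)      # file attributes come from the metadata service
    print(sample.read_bytes()[:16])   # file contents come from the storage servers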

Furthermore, BeeGFS is a parallel file system: user data is transparently spread across multiple servers, so increasing the number of servers and disks in the system aggregates the capacity and performance of all disks and all servers in a single namespace. That way, file system performance and capacity can easily be scaled to the level required for a specific use case, even later while the system is in production.
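To illustrate the principle (this is a simplified model, not BeeGFS's actual placement logic), the sketch below shows how a file striped round-robin across several storage targets maps byte offsets to chunks, assuming an illustrative chunk size of 1 MiB and four targets:

    # Simplified round-robin striping model (illustration only, not BeeGFS internals).
    CHUNK_SIZE = 1 * 1024 * 1024   # assumed chunk size: 1 MiB
    NUM_TARGETS = 4                # assumed number of storage targets

    def locate(offset: int) -> tuple[int, int, int]:
        """Map a byte offset to (chunk index, storage target, offset inside the chunk)."""
        chunk_index = offset // CHUNK_SIZE
        target = chunk_index % NUM_TARGETS     # round-robin placement across targets
        chunk_offset = offset % CHUNK_SIZE
        return chunk_index, target, chunk_offset

    for off in (0, 1_500_000, 5_000_000):
        print(off, locate(off))
    # Sequential I/O touches all targets in turn, so their bandwidth aggregates.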

BeeGFS separates metadata and file contents

The file chunks are provided by the storage service and contain the data that users want to store (i.e. the user file contents), whereas the metadata is the “data about data”, such as access permissions, file size and the information about how the user file chunks are distributed across the storage servers.

As soon as a client has obtained the metadata for a specific file or directory, it can talk directly to the storage service to store or retrieve the file chunks, so the metadata service is not involved in read or write operations.
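As a rough illustration, standard tools already expose most of this metadata, and the stripe layout can be queried with the beegfs-ctl tool. The sketch below assumes a client host with beegfs-ctl installed and a hypothetical file path; the --getentryinfo mode is taken from the BeeGFS documentation, but the exact output depends on the installed version:

    # Inspect the metadata of a file on a BeeGFS mount (sketch; the path is hypothetical).
    import os
    import subprocess

    path = "/mnt/beegfs/projects/demo/sample.dat"

    st = os.stat(path)   # permissions, size, ownership - served by the metadata service
    print(oct(st.st_mode), st.st_size, st.st_uid)

    # Stripe layout (chunk size, storage targets) via the BeeGFS command-line tool;
    # verify the option against the beegfs-ctl version on your system.
    subprocess.run(["beegfs-ctl", "--getentryinfo", path], check=True)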

BeeGFS addresses everyone who needs large and/or fast file storage. While BeeGFS was originally developed for High Performance Computing (HPC), it is used today in almost all areas of industry and research.

Maximum Scalability

BeeGFS offers maximum performance and scalability on various levels. It supports distributed file contents with flexible striping across storage servers on a per-file or per-directory basis as well as distributed metadata.
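As a sketch of per-directory striping: the --setpattern, --chunksize, --numtargets and --getentryinfo options below are taken from the BeeGFS documentation, while the chunk size, target count and path are assumptions to adapt to your own system:

    # Sketch: set a per-directory stripe pattern; the values are illustrative.
    # Verify the beegfs-ctl options against the version installed on your system.
    import subprocess

    scratch_dir = "/mnt/beegfs/scratch/large-files"   # hypothetical directory

    subprocess.run(
        ["beegfs-ctl", "--setpattern", "--chunksize=1m", "--numtargets=4", scratch_dir],
        check=True,
    )   # files created here afterwards are striped across 4 targets in 1 MiB chunks

    subprocess.run(["beegfs-ctl", "--getentryinfo", scratch_dir], check=True)   # confirm the pattern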

BeeGFS is optimized especially for use in environments where performance matters, and provides:

  • Best-in-class client throughput: 8 GB/s with a single streaming process on a 100 Gbit network, while a few streams can fully saturate the network.
  • Linear scalability through dynamic metadata namespace partitioning.
  • Flexible choice of the underlying file system on the BeeGFS servers, to perfectly fit the given storage hardware.
  • BeeGFS Storage Pools, which make different types of storage devices available within the same namespace. With SSDs and HDDs in different pools, pinning a user project to the flash pool gives that project all-flash performance, while other data still benefits from the cost-efficient high capacity of spinning disks (a sketch follows below).
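A hedged sketch of how such pinning might look on a BeeGFS 7.x system: the --liststoragepools and --setpattern/--storagepoolid modes are taken from the BeeGFS documentation, while the pool ID and the project path are assumptions to check against your own installation:

    # Sketch: pin a project directory to an (assumed) all-flash storage pool.
    # Pool ID 2 and the path are illustrative; verify the beegfs-ctl options
    # and the pool IDs on your own system before use.
    import subprocess

    project_dir = "/mnt/beegfs/projects/current"

    subprocess.run(["beegfs-ctl", "--liststoragepools"], check=True)   # show the defined pools

    subprocess.run(
        ["beegfs-ctl", "--setpattern", "--storagepoolid=2", project_dir],
        check=True,
    )   # new files under project_dir are now created on the flash pool's targets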

Maximum Flexibility

BeeGFS supports a wide range of Linux distributions such as RHEL/Fedora, SLES/openSUSE and Debian/Ubuntu, as well as a wide range of Linux kernels, from the ancient 2.6.18 up to the latest vanilla kernel.

The storage services run on top of an existing local file system (such as xfs, zfs or others) using the normal POSIX interface, and clients and servers can be added to an existing system without downtime.

BeeGFS supports multiple networks and dynamic failover in case one of the network connections goes down. BeeGFS client and server components can also run on the same physical machines. Thus, BeeGFS can turn a compute rack into a cost-efficient converged data processing and shared storage unit, eliminating the need for external storage resources and simplifying management.

Fault Tolerance

BeeGFS storage servers are typically used with an underlying RAID to transparently handle disk errors.

Using BeeGFS with shared storage is also possible, to handle server failures. The built-in BeeGFS Buddy Mirroring approach goes one step further by tolerating the loss of complete servers, including all data on their RAID volumes - and it does so with commodity servers and shared-nothing hardware.
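As a purely conceptual model (not BeeGFS code), the sketch below captures the idea behind buddy mirroring: storage targets are paired into buddy groups, writes are applied to both members, and reads fall back to the secondary when the primary server is lost:

    # Conceptual model of buddy mirroring (illustration only; real BeeGFS also
    # resynchronises a failed buddy when it comes back online).
    class Target:
        def __init__(self, name: str):
            self.name, self.online, self.blocks = name, True, {}

    class BuddyGroup:
        def __init__(self, primary: Target, secondary: Target):
            self.primary, self.secondary = primary, secondary

        def write(self, key: str, data: bytes) -> None:
            # A write goes to both buddies, so either one can serve it later.
            for t in (self.primary, self.secondary):
                if t.online:
                    t.blocks[key] = data

        def read(self, key: str) -> bytes:
            # Fall back to the secondary if the primary target's server is lost.
            t = self.primary if self.primary.online else self.secondary
            return t.blocks[key]

    group = BuddyGroup(Target("storage01"), Target("storage02"))
    group.write("chunk-0", b"payload")
    group.primary.online = False        # simulate losing a complete server
    print(group.read("chunk-0"))        # the data is still served from the buddy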

Part No.: BEEGFS
End of Life?: No
Architecture: Virtual File System - up to 1024 unique configurable virtual file systems per storage cluster; Error Detection - end-to-end data protection
Storage Support: Data Protection - distributed data protection (N+2 or N+4); Snapshots and Clones - up to 4096 snapshots; Access Control - user authentication, LDAP, extended ACLs
Network Support: Supported Protocols - POSIX, NFS, SMB, S3 via gateway; Tiering - S3-compatible cloud (public or private)
Virtualisation Support: Management Interface - GUI, CLI
ISV Certifications: Hardware Vendors - Dell, HPE, Lenovo, Penguin Computing, Supermicro; Certified Object Stores - Amazon S3, Cloudian, IBM COS, Quantum (ActiveScale), Scality, SwiftStack S3
