Subscribe to the
Discussion List for up-to-the-minute notification about Data Oasis and
all aspects of the Triton Resource.
The Triton Resource Data Oasis is designed as an extremely large-scale storage system, with
two to four petabytes of total disk capacity when fully deployed. The system will provide
between 60 and 120 gigabytes per second of data movement bandwidth and manage
from 3000 to 6000 individual disks.
Data Oasis is a parallel file system (PFS). The Triton Resource has exclusive 10 GbE access to over 800
terabytes of raw disk space, which serves as temporary, high-bandwidth, very high-capacity
storage for jobs running on Triton's 256-node cluster.
Data Oasis has been online since September 2011 and continues to be enhanced and
upgraded to improve reliability and performance.
For a view of the test configuration, see these diagrams:
To see the performance characteristics, backup policies,
hardware, and data management software on Oasis and Triton's other storage facilities,
see the
Data Storage page. View the Backup Policy page for complete details.
Data Oasis is fully connected to both the Petascale
Data Analysis Facility and the Triton Compute Cluster,
providing exceptional data movement and data management
throughput to users on either system.
Data Oasis Phase 0 Performance Test Results
Latest test results: 9/27/2010
Common parameters:
File size: 2 TB
Stripe size: 1 MB
Number of OSSes: 7
Number of OSTs per OSS: 8
Total OSTs: 56
Nodes  Cores/Node  Total Cores  Max Write          Max Read
16     2           32           3706.88 MiB/sec    6449.85 MiB/sec
16     4           64           3646.38 MiB/sec    6790.21 MiB/sec
16     8           128          3460.32 MiB/sec    6639.50 MiB/sec
32     2           64           4080.55 MiB/sec*   7505.48 MiB/sec
32     4           128          3842.93 MiB/sec    7479.72 MiB/sec
32     8           256          3718.46 MiB/sec    6703.78 MiB/sec
64     2           128          3903.77 MiB/sec    7923.24 MiB/sec**
64     4           256          3686.14 MiB/sec    7075.21 MiB/sec
64     8           512          3590.96 MiB/sec    7663.74 MiB/sec
* Max write was with 64 cores on 32 nodes (2 cores/node): 4080.55 MiB/sec
** Max read was with 128 cores on 64 nodes (2 cores/node): 7923.24 MiB/sec
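The two footnotes can be checked directly against the table. The following is a minimal sketch that transcribes the Phase 0 rows and picks out the best write and read configurations; the names PHASE0 and best are illustrative and not part of the original test harness.

```python
# Phase 0 results transcribed from the table above:
# (nodes, cores_per_node, max_write_MiB_s, max_read_MiB_s)
PHASE0 = [
    (16, 2, 3706.88, 6449.85),
    (16, 4, 3646.38, 6790.21),
    (16, 8, 3460.32, 6639.50),
    (32, 2, 4080.55, 7505.48),
    (32, 4, 3842.93, 7479.72),
    (32, 8, 3718.46, 6703.78),
    (64, 2, 3903.77, 7923.24),
    (64, 4, 3686.14, 7075.21),
    (64, 8, 3590.96, 7663.74),
]


def best(rows, column):
    """Return the row with the largest value in the given column index."""
    return max(rows, key=lambda row: row[column])


best_write = best(PHASE0, 2)
best_read = best(PHASE0, 3)

for label, row, column in (("write", best_write, 2), ("read", best_read, 3)):
    nodes, per_node = row[0], row[1]
    print(f"max {label}: {row[column]:.2f} MiB/sec on {nodes} nodes "
          f"x {per_node} cores/node = {nodes * per_node} cores")
```

In both cases the best result comes from the 2 cores/node rows, which matches the footnotes.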
SET #1 : Single OSS, client sweep [1-16 nodes, 2-32 cores]
Common parameters:
api = POSIX
test filename = testFile
access = single-shared-file
ordering in a file = sequential offsets
ordering inter file = random task offsets >= 1, seed = 0
repetitions = 3
xfersize = 1 MiB
aggregate filesize = 576 GiB
Results:
#Clients (cores)  Max Write         Max Read
1 (2)             317.26 MiB/sec    489.98 MiB/sec
2 (4)             622.47 MiB/sec    860.69 MiB/sec
4 (8)             706.66 MiB/sec    1057.74 MiB/sec
6 (12)            679.54 MiB/sec    1097.75 MiB/sec
8 (16)            698.49 MiB/sec    1122.38 MiB/sec
12 (24)           686.72 MiB/sec    1130.55 MiB/sec
16 (32)           708.07 MiB/sec    1135.67 MiB/sec
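To make the shape of the Set #1 sweep easier to see, the sketch below expresses each row as a speedup over the single-client result. The names SET1 and speedup are illustrative; the figures are copied from the table above.

```python
# Set #1 results: clients -> (max_write, max_read) in MiB/sec
SET1 = {
    1: (317.26, 489.98),
    2: (622.47, 860.69),
    4: (706.66, 1057.74),
    6: (679.54, 1097.75),
    8: (698.49, 1122.38),
    12: (686.72, 1130.55),
    16: (708.07, 1135.67),
}


def speedup(results, column):
    """Throughput of each row divided by the single-client throughput."""
    base = results[1][column]
    return {clients: rates[column] / base for clients, rates in results.items()}


write_speedup = speedup(SET1, 0)
read_speedup = speedup(SET1, 1)

for clients in SET1:
    print(f"{clients:>2} clients: write x{write_speedup[clients]:.2f}, "
          f"read x{read_speedup[clients]:.2f}")
```

Write throughput roughly doubles from one client to two and then stays near 700 MiB/sec from four clients onward, which is consistent with the single OSS, rather than the clients, being the limit in this set.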
SET #2 : Multiple OSSs [1-8], Fixed number of clients [32 cores]
Common parameters:
clients = 32 (32 cores at 2 per node; 16 nodes)
xfersize = 1 MiB
blocksize = 18 GiB
aggregate filesize = 576 GiB
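The Set #2 parameters are related by simple arithmetic: 32 tasks each writing an 18 GiB block give the 576 GiB aggregate. A minimal sketch of that check follows, assuming each task writes exactly one contiguous block (an assumption, since the parameter list does not state a segment count); the variable names are illustrative.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

tasks = 32              # 16 nodes x 2 cores per node
blocksize = 18 * GIB    # per-task block, from the parameters above
xfersize = 1 * MIB      # size of each individual transfer

# Assumption: one block per task, so the aggregate is tasks x blocksize.
aggregate = tasks * blocksize
transfers_per_task = blocksize // xfersize

print(aggregate // GIB, "GiB aggregate")
print(transfers_per_task, "transfers per task")
```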
Results:
#OSS(#OSTs)  Max Write          Max Read
1(8)         741.31 MiB/sec     1136.24 MiB/sec
2(16)        1379.18 MiB/sec    2247.38 MiB/sec
3(24)        2116.57 MiB/sec    3373.73 MiB/sec
4(32)        2808.72 MiB/sec    4438.47 MiB/sec
5(40)        3326.50 MiB/sec    5475.37 MiB/sec
6(48)        4068.21 MiB/sec    6193.90 MiB/sec
7(56)        4206.00 MiB/sec    6508.45 MiB/sec
8(64)        4147.88 MiB/sec    6688.39 MiB/sec
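One way to read the Set #2 table is as scaling efficiency: the measured throughput divided by what perfect linear scaling from the single-OSS result would give. The sketch below computes that ratio; SET2 and efficiency are illustrative names, and the figures are copied from the table.

```python
# Set #2 results: number of OSSs -> (max_write, max_read) in MiB/sec
SET2 = {
    1: (741.31, 1136.24),
    2: (1379.18, 2247.38),
    3: (2116.57, 3373.73),
    4: (2808.72, 4438.47),
    5: (3326.50, 5475.37),
    6: (4068.21, 6193.90),
    7: (4206.00, 6508.45),
    8: (4147.88, 6688.39),
}


def efficiency(results, column):
    """Measured rate divided by (OSS count x single-OSS rate)."""
    base = results[1][column]
    return {n: rates[column] / (n * base) for n, rates in results.items()}


write_eff = efficiency(SET2, 0)
read_eff = efficiency(SET2, 1)

for n in SET2:
    print(f"{n} OSS: write {write_eff[n]:.0%}, read {read_eff[n]:.0%}")
```

With the client count fixed at 32 cores, efficiency falls as OSSs are added, and write throughput is lower at eight OSSs than at seven.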
SET #3 : Multiple OSSs [1-8], Varying number of clients [4*#OSS cores]
Common parameters:
xfersize = 1 MiB
blocksize = 18 GiB
tasks per client node = 2
aggregate filesize = 18 GiB * 2 tasks * #nodes
#nodes = 2 * #OSSs
Results:
#OSS(#OSTs)  #nodes(#cores)  Max Write          Max Read
1(8)         2(4)            673.23 MiB/sec     849.53 MiB/sec
2(16)        4(8)            1266.72 MiB/sec    1959.52 MiB/sec
3(24)        6(12)           1845.42 MiB/sec    3022.94 MiB/sec
4(32)        8(16)           2340.85 MiB/sec    4042.03 MiB/sec
5(40)        10(20)          2570.28 MiB/sec    1688.81 MiB/sec
6(48)        12(24)          3082.08 MiB/sec    5067.13 MiB/sec
7(56)        14(28)          3636.89 MiB/sec    6256.48 MiB/sec
8(64)        16(32)          4086.61 MiB/sec    6628.29 MiB/sec
[Note: With MPIIO we got 4132.14 MiB/sec (writes) and 6594.68 MiB/sec (reads)]
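The Set #3 configuration for each row follows from the common parameters: two client nodes per OSS, two tasks per node, and an 18 GiB block per task. The sketch below derives the node count, core count, and aggregate file size for each OSS count. The function name set3_config is hypothetical, and the figure of 8 OSTs per OSS is taken from the first column of the results table.

```python
BLOCK_GIB = 18        # blocksize per task
TASKS_PER_NODE = 2    # tasks per client node
OSTS_PER_OSS = 8      # from the #OSS(#OSTs) column of the results


def set3_config(oss_count):
    """Derive the Set #3 client layout and aggregate size for an OSS count."""
    nodes = 2 * oss_count
    return {
        "osts": OSTS_PER_OSS * oss_count,
        "nodes": nodes,
        "cores": TASKS_PER_NODE * nodes,
        "aggregate_gib": BLOCK_GIB * TASKS_PER_NODE * nodes,
    }


for oss in range(1, 9):
    cfg = set3_config(oss)
    print(f"{oss}({cfg['osts']})  {cfg['nodes']}({cfg['cores']})  "
          f"{cfg['aggregate_gib']} GiB")
```

At eight OSSs this gives 16 nodes, 32 cores, and 576 GiB, the same client configuration and aggregate size used in Set #2.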
SET #4 : Fixed number of OSSs (8), OSTs (64), clients (32 cores), and varying stripe size
Common parameters:
clients = 32 (2 per node)
blocksize = 18 GiB
aggregate filesize = 576 GiB
Results:
Stripe Size  Max Write          Max Read
1M           4147.88 MiB/sec    6688.39 MiB/sec
2M           4044.22 MiB/sec    6138.93 MiB/sec
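The effect of the larger stripe size in Set #4 can be stated as a relative change. A minimal sketch, using the two rows above (SET4 and percent_change are illustrative names):

```python
# Set #4 results: stripe size -> (max_write, max_read) in MiB/sec
SET4 = {
    "1M": (4147.88, 6688.39),
    "2M": (4044.22, 6138.93),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


write_change = percent_change(SET4["1M"][0], SET4["2M"][0])
read_change = percent_change(SET4["1M"][1], SET4["2M"][1])

print(f"write: {write_change:+.1f}%  read: {read_change:+.1f}%")
```

Both rates are lower with the 2M stripe size in this configuration, with reads affected more than writes.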
Early Lustre Performance Test Results
For comparison, we tested the Mirage implementation of Lustre
and derived the measurements below, using one file
with only one task per node, an eight-megabyte stripe
size, and a (full) stripe count of 96.