TRIUMF @ Service Challenge 3
This is the status of our SC 3 cluster as of March 2006.
- Hardware
- Software
- Networking
- Tape Transfer Status
- Tier 2 Status
- Networking Tests - June 2005
- To-Do List
Hardware
Servers
- 3 EMT64 dCache pool node systems, each with:
- 2 GB memory
- hardware raid - 3ware 9xxx SATA raid controller
- Seagate Barracuda 7200.8 drives in hardware raid 5 - 8 x 250 GB
- 1 dual Opteron 246 dCache headnode server with:
- 2 GB memory
- 3ware 9xxx SATA raid controller
- WD Caviar SE drives in hardware raid 0 - 2 x 250 GB
- a 4560-SLX IBM Tape Library (currently with 1 SDLT 320 tape drive)
- 1 EMT64 system used as an FTS Server with:
- 2 GB memory
- 3 SCSI 73 GB drives for the OS and for Oracle's needs.
- a 4560-SLX IBM Tape Library (currently with 1 SDLT 320 tape drive)
- 2 EMT64 systems used as LFC and VOBOX Servers with:
- 2 GB memory
- 2 SATA 160 GB drives in a Raid-1 configuration
Storage
- 5.5+ TB disk
- 8+ TB tape. Note that 2 additional SDLT-600 drives and the required tapes have been ordered; once installed, they will increase our tape storage capacity by a further 7.5 TB.
Software
- dCache/SRM - deployed in June 2005
- FTS / Oracle - deployed late July 2005
- LFC / VOBOX - deployed in October 2005
Networking
- 2 x 1 GigE
- CANARIE provides some graphs of traffic rates of the TRIUMF-CERN link.
- 10GigE was temporarily available for testing until June 18 2005
- Last-mile DWDM optics for TRIUMF/BCNET have finally been shipped to us; we will soon proceed with deploying the Foundry switch.
Tape Transfer Status
We used a home-brewed HSM tape interface for the tape transfer tests. Tape transfers ran from Mon, Jul 25 2005 at 17h30 until Fri, Jul 29 2005 at 03h00. Three tape drives on two hosts performed well; unfortunately the fastest tape library, on the third host, had various SCSI problems and could only be used productively for about two 6-hour periods. We archived 6977 files totaling 7.3 TB. At peak we achieved approximately 36 MB/s.
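For context, dCache hands tape traffic to an external "HSM copy" executable that the pool invokes for each store or restore request; our home-brewed interface was a script of roughly that shape. The sketch below is an illustration of the idea only, not the script we actually ran: the put/get verb-plus-PNFS-ID calling convention is an assumption, and the tape back-end is faked with a staging directory.

#!/usr/bin/env python
# Minimal sketch of an external HSM copy script for dCache (illustrative only).
# Assumed calling convention: hsmcp.py put|get <pnfsid> <local-path>
# A zero exit code signals success back to the pool.
import os
import shutil
import sys

TAPE_STAGING = "/hsm/staging"   # placeholder for the real SDLT-backed store


def put(pnfsid, path):
    # Archive a pool file to the tape-backed store under its PNFS ID.
    shutil.copy(path, os.path.join(TAPE_STAGING, pnfsid))
    return 0


def get(pnfsid, path):
    # Restore a previously archived file back into the pool.
    shutil.copy(os.path.join(TAPE_STAGING, pnfsid), path)
    return 0


def main(argv):
    if len(argv) < 3 or argv[0] not in ("put", "get"):
        sys.stderr.write("usage: hsmcp.py put|get <pnfsid> <local-path>\n")
        return 1
    verb, pnfsid, path = argv[0], argv[1], argv[2]
    return put(pnfsid, path) if verb == "put" else get(pnfsid, path)


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))

In the real interface the copy step drives the SDLT drives rather than a directory, and failures (such as the SCSI problems above) are reported back to dCache through the exit code.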
Tier 2 Status
The following Canadian sites participated in the Tier 1 - Tier 2 SC 3 transfers in 2005; note that they are NOT officially ATLAS Tier 2 sites. FTS was used for the transfers (an FTS job-submission sketch follows the list below), and dCache/SRM is deployed and currently working at:
- Simon Fraser University via WestGrid - example T1-T2 network activity between TRIUMF and SFU was recorded from Wed, Aug 3 to Tue, Aug 9 2005.
- University of Alberta - configured and active from Aug 16 2005.
- University of Toronto - configured and active from Aug 17 2005.
- University of Victoria - configured and active from Aug 17 2005.
Timeout hiccups (Aug 10-14) occurred during the integration of the last three sites. From Aug 17 to Aug 24 2005, all four Tier 2 sites were participating in T1-T2 network activity.
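Since FTS drove all of these T1-T2 copies, the sketch below shows, for reference, how a single transfer job could be submitted and polled by wrapping the gLite FTS command-line tools from Python. The endpoint URL and SURLs are placeholders, and the exact glite-transfer-* options should be verified against the installed client version.

#!/usr/bin/env python
# Sketch: submit one FTS transfer job and poll it until it finishes.
# The FTS endpoint and SURLs are placeholders; option names are hedged
# and should be checked with `glite-transfer-submit --help`.
import subprocess
import time

FTS_SERVICE = "https://fts.example.triumf.ca:8443/FileTransfer"      # placeholder
SRC = "srm://srm.example.triumf.ca/pnfs/triumf.ca/sc3/data/file001"  # placeholder
DST = "srm://srm.example.sfu.ca/pnfs/sfu.ca/sc3/data/file001"        # placeholder

ACTIVE_STATES = ("Submitted", "Pending", "Ready", "Active")


def submit(src, dst):
    # Returns the FTS job identifier printed by the submit command.
    out = subprocess.check_output(
        ["glite-transfer-submit", "-s", FTS_SERVICE, src, dst])
    return out.decode().strip()


def wait(job_id, poll_seconds=30):
    # Poll the job state until it leaves the active states.
    while True:
        state = subprocess.check_output(
            ["glite-transfer-status", "-s", FTS_SERVICE, job_id]).decode().strip()
        if state not in ACTIVE_STATES:
            return state
        time.sleep(poll_seconds)


if __name__ == "__main__":
    job = submit(SRC, DST)
    print("job %s finished in state %s" % (job, wait(job)))

A production job would typically carry many source/destination pairs and leave concurrency and retries to the FTS channel settings rather than to the submitting script.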
Networking Tests - June 2005
In June some networking tests were conducted between TRIUMF and
CERN. Following are some graphs of performance over 1 GigE and
over 10 GigE.
TRIUMF-CERN Network tests @ 1 GigE
- a combination of SRM- and gridftp-initiated transfers was used
- SRM did not fill the pipe on its own - we need to tune the SRM server.
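For reference, an SRM-initiated third-party copy of a single file comes down to a call like the one sketched below, and the tuning mentioned above is mostly about stream counts and buffer sizes on such copies. The SURLs are placeholders, and the option names passed to srmcp are assumptions that should be checked against the installed client.

#!/usr/bin/env python
# Sketch: one SRM-initiated copy between two SRM endpoints via srmcp.
# SURLs are placeholders; the tuning option names below are assumptions
# and should be verified with `srmcp -h` for the installed client.
import subprocess

SRC = "srm://srm.example.cern.ch/data/atlas/sc3/file001"         # placeholder
DST = "srm://srm.example.triumf.ca/pnfs/triumf.ca/sc3/file001"   # placeholder

# Assumed tuning knobs (stream count / TCP buffer); verify the exact names.
TUNING = ["-streams_num=10", "-buffer_size=1048576"]


def srm_copy(src, dst, extra_options=()):
    # Launch srmcp and fail loudly if the copy does not succeed.
    subprocess.check_call(["srmcp"] + list(extra_options) + [src, dst])


if __name__ == "__main__":
    srm_copy(SRC, DST, TUNING)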
TRIUMF-CERN Network tests @ 10 GigE
- we were able to sustain 275 MB/s with little tuning
- for these tests we used gridftp only
- a more detailed explanation of this graph is available in the linked PDF
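The gridftp-only runs mostly came down to choosing a sensible number of parallel streams and a TCP buffer sized for the long TRIUMF-CERN round-trip time. The sketch below shows that bandwidth-delay-product arithmetic and the resulting globus-url-copy invocation; the RTT and per-stream rate are illustrative example values rather than measurements from these tests, and the hosts and paths are placeholders.

#!/usr/bin/env python
# Sketch: size the TCP buffer from the bandwidth-delay product and launch
# a parallel-stream GridFTP transfer with globus-url-copy.
# The RTT and per-stream target rate are example values only; the source
# and destination URLs are placeholders.
import subprocess

RTT_SECONDS = 0.150               # assumed example TRIUMF<->CERN round-trip time
PER_STREAM_BYTES_PER_SEC = 30e6   # assumed per-stream target (~30 MB/s)
STREAMS = 8                       # number of parallel TCP streams

# Bandwidth-delay product per stream: buffer needed to keep one stream full.
tcp_buffer = int(RTT_SECONDS * PER_STREAM_BYTES_PER_SEC)

SRC = "gsiftp://source.example.org/data/sc3/file001"   # placeholder
DST = "gsiftp://dest.example.org/data/sc3/file001"     # placeholder

# -p sets the number of parallel streams, -tcp-bs the TCP buffer size in bytes.
subprocess.check_call([
    "globus-url-copy",
    "-p", str(STREAMS),
    "-tcp-bs", str(tcp_buffer),
    SRC, DST,
])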
To-Do List
- install 2 SDLT-600 drives for tape-based SC transfers in April 2006
- deploy Foundry Switch
- upgrade to LCG 2.7.0 baseline services software and merge TRIUMF-LCG2 with our Service Challenge cluster
- tune dCache parameters