XSEDE GridFTP Installation Guide

Background Information

Supported Platforms

The entirety of the configuration described in this document can be applied to any RPM based platform supported by the Globus Toolkit release 6.0.x. This includes CentOS, RedHat, and SLES11 platforms used by XSEDE resources.

Port Usage Overview

Alternate ports are used for alternate/testing GridFTP servers.

Firewall configuration is discussed in the configuration section below.

Server control port

A GridFTP client connects to the server's control port, authenticates, provides credentials, and issues the file-transfer request. With third-party transfers, the client connects to the control ports of both the source and destination servers, authenticates, provides credentials, and tells the source server to transfer directly to the destination server.

IPC listener

Used by the GridFTP server to contact its local data movers when a striped transfer is requested. All connections are initiated by the GridFTP server and local to a cluster.

Data channel listener

Used by the data movers for all data transfers. Connections are always initiated by the data source. Ports are selected from the ephemeral/dynamic port range configured using the GLOBUS_TCP_PORT_RANGE environment variable. By default GridFTP will use the lowest available port in that range.
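As a rough illustration of how the port range value is structured, the sketch below splits a "min,max" string and counts the ports it covers. The helper name port_range_size is hypothetical and not part of the Globus Toolkit; the range 50000,51000 is the one used in the xinetd examples later in this guide.

```shell
# Sketch: count how many ports a GLOBUS_TCP_PORT_RANGE value makes available.
# port_range_size is a hypothetical helper, not a Globus Toolkit command.
port_range_size() {
    range=$1
    min=${range%%,*}   # text before the comma
    max=${range##*,}   # text after the comma
    echo $(( max - min + 1 ))
}

GLOBUS_TCP_PORT_RANGE=50000,51000
export GLOBUS_TCP_PORT_RANGE
port_range_size "$GLOBUS_TCP_PORT_RANGE"   # prints 1001
```

A count like this can help when sizing the matching firewall rule, since each concurrent data channel needs its own port from the range.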

Installing

Trusting the XSEDE Repo

The XSEDE Repository provides source and binary RPM packages for XSEDE platforms (RHEL, CentOS, and SLES). Some XSEDE packages (including the XSEDE distribution of GridFTP) depend on packages contained in the Globus repository. Thus, to install the XSEDE distribution of GridFTP, you must first tell your machine to trust both an XSEDE repository and a Globus repository. This is done by installing the configuration RPM for Globus from http://toolkit.globus.org/ftppub/gt6/installers/repo/globus-toolkit-repo-latest.noarch.rpm and the appropriate configuration RPM for XSEDE from http://software.xsede.org/production/repo/repos/ . Once you have downloaded the appropriate configuration RPMs from the links above, install them with:

   # rpm -i XSEDE-Production-config..noarch.rpm
   # rpm -i globus-toolkit-repo-latest.noarch.rpm
or
   # zypper install ./XSEDE-Production-config..noarch.rpm
   # zypper install ./globus-toolkit-repo-latest.noarch.rpm

You should get a warning that looks like:

warning: XSEDE-Production-config.centos-5-1.noarch.rpm: Header V3 DSA signature: NOKEY, key ID 20423dbb

This is a gpg trust bootstrapping issue because until you install the above RPMs, RPM doesn't know which gpg key(s) to trust. The above RPMs install the PGP keys that are needed, but one has to run these commands for RPM to formally recognize them:

   # rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-Globus
   # rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-XSEDE-Production
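To confirm that the import worked, one option is to pull the key ID out of the NOKEY warning and ask rpm whether a matching gpg-pubkey entry exists. The sketch below assumes the warning format shown above; key_id_from_warning is a hypothetical helper name, and the rpm query only runs where rpm is installed.

```shell
# Sketch: extract the key ID from an rpm NOKEY warning, then check whether
# rpm already knows that key. key_id_from_warning is a hypothetical helper.
key_id_from_warning() {
    # Reads warning text on stdin, prints the 8-hex-digit key ID if present.
    sed -n 's/.*key ID \([0-9A-Fa-f]\{8\}\).*/\1/p'
}

warning='warning: XSEDE-Production-config.centos-5-1.noarch.rpm: Header V3 DSA signature: NOKEY, key ID 20423dbb'
key_id=$(printf '%s\n' "$warning" | key_id_from_warning)
echo "$key_id"   # prints 20423dbb

# Imported keys appear in the rpm database as gpg-pubkey-<keyid> entries.
if command -v rpm >/dev/null 2>&1; then
    rpm -q "gpg-pubkey-$key_id" || echo "key $key_id has not been imported yet"
fi
```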

NOTE: If you need an SD&I Development version, you will need to configure your system to use the XSEDE-Development repository by following this procedure https://software.xsede.org/development/repo/repoconfig.txt

Installing GridFTP RPMs

There is one prerequisite for GridFTP that is contained in neither the Globus Toolkit repository nor the XSEDE software repository: the UDT library. It can be installed with:

   # yum install --enablerepo=epel udt

On RedHat based platforms, the command to install the latest GridFTP server and client from the repository configured above is:

   # yum install --disablerepo=epel globus-gridftp-xsede

Note: If you do not have the EPEL repository configured on your system, you can leave off the "--disablerepo=epel" option.

On SLES platforms, the proper command to install the latest GridFTP server from the configured repository is:

   # zypper install globus-gridftp-xsede

Updating GridFTP RPMs

If you have already installed the GridFTP metapackage but wish to update to the most recent release, the command is exactly the same as for installing. yum will list the packages that will be updated as a result and ask whether you wish to install them. Select "y" at the prompt.

Installing a second instance into an alternate (testing) location

Alternate location installs are described below in the Testing section.

Installing on Solaris

There are no binaries provided for the Globus Toolkit version 6 for Solaris. You should install using the source installer as documented here

Configuring

With the new RPM packaging in GT 6.0.x, it is now recommended that the software be installed locally on each machine that runs it. The GridFTP server software only needs to be available on GridFTP server machines. Users will use separately deployed client software and do not need access to this software. No Modules keys need to be defined.

Configuration File Location

The configuration file for the GridFTP server is read from the following locations, in the given order. Only the first file found will be loaded:

  1. Path specified with the -c <configfile> command line option. (Default for this is /etc/gridftp.conf, when starting from /etc/init.d/globus-gridftp-server)
  2. $GLOBUS_LOCATION/etc/gridftp.conf
  3. /etc/grid-security/gridftp.conf

XSEDE GridFTP installs should specify their configuration files using the -c <configfile> command line option, as this eliminates any potential confusion.
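The lookup order above can be mirrored in a few lines of shell when you want to see which file a given setup would pick. This is only a sketch of the documented order: first_existing and gridftp_conf_path are hypothetical helper names, and the sketch says nothing about how the server itself reacts when a file named with -c is missing.

```shell
# Sketch: report the first configuration file that exists, in the order
# described above. Both function names are hypothetical helpers.
first_existing() {
    for f in "$@"; do
        if [ -n "$f" ] && [ -f "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

gridftp_conf_path() {
    # $1 = path that would be passed with -c (may be empty)
    first_existing "$1" \
        "${GLOBUS_LOCATION:+$GLOBUS_LOCATION/etc/gridftp.conf}" \
        /etc/grid-security/gridftp.conf
}

gridftp_conf_path /etc/gridftp.conf || echo "no configuration file found"
```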

Options are one per line, with the format:

 <option> <value>

If the value contains spaces, it should be enclosed in double quotes ("). Flags or boolean options should only have a value of 0 or 1. Blank lines and lines beginning with # are ignored.

Example of recommended configuration file contents for a front end (/etc/gridftp.conf):

inetd 1
log_single /var/log/gridftp.log
control_preauth_timeout 120
usage_stats_target usage-stats.globus.org:4810,globus-usage.xsede.org:4812!all
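The format rules above (one option per line, optional double quotes, comments and blank lines ignored) are simple enough to read back with awk. The sketch below is an illustrative reader, not the server's own parser; conf_get is a hypothetical helper name, and the sample file is a shortened copy of the example above.

```shell
# Sketch: read one option from a gridftp.conf-style file.
# conf_get is a hypothetical helper, not a Globus Toolkit command.
conf_get() {
    # $1 = configuration file, $2 = option name
    awk -v opt="$2" '
        /^[ \t]*(#|$)/ { next }                   # skip comments and blank lines
        $1 == opt {
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")       # drop the option name
            sub(/[ \t]+$/, "")                    # drop trailing whitespace
            if ($0 ~ /^".*"$/)                    # strip surrounding double quotes
                $0 = substr($0, 2, length($0) - 2)
            print
            exit
        }' "$1"
}

conf=$(mktemp)
cat > "$conf" <<'EOF'
# sample front-end configuration
inetd 1
log_single /var/log/gridftp.log
control_preauth_timeout 120
EOF

conf_get "$conf" log_single                # prints /var/log/gridftp.log
conf_get "$conf" control_preauth_timeout   # prints 120
rm -f "$conf"
```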



Server DNS configuration

To balance transfers between all the data movers we recommend that all XSEDE sites configure their GridFTP servers as follows:

The net effect of this configuration will be that GridFTP transfers will be sent in round-robin order to one of your GridFTP servers.

Usage Stats Configuration

Certain usage statistics are sent to both the globus.org usage stats collector and the XSEDE usage stats collector. In 6.0.x, usage stats can additionally include: file, client IP, Data IP, user account, user dn, config ID, Session ID. In the XSEDE recommended configuration (specifically, the /etc/gridftp.conf file shown above), generic/default usage stats are sent to the globus collector, and all available usage stats are sent to the XSEDE collector.
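To make the structure of the usage_stats_target value easier to see, the sketch below splits it into one collector per line and shows whether a collector carries a "!" suffix. The helper name list_usage_targets is hypothetical, and the label "default" is only this sketch's wording for a target with no suffix.

```shell
# Sketch: split a usage_stats_target value into collectors and their suffix.
# list_usage_targets is a hypothetical helper used only for illustration.
list_usage_targets() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r target; do
        case $target in
            *'!'*) echo "${target%%!*} ${target#*!}" ;;   # host:port plus suffix
            *)     echo "$target default" ;;              # no suffix given
        esac
    done
}

list_usage_targets 'usage-stats.globus.org:4810,globus-usage.xsede.org:4812!all'
# prints:
#   usage-stats.globus.org:4810 default
#   globus-usage.xsede.org:4812 all
```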

Firewall Configuration

So that XSEDE users can transfer data between XSEDE and non-XSEDE gridftp servers, XSEDE gridftp servers should have the following in-bound ports open through site firewalls. Outbound connections shouldn't be restricted.

Configure /etc/services

Configure /etc/services on all the machines that will run GridFTP control or data channel services:

gsiftp          2811/tcp        # GSI FTP
gsiftp          2811/udp        # GSI FTP
gsiftpdata      4811/tcp        # GSI striped FTP data node
gsiftpdata      4811/udp        # GSI striped FTP data node
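A quick way to check a services file for the four entries above is sketched below. The helper name services_missing is hypothetical; it prints every expected entry that is absent, so no output means all four are present. The sample file stands in for /etc/services so the sketch can be tried without touching the real file.

```shell
# Sketch: list which of the expected GridFTP entries are missing from a
# services file. services_missing is a hypothetical helper name.
services_missing() {
    # $1 = services file to inspect
    for entry in "gsiftp 2811/tcp" "gsiftp 2811/udp" \
                 "gsiftpdata 4811/tcp" "gsiftpdata 4811/udp"; do
        name=${entry% *}
        port=${entry#* }
        awk -v n="$name" -v p="$port" \
            '$1 == n && $2 == p { found = 1 } END { exit !found }' "$1" \
            || echo "$entry"
    done
}

sample=$(mktemp)
printf 'gsiftp 2811/tcp # GSI FTP\ngsiftp 2811/udp # GSI FTP\n' > "$sample"
services_missing "$sample"
# prints:
#   gsiftpdata 4811/tcp
#   gsiftpdata 4811/udp
rm -f "$sample"
```

On a real host the call would be services_missing /etc/services.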

Make sure each GridFTP control server has /etc/grid-security/ with both an XSEDE grid-mapfile and XSEDE accepted host certificates.

xinetd configuration

The banner_fail line in the gridftp xinetd config below will make xinetd print a message when it fails or refuses to start the server for any reason, rather than just immediately closing the connection. This will replace the vague "end of file" error message.

First, create the message file:

   # echo -e "421 The GridFTP Service is unable to accept this connection.  Please try again later.\r" >/etc/gridftp.full.msg

For xinetd, create the following file: /etc/xinetd.d/gsiftp

Note: although a value of instances=50 is shown in the configurations below, a limit in the 20 to 50 range is recommended. RPs are responsible for determining whether their configured limit is reasonable for their particular hardware, and for fine-tuning it as necessary. The per_source limit controls how many instances are supported from a given source. Because all transfers initiated using Globus Online appear to come from the same source, it is recommended that the per_source limit be set to the same value as the instances limit. The cps limit takes two values: the number of connections allowed per second, and the number of seconds to wait before re-enabling the service if that rate is exceeded.

service gsiftp
{
instances               = 50
per_source              = 50
cps                     = 30 10
socket_type             = stream
wait                    = no
user                    = root
banner_fail             = /etc/gridftp.full.msg
env                     += GLOBUS_TCP_PORT_RANGE=50000,51000
server                  = /usr/sbin/globus-gridftp-server
server_args             = -c /etc/gridftp.conf -i 
log_on_success          += DURATION
nice                    = 10
disable                 = no
}
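The recommendation that per_source match instances can be checked mechanically. The sketch below reads attributes from an xinetd service file and compares the two limits; xinetd_get and check_limits are hypothetical helper names, and the sample file is a shortened copy of the service block above.

```shell
# Sketch: verify that per_source equals instances in an xinetd service file.
# xinetd_get and check_limits are hypothetical helpers for illustration.
xinetd_get() {
    # $1 = service file, $2 = attribute name; prints what follows = or +=
    awk -v key="$2" '$1 == key && ($2 == "=" || $2 == "+=") {
        $1 = ""; $2 = ""
        sub(/^ +/, "")
        print
        exit
    }' "$1"
}

check_limits() {
    inst=$(xinetd_get "$1" instances)
    per=$(xinetd_get "$1" per_source)
    if [ "$inst" = "$per" ]; then
        echo "ok: per_source matches instances ($inst)"
    else
        echo "warning: instances=$inst but per_source=$per"
    fi
}

svc=$(mktemp)
cat > "$svc" <<'EOF'
service gsiftp
{
instances               = 50
per_source              = 50
cps                     = 30 10
}
EOF
check_limits "$svc"   # prints: ok: per_source matches instances (50)
rm -f "$svc"
```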

For xinetd data channel servers, create the following file: /etc/xinetd.d/gsiftpdata

service gsiftpdata
{
instances               = 50
per_source              = 50
cps                     = 30 10
socket_type             = stream
wait                    = no
user                    = root
banner_fail             = /etc/gridftp.full.msg
env                     += GLOBUS_TCP_PORT_RANGE=50000,51000
server                  = /usr/sbin/globus-gridftp-server
server_args             = -i -dn -c /etc/gridftpdata.conf
log_on_success          += DURATION
nice                    = 10
disable                 = no
}

Operating

Starting and Stopping a server

It is recommended that the server be run out of xinetd. To do so, ensure you have an /etc/xinetd.d/gsiftp file as above (notably, with disable=no). You can check that the service is enabled in xinetd by running: chkconfig --list

Maintenance Mode

The GridFTP server can be put into maintenance mode, which disables connections while informing users. Maintenance mode is configured through the config dir option:

  1. create the dir /etc/gridftp.d
  2. xinetd: add -C /etc/gridftp.d to server_args in the service file
  3. create file named /etc/gridftp.d/nologon with the below options:
    connections_disabled 1
    offline_msg "CONNECTIONS ARE DISABLED"
    

In maintenance mode, the server will stay up and respond to users (with the offline_msg), but not allow authentication.
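Once the config dir is in place, switching maintenance mode on and off comes down to creating and removing the nologon file. The sketch below wraps those two steps; maintenance_on and maintenance_off are hypothetical helper names, and the demonstration uses a temporary directory instead of /etc/gridftp.d so it can be tried without root access.

```shell
# Sketch: toggle maintenance mode by writing or removing the nologon file.
# maintenance_on and maintenance_off are hypothetical helper names.
maintenance_on() {
    dir=${1:-/etc/gridftp.d}
    msg=${2:-CONNECTIONS ARE DISABLED}
    mkdir -p "$dir" &&
    printf 'connections_disabled 1\noffline_msg "%s"\n' "$msg" > "$dir/nologon"
}

maintenance_off() {
    rm -f "${1:-/etc/gridftp.d}/nologon"
}

demo=$(mktemp -d)
maintenance_on "$demo"
cat "$demo/nologon"
# prints:
#   connections_disabled 1
#   offline_msg "CONNECTIONS ARE DISABLED"
maintenance_off "$demo"
rmdir "$demo"
```

Because xinetd starts a new server process for each connection, the change would be expected to apply to new connections without restarting xinetd; treat that as something to verify on your own system.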

Updating

How to update to the new/latest packages

To update to the latest packages from the globus repo, one can simply do a "yum update" as root, and the Globus Toolkit packages will report any updates in the same manner as any other software package on your system.

How to NOT update to the new/latest packages

If you do not want to update from the released version (6.0.x), comment out the Globus Updates repository from your /etc/yum.repos.d (or /etc/zypp/repos.d on SLES11) configuration.

Testing

Simple tests to check that your GridFTP server is operating

This quickstart walks through testing GridFTP services.

Run the following commands:

$ myproxy-logon -s myproxy.xsede.org
$ globus-url-copy gsiftp://<your_gridftp_server>/etc/group file:///tmp/`whoami`.$$

Success looks like:

[testuser@xsedeResource] myproxy-logon -s myproxy.xsede.org
Enter MyProxy pass phrase:
A credential has been received for user XXXXX in /tmp/x509up_u501.
[testuser@xsedeResource] globus-url-copy gsiftp://<your_gridftp_server>/etc/group file:///tmp/`whoami`.$$
[testuser@xsedeResource] diff /tmp/`whoami`.* /etc/group
[testuser@xsedeResource] rm /tmp/testuser.892
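The copy, compare, and clean-up steps above can be wrapped in one function. This is a sketch only: roundtrip_check and the GUC variable are hypothetical names (GUC exists so the copy command can be swapped out for a dry run), and the comparison is only meaningful when the remote path has the same content as the local one, for example when the test is run on the GridFTP server itself.

```shell
# Sketch: fetch a file through GridFTP and compare it with the local copy of
# the same path. roundtrip_check and GUC are hypothetical names.
roundtrip_check() {
    server=$1                 # GridFTP server host name
    path=$2                   # absolute path with identical content on both ends
    tmp=/tmp/$(whoami).$$
    "${GUC:-globus-url-copy}" "gsiftp://$server$path" "file://$tmp" || return 1
    if cmp -s "$tmp" "$path"; then
        rc=0
    else
        rc=1
    fi
    rm -f "$tmp"
    return $rc
}

# Example (needs a valid proxy credential and a reachable server):
#   roundtrip_check <your_gridftp_server> /etc/group && echo "transfer verified"
```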

Now try doing a client/server transfer (one of your URLs has file: in it):

globus-url-copy -vb -dbg gsiftp://<your_gridftp_server>/dev/zero file:///dev/null

This will run until you control-c the transfer. If that works, reverse the direction:

globus-url-copy -vb -dbg file:///dev/zero gsiftp://<your_gridftp_server>/dev/null

Again, this will run until you control-c the transfer.

If you have another server set up on another machine, try doing a third party transfer by running this command:

globus-url-copy -vb -dbg gsiftp://<your_gridftp_server>/dev/zero gsiftp://<another_gridftp_server>/dev/null

Again, this will run until you control-c the transfer.

Running a test server on a production host

Installing to alternate locations

There are times when it may be necessary to run a second instance or an alternate install of a different version of the GridFTP server, and times when it is inconvenient to install from RPM (for example, shared network installations). To make this easy, we provide binary tarballs created directly from the binaries contained in the Globus Toolkit v6 RPMs. They are available in the directory tree here: http://software.xsede.org/production/globus-gridftp-xsede/globus-gridftp-xsede-6.0-1/binary-tgz/ (Found here for development versions).

Simply choose the appropriate tarball (the "el" directory is used for both CentOS and RedHat Enterprise Linux), and untar it in an appropriate location (such as /soft/local/globus_gridftp_xsede-6.0-1/ ). To use such a binary installation, you will need to modify the Modules file to set some additional environment variables.

Replace /path/to/installation with the actual path to your installation:

Note: i386 architectures should use "lib" below, x86_64 should use "lib64"

$ cat << EOF > globus-gridftp-xsede.module
#%Module1.0####################################################################

proc ModulesHelp { } {
global _module_name
puts stderr "The \$_module_name modulefile defines the default system paths and"
puts stderr "environment variables needed to use the \$_module_name libraries and tools."
puts stderr ""
}

set _module_name        [module-info name]
module-whatis "globus gridftp xsede 6.0-1 "
prepend-path         GLOBUS_LOCATION /path/to/installation
setenv          GLOBUS_HOSTNAME `/path/to/installation/usr/bin/globus-hostname`
setenv          GLOBUS_PATH     /path/to/installation
prepend-path    LD_LIBRARY_PATH        /path/to/installation/usr/[lib|lib64]
prepend-path    LIBPATH                /path/to/installation/usr/[lib|lib64]
prepend-path    SHLIB_PATH             /path/to/installation/usr/[lib|lib64]
prepend-path    MANPATH                /path/to/installation/usr/share/man
prepend-path    PATH                   /path/to/installation/bin:/path/to/installation/usr/sbin
setenv   GLOBUS_TCP_PORT_RANGE  50000,51000
setenv  RSHCOMMAND             /usr/bin/ssh
setenv  MYPROXY_SERVER         myproxy.xsede.org
EOF

It is important to note: When you establish an alternate RPM root, the packages don't know that they're in a special place so one must set environment variables in the xinetd.d configuration files (or however you are invoking the server). Particularly, LD_LIBRARY_PATH must be set. Take special note that if you are installing a 64 bit version, LD_LIBRARY_PATH must include the $ALT_RPM_ROOT/usr/lib64 directory.

You want to set the equivalent of:

export ALT_RPM_ROOT=/path/to/alternateroot
export LD_LIBRARY_PATH=$ALT_RPM_ROOT/usr/lib:$ALT_RPM_ROOT/usr/lib64
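When the server is started from xinetd, those variables have to appear on an env line in the service file rather than in a shell profile. The sketch below prints such a line for a given alternate root, following the same "env +=" form used for GLOBUS_TCP_PORT_RANGE in the xinetd examples earlier; alt_root_env_line is a hypothetical helper name.

```shell
# Sketch: print an xinetd "env" line that sets LD_LIBRARY_PATH for an
# alternate RPM root. alt_root_env_line is a hypothetical helper name.
alt_root_env_line() {
    root=$1
    printf 'env                     += LD_LIBRARY_PATH=%s/usr/lib:%s/usr/lib64\n' \
        "$root" "$root"
}

ALT_RPM_ROOT=/path/to/alternateroot
alt_root_env_line "$ALT_RPM_ROOT"
# prints:
#   env                     += LD_LIBRARY_PATH=/path/to/alternateroot/usr/lib:/path/to/alternateroot/usr/lib64
```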

The next steps are:

Recommended INCA operations tests

Registering in Information Services

Your new GridFTP server should be registered using XSEDE's pub/sub service, IPF, as documented in:

Debugging/Troubleshooting

For debugging information, please see the Globus GridFTP debugging documentation.

For troubleshooting information please see the Globus GridFTP Troubleshooting documentation.

Usage Stats Reports

There are automated reports for GridFTP and GRAM usage that are generated from the usage DB. This is informational for SPs, but if you are interested in what is done with the usage stats, see: http://globus-usage.xsede.org/graphs/reports/