====== The Forge ======

===== System Information =====

==== Software ====
The Forge was built with Rocks 6.1.1, which is built on top of CentOS 6.8. With the Forge we made the conversion from PBS/Maui to SLURM as our scheduler and resource manager.

==== Hardware ====

=== Management nodes ===
The head node and login node are Dell commodity servers configured as follows.

Dell R430: dual 8-core Haswell CPUs with 64GB of DDR4 RAM and dual 1TB 7.2K RPM drives in RAID 1.

=== Compute nodes ===
The newly added compute nodes are all SuperMicro TP-2028 super server chassis configured as follows.

SuperMicro TP-2028: 4-node chassis, with each node containing dual 16-core Haswell CPUs, 256GB of DDR4 RAM, and 6 300GB 15K SAS drives in RAID 0.

=== Storage ===

== General Policy Notes ==
None of the cluster attached storage available to users is backed up in any way by us. This means that if you delete something and don't have a copy somewhere else, **it is gone**. Please note that data stored on cluster attached storage is limited to Data Class 1 and 2 as defined by [[ https://www.umsystem.edu/ums/is/infosec/classification-definitions | UM System Data Classifications]]. If you need to store things in DCL3 or DCL4, please contact us so we may find a solution for you.

== Home Directories ==
The Forge home directory storage is served from an NFS share, meaning your home directory is the same across the entire cluster. The volume provides 14TB of raw storage, of which 13TB is available to users, limited to 50GB per user; the quota can be expanded upon request with proof of need. **This volume is not backed up, and we do not provide any data recovery guarantee in the event of a storage system failure.** System failures where data loss occurs are rare, but they do happen. All this to say, you **should not** be storing the only copy of your critical data on this system.

== Scratch Directories ==
Each user gets a scratch directory created for them at /mnt/stor/scratch/$USER; an alias of `cdsc` has also been created so users can cd directly to this location. As with all cluster storage, scratch space is not backed up, and to underline the impermanence of data in this location, it is actively cleaned to keep the storage from filling up. The intent is for your programs to create temporary files here which you may need to keep for only a short time after the calculation completes. This volume is a high speed, network attached scratch space. There are currently no quotas placed on directories in this scratch space; however, if the 20TB volume filling up becomes a problem, we will have to implement quotas.

Along with the networked scratch space, there is also local scratch on each compute node in /tmp for use during calculations. There is no quota placed on this space, and it is cleaned regularly as well, but files stored there are only available to processes executing on the node on which they were created. In other words, if you create a file in /tmp inside a job, you won't be able to see it on the login node, and other processes won't be able to see it if they are on a different node than the process which created the file.

== Leased Space ==
If home directory and scratch space availability aren't enough for your storage needs, we also lease out quantities of cluster attached space. If you are interested in leasing storage please contact us. If you are already leasing storage but need a reference guide on how to manage it, please go [[ pub:forge:storage | here]].
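Since none of this storage is backed up and home directories are limited to 50GB, it is worth keeping an eye on your usage. Below is a minimal sketch using standard shell tools and the `cdsc` alias described above; the directory names are only placeholders for your own data.

  # see how much of your 50GB home quota you are using
  du -sh ~
  # jump to your scratch directory and check what is accumulating there
  cdsc
  du -sh ./*
  # copy anything you still need out of scratch before it is cleaned
  cp -r important_results ~/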
==== Policies ====
**__Under no circumstances should your code be running on the login node.__**

You are allowed to install software in your home directory for your own use. Know that you will **not** be given root/sudo access, so if your software requires it you will not be able to use that software. Contact ITRSS about having the software installed cluster-wide.

User data on the Forge is **not backed up**, meaning it is your responsibility to back up important research data to an off site location via any of the methods in the [[#moving_data]] section.

If you are a student, your jobs can run on any compute node in the cluster, even the ones dedicated to researchers; however, if the researcher who has priority access to a dedicated node submits work to it, your job will stop and go back into the queue. You may prevent this preemption by specifying in your job file that your job run only on the general nodes; please see the documentation on how to submit this request.

If you are a researcher who has purchased priority access to an allocation of nodes for 3 years, your jobs will primarily run on those nodes. If your job requires more resources than you have available in your priority queue, it will run in the general queue at the same priority as student jobs. You may add users to your allocation so that your research group may also benefit from the priority access.

If you are a researcher who has purchased an allocation of CPU hours, you will run on the general nodes at the same priority as the students. Your job will not run on any dedicated nodes and will not be susceptible to preemption by any other user. Once your job starts it will run until it fails, completes, or runs out of allocated time.

{{ :pub:hexagon_resources_diagram_1_.png?900 }}
\\
==== Partitions ====
The hardware in the Forge is split up into separate groups, or partitions. Some hardware is in more than one partition. For the most part you won't have to define a partition in a job submission file, as we have a plugin that will select the best partition for your user for the job to run in. However, there are a few cases in which you will want to assign a job to a specific partition. Please see the table below for a list of the limits or default values given to jobs based on the partition. The important thing to note is how long you can request your job to run.

| Partition | Time Limit | Default Memory per CPU |
| short | 4 hours | 1000MB |
| requeue | 7 days | 1000MB |
| free | 14 days | 1000MB |
| any priority partition | 30 days | varies by hardware |

===== Quick Start =====

==== Logging in ====

=== SSH (Linux) ===
Open a terminal and type

  ssh username@forge.mst.edu

replacing username with your campus SSO username, then enter your SSO password. Logging in places you onto the login node. __Under no circumstances should you run your code on the login node.__ If you are submitting a batch file, your job will be redirected to a compute node to be computed. However, if you are attempting to use a GUI, ensure that you __do not run your session on the login node__ (example: username@login-44-0). Use an interactive session to be directed to a compute node to run your software.

  sinteractive

For a further description of sinteractive, read the section of this documentation titled [[https://wiki.mst.edu/itrst/pub/forge#interactive_jobs |Interactive Jobs]].

=== Putty (Windows) ===
Open Putty and connect to forge.mst.edu using your campus SSO.
{{:pub:putty.png?400 |}}

=== Off Campus Logins ===
Our off campus logins go through a different host, forge-remote.mst.edu, and use public key authentication only; password authentication is disabled for off campus users. To learn how to connect from off campus please see our how-to on [[ publickeysetup | setting up public key authentication ]].

==== Submitting a job ====
Using SLURM is very similar to using PBS/Maui to submit jobs: you create a submission script to execute on the backend nodes, then use a command line utility to submit the script to the resource manager. See the file contents of a general submission script, complete with comments, below.

== Example Job Script ==

  #!/bin/bash
  #SBATCH --job-name=Change_ME
  #SBATCH --ntasks=1
  #SBATCH --time=0-00:10:00
  #SBATCH --mail-type=begin,end,fail,requeue
  #SBATCH --export=all
  #SBATCH --out=Forge-%j.out
  # %j will substitute to the job's id
  #now run your executables just like you would in a shell script. Slurm will set the working directory as the directory the job was submitted from.
  #e.g. if you submitted from /home/blspcy/softwaretesting your job would run in that directory.
  #(executables) (options) (parameters)
  echo "this is a general submission script"
  echo "I've submitted my first batch job successfully"

Now you need to submit that batch file to the scheduler so that it will run when it is time.

  sbatch batch.sub

You will see output from sbatch after the job submission that gives you the job number. If you would like to monitor the status of your jobs you may do so with the [[forge#monitoring_your_jobs|squeue]] command.

== Common SBATCH Directives ==
| **Directive** | **Valid Values** | **Description** |
| --job-name= | string value, no spaces | Sets the job name to something more friendly, useful when examining the queue. |
| --ntasks= | integer value | Sets the requested CPUs for the job. |
| --nodes= | integer value | Sets the number of nodes you wish to use, useful if you want all your tasks to land on one node. |
| --time= | D-HH:MM:SS, HH:MM:SS | Sets the allowed run time for the job; accepted formats are listed in the valid values column. |
| --mail-type= | begin,end,fail,requeue | Sets when you would like the scheduler to notify you about a job running. By default no email is sent. |
| --mail-user= | email address | Sets the mailto address for this job. |
| --export= | ALL, or specific variable names | By default Slurm exports the current environment variables, so all loaded modules will be passed to the environment of the job. |
| --mem= | integer value | Number in MB of memory you would like the job to have access to. Each queue has default memory per CPU values set, so unless your executable runs out of memory you will likely not need to use this directive. |
| --mem-per-cpu= | integer | Number in MB of memory you want per CPU; default values vary by queue but are typically greater than 1000MB. |
| --nice= | integer | Allows you to lower a job's priority if you would like other jobs set to a higher priority in the queue; the higher the nice number, the lower the priority. |
| --constraint= | please see the sbatch man page for usage | Used only if you want to constrain your job to run on resources with specific features; please see the next table for a list of valid features to request constraints on. |
| --gres= | name:count | Allows the user to reserve additional resources on the node, specifically for our cluster GPUs, e.g. --gres=gpu:2 will reserve 2 GPUs on a GPU enabled node. |
| -p | partition_name | Not typically used; if not defined, jobs get routed to the highest priority partition your user has permission to use. If you want to specifically use a lower priority partition because of higher resource availability, you may do so. |
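As a quick reference, the sketch below combines several of the directives above in one submission script; the job name, resource amounts, and email address are placeholders rather than recommended values.

  #!/bin/bash
  #SBATCH --job-name=directive_demo      # placeholder name, shows up in squeue
  #SBATCH --nodes=1                      # keep all tasks on a single node
  #SBATCH --ntasks=4                     # request 4 CPUs
  #SBATCH --mem-per-cpu=2000             # 2000MB of memory per CPU
  #SBATCH --time=0-02:00:00              # 2 hours of wall time
  #SBATCH --constraint=intel             # only run on Intel nodes (see the feature list below)
  #SBATCH --mail-type=end,fail
  #SBATCH --mail-user=username@mst.edu   # replace with your own address
  #SBATCH --out=Forge-%j.out

  echo "running on $(hostname) with $SLURM_NTASKS tasks"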
== Valid Constraints ==
| **Feature** | **Description** |
| intel | Node has Intel CPUs |
| amd | Node has AMD CPUs |
| EDR | Node has an EDR (100Gbit/sec) Infiniband interconnect |
| FDR | Node has an FDR (56Gbit/sec) Infiniband interconnect |
| QDR | Node has a QDR (36Gbit/sec) Infiniband interconnect |
| DDR | Node has a DDR (16Gbit/sec) Infiniband interconnect |
| serial | Node has no high speed interconnect |
| gpu | Node has GPU acceleration capabilities |

==== Monitoring your jobs ====

  squeue -u username
  JOBID PARTITION    NAME   USER    STATE  TIME CPUS NODES
    719   requeue Submiss blspcy  RUNNING 00:01    1     1

==== Cancel your job ====
scancel is the command to cancel a job; the user must own the job being cancelled or must be root.

  scancel <jobid>

==== Viewing your results ====
Output from your submission will go into an output file in the submission directory. This will either be slurm-jobnumber.out or whatever you defined in your submission script; in our example script we set this to Forge-jobnumber.out. The file is written asynchronously, so for a very short job it may take a bit after the job is complete for the file to show up.

==== Moving Data ====
Moving data in and out of the Forge can be done with a few different tools depending on your operating system and preference.

=== DFS volumes ===
Missouri S&T users can mount their web volumes and S drives with the mountdfs command. This will mount your user directories to the login machine under /mnt/dfs/$USER. The data can then be copied to your home directory with command line tools. **Your data will not be accessible from the compute nodes, so do not submit jobs from these directories.** Aliases "cds" and "cdwww" have been created to allow you to cd into your S drive and web volume quickly and easily.

=== Windows ===

== WinSCP ==
Using WinSCP, connect to forge.mst.edu with your SSO just as you would with ssh or Putty, and you will be presented with the contents of your home directory. You can then drag files into the WinSCP window and drop them in the folder you want them in, and the copying process will begin. It works the same way in the opposite direction to get data back out.

== Filezilla ==
Using Filezilla, connect to forge.mst.edu with your SSO and the contents of your home directory will be displayed; drag and drop works with Filezilla as well.

== Git ==
Git is installed on the cluster and is recommended for keeping track of code changes across your research. See [[https://git-scm.com/documentation | getting started with git]] for usage guides. Campus offers a hosted private git server at [[https://git.mst.edu]] at no additional cost.

=== Linux ===

== Filezilla ==
See the Windows instructions.

== scp ==
scp is a command line utility that allows for secure copies from one machine to another through ssh; it is available on most Linux distributions. If I wanted to copy a file in using scp, I would open a terminal on my workstation and issue the following command.

  scp /home/blspcy/batch.sub blspcy@forge.mst.edu:/home/blspcy/batch.sub

It will then ask me to authenticate using my campus SSO, then copy the file from the local location /home/blspcy/batch.sub to my Forge home directory.
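scp works in the other direction as well, and can copy whole directories with the -r switch. A quick sketch (the usernames, file names, and paths are placeholders):

  # copy a single output file from the Forge back to the current directory on your workstation
  scp blspcy@forge.mst.edu:/home/blspcy/Forge-719.out .
  # recursively copy an entire results directory
  scp -r blspcy@forge.mst.edu:/home/blspcy/results ./results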
If you have questions on how to use scp I recommend reading the man page for scp, or check it out online at [[http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0CB4QFjAA&url=http%3A%2F%2Flinux.die.net%2Fman%2F1%2Fscp&ei=j0J3VcWrKYW6ogSe-ZDIAg&usg=AFQjCNGSWos542S3NV9HPey1zVTIYw1bCQ&sig2=0K1_pw6A7e9j8onBle1GFQ&bvm=bv.95277229,d.cGU |SCP man page]] == rsync == rsync is a more powerful command line utility than scp, it has a simpler syntax, and checks to see if the file has actually changed before performing the copy. See the man page for usage details or [[http://linux.die.net/man/1/rsync |online documentation]] == git == See git for windows for instruction, it works the same way. ==== Modules ==== An important concept for running on the cluster is modules. Unlike a traditional computer where you can run every program from the command line after installing it, with the cluster we install the programs to a main "repository" so to speak, and then you load only the ones you need as modules. To see which modules are available for you to use you would type "module avail". Once you find which module you need for your file, you type "module load " where is the module you found you wanted from the module avail list. You can see which modules you already have loaded by typing "module list". Here is the output of module avail as of 02/01/2019 ------------------------------------------------------------------------------------------------------ /share/apps/modulefiles/mst-app-modules ------------------------------------------------------------------------------------------------------ AnsysEM/17/17 SAC/101.2 ansys/17.0 curl/7.63.0 gnuplot matlab/2012 msc/nastran/2014 rlwrap/0.43 trinity AnsysEM/19/19 SAC/101.6a (D) ansys/18.0 dirac gperf/3.1 matlab/2013a msc/patran/2014 rum valgrind/3.12 CORE/06.05.92 SAS/9.4 ansys/18.1 dock/6.6eth gridgen matlab/2014a (D) mseed2sac/2.2 samtools/0.1.19 valgrind/3.14 (D) CST/2014 (D) SEGY2ASCII/2.0 ansys/18.2 dock/6.6ib (D) gromacs matlab/2016a namd/2.9 scipy vasp/4.6 CST/2015 SEISAN/10.5 ansys/19.0 espresso gv/3.7.4 matlab/2017a namd/2.10 (D) siesta vasp/5.4.1 (D) CST/2016 SU2/3.2.8 ansys/19.2 feh/2.28 hadoop/1.1.1 matlab/2017b namd/2.12b spark vim/8.0.1376 CST/2017 SU2/4.1.3 apbs fio/3.12 hadoop/1.2.1 matlab/2018a nwchem/6.6 sparta visit/2.8.2 CST/2018 SU2/5.0.0 bison/3.1 flex/2.6.0 hadoop/2.6.0 (D) matlab/2018b octave/3.8.2 spin visit/2.9.0 (D) GMT/4.5.17 SU2/6.0.0 (D) bowtie freefem++ hk/1.3 mcnp/mcnp-nic (D) octave/4.2.1 (D) starccm/6.04 vmd GMT/5.4.2 (D) SeisUnix/44R2 cadence/cadence fun3d/custom (D) htop/2.0.2 mcnp/6.1 openssl/1.1.1a starccm/7.04 voro++/0.4.6 GMTSAR/gmon.out SeismicHandler/2013a calcpicode fun3d/12.2 imake mctdh overture starccm/8.04 (D) vulcan GMTSAR/5.6.test a2ps/4.14 casino fun3d/12.3 jdupes/1.10.4 metis/5.1.0 packmol starccm/10.04 x11/1.20.1 GMTSAR/5.6 (D) abaqus/6.11-2 cmg fun3d/12.4 lammps/9Dec14 molpro/2010.1.nightly parallel/20180922 starccm/10.06 xdvi/22.87.03 JWM/2.3.7 abaqus/6.12-3 (D) comsol/4.3a fun3d/2014-12-16 lammps/17Nov16 molpro/2010.1.25 parmetis/4.0.3 starccm/11.02 yade/gmon.out LIGGGHTS/3.8 abaqus/6.14-1 comsol/4.4 (D) gamess lammps/23Oct2017 (D) molpro/2012.1.nightly (D) proj starccm/11.04 yade/2017.01a (D) OpenFoam/2.3.1 abaqus/2016 comsol/5.2a_deng gaussian lammps/24Jul17 molpro/2012.1.9 psi4 starccm/12.02 zpaq/7.15 OpenFoam/2.4.x abaqus/2018 comsol/5.2a_park gaussian-09-e.01 lammps/30July16 molpro/2015.1.source.block qe starccm/12.06 OpenFoam/5.x abinit/8.6.3 comsol/5.2a_yang 
gccmakedep/1.0.3 linpack molpro/2015.1.source.s qmcpack/3.0 starccm/13.04 OpenFoam/6.x (D) abinit/8.10.1 (D) comsol/5.2a_zaeem gdal lz4/1.8.3 molpro/2015.1.source qmcpack/2014 tecplot/2013 ParaView accelrys/material_studio/6.1 comsol/5.3a_park gdk-pixbuf/2.24.1 madagascar/2.0 molpro/2015.1 qmcpack/2016 (D) tecplot/2014 (D) QtCreator/Qt5 amber/12 comsol/5.4_park geos maple/16 moose/3.6.4 quantum-espresso/6.1 tecplot/2016.old R/3.1.2 amber/13 (D) cp2k git/2.20.1 maple/2015 (D) mpb-meep rdkit/2017-03-3 tecplot/2016 R/3.4.2 ansys/14.0 cplot gmp maple/2016 mpc rdseed/5.3.1 tecplot/2017r3 R/3.5.1 (D) ansys/15.0 (D) cpmd gnu-tools maple/2017 msc/adams/2015 rheotool/3.0 tecplot/2017 --------------------------------------------------------------------------------------------------- /share/apps/modulefiles/mst-compiler-modules ---------------------------------------------------------------------------------------------------- cilk/5.4.6 cmake/3.9.4 gmake/4.2.1 gnu/5.4.0 gnu/7.1.0 gpi2/1.1.1 (D) intel/2015.2.164 (D) java/9.0.4 (D) ninja/1.8.2 pgi/RCS/11.4,v python/2.7.14 python/3.5.2 python/3.7.1 (D) cmake/3.2.1 (D) cmake/3.11.1 gnu/4.9.2 (D) gnu/6.3.0 gnu/7.2.0 intel/2011_sp1.11.339 intel/2018.3 julia/0.6.2 perl/5.26.1 python/gmon.out python/2.7.15 python/3.6.2 cmake/3.6.2 cmake/3.13.1 gnu/4.9.3 gnu/6.4.0 gpi2/1.0.1 intel/2013_sp1.3.174 java/8u161 mono/3.12.0 pgi/11.4 python/2.7.9 python/3.5.2_GCC python/3.6.4 ------------------------------------------------------------------------------------------------------ /share/apps/modulefiles/mst-mpi-modules ------------------------------------------------------------------------------------------------------ intelmpi mvapich2/pgi/11.4/eth openmpi/1.10.7/intel/2011_sp1.11.339 openmpi/2.0.3/gnu/7.2.0 (D) openmpi/2.1.1/gnu/7.2.0 (D) openmpi/intel/13/eth mvapich2/2.2/intel/2015.2.164 mvapich2/pgi/11.4/ib (D) openmpi/1.10.7/intel/2013_sp1.3.174 openmpi/2.0.3/intel/2011_sp1.11.339 openmpi/2.1.1/intel/2011_sp1.11.339 openmpi/intel/13/ib (D) mvapich2/2.2/intel/2018.3 (D) openmpi openmpi/1.10.7/intel/2015.2.164 (D) openmpi/2.0.3/intel/2013_sp1.3.174 openmpi/2.1.1/intel/2013_sp1.3.174 openmpi/intel/15/eth mvapich2/2.3/gnu/7.2.0 openmpi/1.10.6/gnu/4.9.3 openmpi/1.10/gnu/4.9.2 openmpi/2.0.3/intel/2015.2.164 openmpi/2.1.1/intel/2015.2.164 openmpi/intel/15/ib (D) mvapich2/2.3/intel/2018.3 openmpi/1.10.6/gnu/5.4.0 openmpi/1.10/intel/15 openmpi/2.0.3/intel/2018.3 (D) openmpi/2.1.1/intel/2018.3 (D) openmpi/pgi/11.4/eth mvapich2/gnu/4.9.2/eth openmpi/1.10.6/gnu/6.3.0 openmpi/2.0.2/gnu/4.9.3 openmpi/2.1.0/gnu/4.9.3 openmpi/2.1.5/intel/2018.3 openmpi/pgi/11.4/ib (D) mvapich2/gnu/4.9.2/ib (D) openmpi/1.10.6/gnu/7.1.0 openmpi/2.0.2/gnu/5.4.0 openmpi/2.1.0/gnu/5.4.0 openmpi/3.1.2/intel/2018.3 rocks-openmpi mvapich2/intel/11/eth openmpi/1.10.6/gnu/7.2.0 (D) openmpi/2.0.2/gnu/6.3.0 openmpi/2.1.0/gnu/6.3.0 openmpi/gnu/4.9.2/eth rocks-openmpi_ib mvapich2/intel/11/ib (D) openmpi/1.10.6/intel/2011_sp1.11.339 openmpi/2.0.2/gnu/7.1.0 (D) openmpi/2.1.0/gnu/7.1.0 (D) openmpi/gnu/4.9.2/ib (D) mvapich2/intel/13/eth openmpi/1.10.6/intel/2013_sp1.3.174 openmpi/2.0.2/intel/2011_sp1.11.339 openmpi/2.1.0/intel/2011_sp1.11.339 openmpi/intel/11/backup mvapich2/intel/13/ib (D) openmpi/1.10.6/intel/2015.2.164 (D) openmpi/2.0.2/intel/2013_sp1.3.174 openmpi/2.1.0/intel/2013_sp1.3.174 openmpi/intel/11/eth mvapich2/intel/15/eth openmpi/1.10.7/gnu/6.4.0 openmpi/2.0.2/intel/2015.2.164 (D) openmpi/2.1.0/intel/2015.2.164 (D) openmpi/intel/11/ib (D) mvapich2/intel/15/ib (D) openmpi/1.10.7/gnu/7.2.0 (D) 
openmpi/2.0.3/gnu/6.4.0 openmpi/2.1.1/gnu/6.4.0 openmpi/intel/13/backup ------------------------------------------------------------------------------------------------------ /share/apps/modulefiles/mst-lib-modules ------------------------------------------------------------------------------------------------------ CUnit/2.1-3 cryptopp/7.0.0 glibc/2.23 (D) libtiff/5.3.0 netcdf/openmpi_ib/gnu/4.3.2 (D) python/modules/theano/0.8.0 Imlib2/1.4.4 docbook/4.1.2 glpk/4.63 libxcb/0.4.0 netcdf/openmpi_ib/intel/3.6.2 python/modules/tinyarray/1.0.5 Qt5/5.6.3 eigen/3.3.4 gtk-doc/1.9 libxcomposite/0.4.4 netcdf/openmpi_ib/intel/4.3.2 (D) readline/7.0-alt Qt5/5.11.3 (D) expat/2.2.6 gts/0.7.6 libxdmcp/1.1.2 pixman/0.34.0 readline/7.0 (D) arpack fftw/3.3.6/gnu/7.1.0 hdf/4/gnu/2.10 libxfont/1.5.4 python/modules/argparse/1.4.0 sgao atlas/gnu/3.10.2 fftw/mvapich2_ib/gnu4.9.2/2.1.5 hdf/4/intel/2.10 libxkbfile/1.0.9 python/modules/decorator/4.0.11 snappy/1.1.7 atlas/intel/3.10.2 fftw/mvapich2_ib/gnu4.9.2/3.3.4 (D) hdf/4/pgi/2.10 libxpm/3.5.12 python/modules/hostlist/1.15 vtk/6.1 autoconf-archive/2018.03.13 fftw/mvapich2_ib/intel2015.2.164/2.1.5 hdf/5/mvapich2_ib/gnu/1.8.14 libxtst/1.2.2 python/modules/kwant/1.0.5 vtk/8.1.1 (D) boost/gnu/gmon.out fftw/mvapich2_ib/intel2015.2.164/3.3.4 (D) hdf/5/mvapich2_ib/intel/1.8.14 loki/gmon.out python/modules/lasagne/1 xbitmaps/1.1.2 boost/gnu/1.51.0 fftw/mvapich2_ib/pgi11.4/2.1.5 hdf/5/mvapich2_ib/pgi/1.8.14 loki/0.1.7 (D) python/modules/matplotlib/1.4.2 xcb/lib/1.13 boost/gnu/1.53.0 fftw/mvapich2_ib/pgi11.4/3.3.4 (D) hdf/5/openmpi_ib/gnu/1.8.14 mkl/2011_sp1.11.339 python/modules/mpi4py/2.0.1 xcb/proto/1.13 boost/gnu/1.55.0 (D) fftw/openmpi_ib/gnu4.9.2/2.1.5 hdf/5/openmpi_ib/intel/1.8.14 mkl/2013_sp1.3.174 python/modules/mpmath/0.19 xcb/util/0.4.0 boost/gnu/1.62.0 fftw/openmpi_ib/gnu4.9.2/3.3.4 (D) hdf/5/openmpi_ib/pgi/1.8.14 mkl/2015.2.164 (D) python/modules/networkx/1.11 xcb/util/image/0.4.0 boost/gnu/1.65.1-python2 fftw/openmpi_ib/intel2015.2.164/2.1.5 lapack/3.8.0 mkl/2018.3 python/modules/nolearn/0.6 xcb/util/keysyms/0.4.0 boost/gnu/1.65.1 fftw/openmpi_ib/intel2015.2.164/3.3.4 (D) libdrm/2.4.94 ncurses/6.1 python/modules/numpy/1.10.0 xcb/util/renderutil/0.3.9 boost/gnu/1.69.0 fftw/openmpi_ib/pgi11.4/2.1.5 libepoxy/1.5.2 netcdf/mvapich2_ib/gnu/3.6.2 python/modules/obspy/1.0.4 xcb/util/wm/0.4.1 boost/intel/1.55.0 (D) fftw/openmpi_ib/pgi11.4/3.3.4 (D) libfontenc/1.1.3 netcdf/mvapich2_ib/gnu/4.3.2 (D) python/modules/opencv/3.1.0 xorg-macros/1.19.1 boost/intel/1.62.0 freeglut/3.0.0 libgbm/2.1 netcdf/mvapich2_ib/intel/3.6.2 python/modules/pylab/0.1.4 xorgproto/2018.4 cgns/openmpi_ib/gnu (D) glibc/2.14 libqglviewer/2.7.0 netcdf/mvapich2_ib/intel/4.3.2 (D) python/modules/scipy/0.16.0 xtrans/1.3.5 cgns/openmpi_ib/intel glibc/2.18 libqglviewer/2.7.1 (D) netcdf/openmpi_ib/gnu/3.6.2 python/modules/sympy/1.0 Where: D: Default Module Use "module spider" to find all possible modules. Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys". ==== Compiling Code ==== There are several compilers available through modules, to see a full list of modules run module avail the naming scheme for the compiler modules are as follows. MPI_PROTOCOL/COMPILER/COMPILER_VERSION/INTERFACE e.g openmpi/intel/15/ib is the 2015 intel compiler built with openmpi libraries and is set to communicate over the high speed infiniband interface. After you have decided which compiler you want to use you need to load it. 
  module load openmpi/intel/15/ib

Then compile your code; use mpicc for C code and mpif90 for Fortran code. Here is an MPI hello world C code.

  /* C Example */
  #include <stdio.h>
  #include <mpi.h>

  int main (int argc, char *argv[])
  {
    int rank, size;

    MPI_Init (&argc, &argv);                /* starts MPI */
    MPI_Comm_rank (MPI_COMM_WORLD, &rank);  /* get current process id */
    MPI_Comm_size (MPI_COMM_WORLD, &size);  /* get number of processes */
    printf( "Hello world from process %d of %d\n", rank, size );
    MPI_Finalize();
    return 0;
  }

Use mpicc to compile it.

  mpicc ./helloworld.c

Now you should see an a.out executable in your current working directory; this is your MPI compiled code that we will run when we submit it as a job.

==== Submitting an MPI job ====
You need to be sure that you have the same module loaded in your job environment as you did when you compiled the code, to ensure that the compiled executables will run correctly. You may either load the modules before submitting a job and use the directive #SBATCH --export=all in your submission script, or load the module prior to running your executable in your submission script. Please see the sample submission script below for an MPI job.

  #!/bin/bash
  #SBATCH -J MPI_HELLO
  #SBATCH --ntasks=8
  #SBATCH --export=all
  #SBATCH --out=Forge-%j.out
  #SBATCH --time=0-00:10:00
  #SBATCH --mail-type=begin,end,fail,requeue

  module load openmpi/intel/15/ib
  mpirun ./a.out

Now we need to submit that file to the scheduler to be put into the queue.

  sbatch helloworld.sub

You should see the scheduler report back what job number your job was assigned, just as before, and you should shortly see an output file in the directory you submitted your job from.

==== Interactive jobs ====
Some things can't be run with a batch script because they require user input, or you need to compile some large code and are worried about bogging down the login node. To start an interactive job simply use the sinteractive command, and your terminal will now be running on one of the compute nodes. The hostname command can help you confirm you are no longer running on a login node. Now you may run your executable by hand without worrying about impacting other users. The sinteractive script by default will allot 1 CPU for 1 hour; you may request more by using SBATCH directives, e.g.

  sinteractive --time=02:00:00 --cpus-per-task=2

will start a job with 2 CPUs on one node for 2 hours.

If you will need a GUI window for whatever you are running inside the interactive job, you will need to connect to the Forge with X forwarding enabled. For Linux this is simply adding the -X switch to the ssh command.

  ssh forge.mst.edu -X

For Windows there are a couple of X server packages available, Xming and X-Win32, that can be configured with Putty. [[http://www.geo.mtu.edu/geoschem/docs/putty_install.html|Here]] is a simple guide for configuring Putty to use Xming.

==== Job Arrays ====
If you have a large number of jobs you need to start, I recommend becoming familiar with job arrays; basically they allow you to submit one job file to start up to 10000 jobs at once. One of the ways you can vary the input of the job array from task to task is to set a variable based on the array id of the task and then use that value to read the matching line of a file. For instance, the following line, when put into a script, will set the variable PARAMETERS to the matching line of the file data.dat in the submission directory.
  PARAMETERS=$(awk -v line=${SLURM_ARRAY_TASK_ID} '{if (NR == line) { print $0; };}' ./data.dat)

You can then use this variable in your execution line to do whatever you would like; you just have to have the appropriate data in the data.dat file on the appropriate lines for the array you are submitting. See the sample data.dat file below.

  "I am line number 1"
  "I am line number 2"
  "I am line number 3"
  "I am line number 4"

You can then submit your job as an array by using the --array directive, either in the job file or as an argument at submission time; see the example below.

  #!/bin/bash
  #SBATCH -J Array_test
  #SBATCH --ntasks=1
  #SBATCH --out=Forge-%j.out
  #SBATCH --time=0-00:10:00
  #SBATCH --mail-type=begin,end,fail,requeue

  PARAMETERS=$(awk -v line=${SLURM_ARRAY_TASK_ID} '{if (NR == line) { print $0; };}' ./data.dat)
  echo $PARAMETERS

I prefer to use the array as an argument at submission time so I don't have to touch my submission file again, just the data.dat file that it reads from.

  sbatch --array=1-2,4 array_test.sub

This will execute lines 1, 2, and 4 of data.dat, which echo out what line number they are from my data.dat file. You may also add this as a directive in your submission file and submit without any switches as normal. Adding the following line to the header of the submission file above will accomplish the same thing as supplying the array values at submission time.

  #SBATCH --array=1-2,4

Then you may submit it as normal.

  sbatch array_test.sub

==== Converting From Torque ====
With the new cluster we have changed resource managers from PBS/Maui to Slurm. We have provided a script to convert your submission files from PBS directives to Slurm. This script, pbs2slurm.py, should be in your executable path and can be used as follows.

  pbs2slurm.py < batch.qsub > batch.sbatch

Here batch.qsub is the name of your old submission script file and batch.sbatch is the name of your new submission script file. Please see the table below for commonly used command conversions.

| PBS/Maui Command | Slurm Equivalent | Purpose |
| qsub | sbatch | Submit a job for execution |
| qdel | scancel | Cancel a job |
| showq | squeue | Check the status of jobs |
| qstat | squeue | Check the status of jobs |
| checkjob | scontrol show job | Check the detailed information of a specific job |
| checknode | scontrol show node | Check the detailed information of a specific node |

==== Checking your account usage ====
If you have purchased a number of CPU hours from us, you may check how many hours you have used by issuing the usereport command from a login node; this will show your account's CPU hour limit and the total amount used. **Note this is usage for your account, not your user.**

===== Applications =====
** The applications portion of this wiki is currently a work in progress; not all applications are here yet, but they will be eventually. **

==== Abaqus ====
  * Default Version = 6.12.3
  * Other versions available: 2019, 2018, 2016, 6.14.1, 6.11.2

  module load abaqus/2019
  module load abaqus/2018
  module load abaqus/2016
  module load abaqus/6.14-1
  module load abaqus/6.12-3

This example is for 6.12. The SBATCH commands are explained in the [[pub:forge|Forge Documentation]]. You will need a ***.inp** file from where you built the model on another system.
This **.inp** file needs to be in the same folder as the jobfile you will create.\\
Example jobfile:

  #!/bin/bash
  #SBATCH --job-name=sbatchfilename.sbatch
  #SBATCH --nodes=1
  #SBATCH --ntasks=2
  #SBATCH --mem=4000
  #SBATCH --time=00:60:00
  #SBATCH --mail-type=BEGIN
  #SBATCH --mail-type=END
  #SBATCH --mail-type=FAIL
  #SBATCH --mail-user=joeminer@mst.edu

  input=inputfile.inp
  /share/apps/ABAQUS/Commands/abq6123 job=$input analysis cpus=2 interactive

  * Replace **sbatchfilename.sbatch** with the name of your jobfile.
  * Replace the email address with your own.
  * Replace **inputfile.inp** with the ***.inp** file that you want to analyze.
  * **Keep ntasks and cpus equal to the same number of processors.**

This job will run on 1 node with 2 processors and allocate 4GB of memory. These node/processor values can change, but be aware that Abaqus licensing is limited, and the job may queue for hours or days before licenses are available, depending on the quantities you selected.

Simple breakdown of Abaqus licensing:
^ Parallel License ^ Number of Processors used ^ Tokens Allocated ^
| 1 | 4 | 20 |
| 2 | 8 | 40 |
| 3 | 16 | 60 |
| 4 | 32 | 80 |
| 5 | 64 | 100 |

==== Abaqus 2016 ====

  module load abaqus/2016

  #!/bin/bash
  #SBATCH --job-name=sbatchfilename.sbatch
  #SBATCH --nodes=1
  #SBATCH --ntasks=20
  #SBATCH --mem=4000
  #SBATCH --partition=requeue
  #SBATCH --time=10:00:00
  #SBATCH --mail-type=BEGIN
  #SBATCH --mail-type=END
  #SBATCH --mail-type=FAIL
  #SBATCH --mail-user=joeminer@mst.edu

  unset SLURM_GTIDS

  input=inputfile.inp
  time abq2016hf3 job=$input analysis standard_parallel=all cpus=20 interactive

Note the unset SLURM_GTIDS line; this variable must be unset for Abaqus 2016 to run with MPI.\\
Chain loading jobs for Abaqus 2016:

  #!/bin/bash
  #SBATCH --job-name=sbatchfilename.sbatch
  #SBATCH --nodes=1
  #SBATCH --ntasks=20
  #SBATCH --mem=4000
  #SBATCH --partition=requeue
  #SBATCH --time=10:00:00
  #SBATCH --mail-type=BEGIN
  #SBATCH --mail-type=END
  #SBATCH --mail-type=FAIL
  #SBATCH --mail-user=joeminer@mst.edu

  unset SLURM_GTIDS

  input=inputfile.inp
  oldjob=oldjobfile.inp
  abq2016hf3 job=$input oldjob=$oldjob cpus=20 double interactive

==== Ansys ====
  * Default Version = 15.0
  * Other versions available: 14.0, 17.0, 18.0, 18.1, 18.2, 19.0, 19.2

=== Running the Workbench ===
Be sure you are connected to the Forge with X forwarding enabled, and running inside an interactive job started with the sinteractive command, before you attempt to launch the workbench. Running sinteractive without any switches will give you 1 CPU for 1 hour; if you need more time or resources you may request them. See [[pub:forge#interactive_jobs|Interactive Jobs]] for more information.
\\
Once inside an interactive job you need to load the ansys module.

  module load ansys

Now you may run the workbench.

  runwb2

\\
=== Job Submission Information ===
\\
Fluent is the primary tool in the Ansys suite of software used on the Forge.\\
Most of the Fluent simulation creation process is done on your Windows or Linux workstation.\\
The 'Solving' portion of a simulation is where the Forge is utilized.\\
Fluent will output a lengthy results file based on the simulation being run; that output file is then used on your Windows or Linux workstation to do the final review and analysis of your simulation.

=== The basic steps ===
\\
1. Create your geometry\\
2. Setup your mesh\\
3. Setup your solving method\\
4. Use the .cas and .dat files, generated from the first three steps, to construct your jobfile\\
5. Copy those files to the Forge, to your home folder\\
6. Create your jobfile using the Slurm tools on the Forge Documentation page\\
7. Load the Ansys module\\
8. Submit your newly created jobfile with sbatch\\

=== Serial Example ===
I used the Turbulent Flow example from Cornell's [[https://confluence.cornell.edu/display/SIMULATION/Home|SimCafe]].\\
On the Forge, I have this directory structure for this example. Please create your own structure that makes sense to you.\\

  TurbulentFlow/
  |-- flntgz-48243.cas
  |-- flntgz-48243.dat
  |-- output.dat
  |-- slurm-8731.out
  |-- TurbulentFlow_command.txt
  |-- TurbulentFlow.sbatch

The .cas file is the CASE file that contains the parameters defined by you when creating the model.\\
The .dat file is the data result file used when running the simulation.\\
The .txt file is the command equivalent of your model, in a form that the Forge understands.\\
The .sbatch file is the Slurm job file that you will use to submit your model for analysis.\\
The .out file is the output from the run.\\
The output .dat file is the binary (Ansys specific) file created during the solution, which can be imported into Ansys back on the Windows/Linux workstation for further analysis.\\

=== Jobfile Example ===

  #!/bin/bash
  #SBATCH --job-name=TurbulentFlow.sbatch
  #SBATCH --ntasks=1
  #SBATCH --nodes=1
  #SBATCH --time=01:00:00
  #SBATCH --mail-type=BEGIN
  #SBATCH --mail-type=END
  #SBATCH --mail-type=FAIL
  #SBATCH --mail-user=rlhaffer@mst.edu
  #SBATCH -o Forge-%j.out

  fluent 2ddp -g < /home/rlhaffer/unittests/ANSYS/TurbulentFlow/TurbulentFlow_command.txt

The SBATCH commands are explained in the [[pub:forge#submitting_a_job|Forge Documentation]]. The job-name is a name given to help you determine which job is which.\\
This job will be in the --- **partition=requeue** queue.\\
It will use 1 node --- **nodes=1**.\\
It will use 1 processor in that node --- **ntasks=1**.\\
It has a wall clock time of 1 hour --- **time=01:00:00**.\\
It will email the user when it begins, ends, or if it fails. **--mail-type** & **--mail-user**\\
**fluent** is the command we are going to run.\\
**2ddp** is the mode we want fluent to use.\\
//Modes: the [mode] option must be supplied and is one of the following:\\
* 2d runs the two-dimensional, single-precision solver\\
* 3d runs the three-dimensional, single-precision solver\\
* 2ddp runs the two-dimensional, double-precision solver\\
* 3ddp runs the three-dimensional, double-precision solver//\\
**-g** turns off the GUI\\
The path to the command file we are calling in fluent: //**< /home/rlhaffer/unittests/ANSYS/TurbulentFlow/TurbulentFlow_command.txt**//

Contents of the command file:\\
This file can get long, as it contains the .cas file and .dat file information as well as the saving frequency and iteration count.\\
**NOTE**, this is all on one line when creating the command file.\\

  /file/rcd /home/rlhaffer/unittests/ANSYS/TurbulentFlow/flntgz-48243.cas /file/autosave/data-frequency 20000 /solve/iterate 150000 /file/wd /home/rlhaffer/unittests/ANSYS/TurbulentFlow/output.dat /exit

When the simulation is finished, you will have a Forge-#####.out file that looks something like this:\\

  /share/apps/ansys_inc/v150/fluent/fluent15.0.7/bin/fluent -r15.0.7 2ddp -g
  /share/apps/ansys_inc/v150/fluent/fluent15.0.7/cortex/lnamd64/cortex.15.0.7 -f fluent -g (fluent "2ddp -alnamd64 -r15.0.7 -path/share/apps/ansys_inc/v150/fluent")
  Loading "/share/apps/ansys_inc/v150/fluent/fluent15.0.7/lib/fluent.dmp.114-64"
  Done.
  /share/apps/ansys_inc/v150/fluent/fluent15.0.7/bin/fluent -r15.0.7 2ddp -alnamd64 -path/share/apps/ansys_inc/v150/fluent -cx edrcompute-43-17.local:56955:53521
  Starting /share/apps/ansys_inc/v150/fluent/fluent15.0.7/lnamd64/2ddp/fluent.15.0.7 -cx edrcompute-43-17.local:56955:53521

       Welcome to ANSYS Fluent 15.0.7

       Copyright 2014 ANSYS, Inc.. All Rights Reserved.
       Unauthorized use, distribution or duplication is prohibited.
       This product is subject to U.S. laws governing export and re-export.
       For full Legal Notice, see documentation.

  Build Time: Apr 29 2014 13:56:31 EDT  Build Id: 10581

  Loading "/share/apps/ansys_inc/v150/fluent/fluent15.0.7/lib/flprim.dmp.1119-64"
  Done.

  --------------------------------------------------------------
  This is an academic version of ANSYS FLUENT. Usage of this product
  license is limited to the terms and conditions specified in your
  ANSYS license form, additional terms section.
  --------------------------------------------------------------

  Cleanup script file is /home/rlhaffer/unittests/ANSYS/TurbulentFlow/cleanup-fluent-edrcompute-43-17.local-17945.sh

  > Reading "/home/rlhaffer/unittests/ANSYS/TurbulentFlow/flntgz-48243.cas"...
     3000 quadrilateral cells, zone 2, binary.
     5870 2D interior faces, zone 1, binary.
       30 2D velocity-inlet faces, zone 5, binary.
       30 2D pressure-outlet faces, zone 6, binary.
      100 2D wall faces, zone 7, binary.
      100 2D axis faces, zone 8, binary.
     3131 nodes, binary.
     3131 node flags, binary.

  Building...
     mesh
     materials,
     interface,
     domains,
     mixture
     zones,
       pipewall
       outlet
       inlet
       interior-surface_body
       centerline
       surface_body
  Done.

  Reading "/home/rlhaffer/unittests/ANSYS/TurbulentFlow/flntgz-48243.dat"...
  Done.

    iter  continuity  x-velocity  y-velocity           k     epsilon     time/iter
  !  389 solution is converged
     389  9.7717e-07  1.0711e-07  2.9115e-10  5.2917e-08  3.4788e-07  0:00:00 150000
  !  390 solution is converged
     390  9.5016e-07  1.0389e-07  2.8273e-10  5.1020e-08  3.3551e-07  1:11:14 149999

  Writing "/home/rlhaffer/unittests/ANSYS/TurbulentFlow/output.dat"...
  Done.

=== Parallel Example ===
To use Fluent in parallel you need to set the PBS_NODEFILE environment variable inside your job. Please see the example submission file below.

  #!/bin/bash
  #SBATCH --job-name=TurbulentFlow.sbatch
  #SBATCH --ntasks=32
  #SBATCH --time=01:00:00
  #SBATCH --mail-type=BEGIN
  #SBATCH --mail-type=END
  #SBATCH --mail-type=FAIL
  #SBATCH --mail-user=rlhaffer@mst.edu
  #SBATCH -o Forge-%j.out

  #generate a node file
  export PBS_NODEFILE=`generate_nodefile`
  #run fluent in parallel.
  fluent 2ddp -g -t32 -pinfiniband -cnf=$PBS_NODEFILE -ssh < /home/rlhaffer/unittests/ANSYS/TurbulentFlow/TurbulentFlow_command.txt

=== Interactive Fluent ===
If you would like to run the full GUI you may do so inside an interactive job; make sure you have connected to the Forge with X forwarding enabled. Start the job with

  sinteractive

This will give you 1 processor for 1 hour; to request more processors or more time please see the documentation at [[pub:forge#interactive_jobs|Interactive Jobs]]. Once inside the interactive job you will need to load the ansys module.

  module load ansys

Then you may start Fluent from the command line.

  fluent 2ddp

will start the 2d, double precision version of Fluent. If you have requested more than one processor you need to first run

  export PBS_NODEFILE=`generate_nodefile`

Then you need to add some switches to fluent to get it to use those processors.

  fluent 2ddp -t## -pethernet -cnf=$PBS_NODEFILE -ssh

You need to replace the ## with the number of processors requested.
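Putting those pieces together, a complete interactive parallel Fluent session might look like the sketch below; the processor count and time limit are only examples.

  # start an interactive job with 4 CPUs for 2 hours
  sinteractive --time=02:00:00 --cpus-per-task=4
  # once the session starts on a compute node, load the module and build the node file
  module load ansys
  export PBS_NODEFILE=`generate_nodefile`
  # launch the 2d double-precision solver on the 4 requested processors
  fluent 2ddp -t4 -pethernet -cnf=$PBS_NODEFILE -ssh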
==== Ansys Mechanical ====
Tested for version 18.1\\
{{ :pub:ansys_mechanical_batch_on_forge.pdf | Simple tutorial}}

  #!/bin/bash
  #SBATCH --job-name=round
  #SBATCH --ntasks=10
  #SBATCH --mem=20000
  #SBATCH -p requeue
  #SBATCH --time=00:60:00
  --comment=#SBATCH --mail-type=begin
  --comment=#SBATCH --mail-type=end
  #SBATCH --export=all
  #SBATCH --out=Forge-%j.out

  module load ansys/18.1
  time ansys181 -j round -b -dis -np 10 < /home/rlhaffer/unittests/ANSYS/roundthing/round.log

Contents of the Ansys Mechanical log file (round.log):

  /INPUT,'round','inp',,0,0
  /SOLU
  SOLVE
  FINISH

==== AnsysEM ====
  * Default Version = 17.0
  * Other installed version: 19.0
  * Also called Ansys Electronics Desktop

  #!/bin/bash
  #SBATCH --job-name=jobfile.sbatch
  #SBATCH --partition=requeue
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=20
  #SBATCH --time=01:00:00
  #SBATCH --mail-type=BEGIN
  #SBATCH --mail-type=END
  #SBATCH --mail-type=FAIL
  #SBATCH --mail-user=joeminer@mst.edu

  module load AnsysEM/17
  time ansysedt -ng -batchsolve Model_Cluster_Test_v2.aedt

==== CMG ====
Computer Modeling Group - This is a WORKING SECTION - Edits are actively taking place.\\
Default version = 2015\\

=== Constructing the jobfile ===
From the Windows installation, the **.dat** model file will be needed. Please follow the general [[pub:forge|Forge documentation]] for submission techniques. After logging into the Forge, run:

  module load cmg

This loads the environment variables needed for CMG to function.\\
Copy your **.dat** file from your Windows system to your cluster home folder. This is explained in the [[pub:forge|Forge documentation]].\\
Example sbatch jobfile:\\

  #!/bin/bash
  #SBATCH --job-name=batchfilename.sbatch
  #SBATCH --nodes=1
  #SBATCH --ntasks=7
  #SBATCH --mem=17000
  #SBATCH --time=10:00:00
  #SBATCH --mail-type=BEGIN
  #SBATCH --mail-type=END
  #SBATCH --mail-type=FAIL
  #SBATCH --mail-user=joeminer@mst.edu
  #SBATCH --export=ALL

  RunSim.sh gem 2015.10 GasModel4P.dat -parasol 7

| RunSim.sh | the CMG sim script that allows the simulation to run on multiple processors |
| gem | the solver specified |
| 2015.10 | version of the software used |
| GasModel4P.dat | the model to be analyzed |
| -parasol 7 | the command switch that tells CMG how many processors to use; in this example, 7 processors were allocated |

=== CMG licensing ===
When running the above example, we discovered how CMG's licensing really works.\\
We have varying quantities of licenses for different features, and each seat of those licenses equates to 20 solver tokens that will be allocated to jobs on the cluster.\\
For this example:\\
**20** tokens of the **solve-stars** solver are being used, because we have an instance of the solver open.\\
**35** tokens of the **solve-parallel** solver are being used, because we have asked for 7 processors.\\
We currently have 8 seats for **Stars** and 8 seats for **Parallel**. This equates to **160** tokens of each. Out of those **160** tokens, your job can take at least **20** tokens from the Stars solver, and up to **160** tokens of the Parallel solver.\\
**160** tokens of the parallel solver equates to 32 processors when building the job file.\\
So, the maximum number of processors to request is **32**.\\

=== Token Scenario example ===
If a student submits a parallel job asking for 32 processors (all 160 tokens) and all 160 are free, the job will begin running.\\
If a second student then submits a job asking for 10 processors (50 tokens) while all 160 are in use, the job will fail.
In the output file of the failed job, it will indicate that the job failed because it did not have enough licenses to run.\\
If this happens, the best suggestion is to modify your jobfile with a smaller number of processors. This may take a few iterations until you find a number of processors that will work.\\
There is no efficient way to know how many licenses are in use.\\
Licensing for CMG is limited. Some features have 2160 tokens, some only 100.\\
Please be patient when submitting jobs, as they could fail in this **no license** manner.\\
The jobs tested only indicate that solve-stars and solve-parallel are being used. We have not seen a test case using these other features:\\

  gem_forgas gem_gap imex_forgas imex_gap imex_iam stars_forgas stars_gap winprop builder cmost_studio dynagrid solve_university rlm_roam

==== Comsol ====

=== Licensing ===
The licensing scheme for Comsol is seat based, which means that if you only have 1 license and have it installed on both your workstation and the Forge, you can only use one or the other at any point in time. If your workstation has the license checked out when your Forge job starts, your job will fail to get a license and will fail to run. This problem only gets compounded when you are working with other users in a group sharing a handful of seats. You will need to coordinate with them on when you and they will run so that you don't run into license problems.

=== Running Batch ===
Running Comsol with the Forge as your solver is fairly straightforward. First load your module (module names for Comsol differ by which license pool you are using).

  module load comsol/5.4_$pool

Then create your batch file for submission.

  #!/bin/bash
  #SBATCH --job-name=comsol_test
  #SBATCH --nodes=1
  #SBATCH --ntasks=20
  #SBATCH --mem=20000
  #SBATCH --time=60:00:00
  #SBATCH --export=ALL

  comsol -np 20 batch -inputfile multilayer.mph -outputfile laminate_out.mph

Please note that Comsol is a memory intensive application; you will likely have to adjust the values for mem and ntasks to suit what your simulation needs. We also advise creating the input file on a Windows workstation and using the Forge for simulation solving; however, running interactively should be possible inside an interactive job with X forwarding enabled.

==== CST ====
  * (Computer Simulation Technology)
  * Default version = 2014
  * Other versions available = 2015, 2016, 2017, 2018

Job Submission information

CST is an electromagnetic simulation solver, primarily used by the EMCLab.

Steps:
- Build your simulation on one of the Windows workstations available to you.\\
- Copy your .cst file to your home directory on the Forge.\\
- Write your job submission file.\\
- Either use the default CST version of 2014, or use 2015 by loading its module file.\\

  module load CST/2015

Example Job File:

  #!/bin/bash
  #SBATCH --job-name=CST_Test
  #SBATCH --nodes=1
  #SBATCH --ntasks=2
  #SBATCH --mem=4000
  #SBATCH --partition=cuda
  #SBATCH --gres=gpu:2
  #SBATCH --time=08:00:00
  #SBATCH --mail-type=FAIL
  #SBATCH --mail-type=BEGIN
  #SBATCH --mail-type=END
  #SBATCH --mail-user=joeminer@mst.edu

  cst_design_environment /home/userID/path/to/.cst/file/example.cst --r --withgpu=2

This job will use __**1**__ node, __**2**__ processors from that node, __**4GB**__ of memory, the __**cuda**__ queue, __**2**__ GPUs, and __**8**__ hours of wall clock time, and it will email the user when the job starts, ends, or fails.
This will output the results in the standard CST folder structure:\\ /path/to/.cst/file/example |-- DC |-- Model |-- ModelCache |-- Model.lok |-- Result |-- SP `-- Temp ==== Cuda ==== If you would like to use CUDA programs or GPU accelerated code you will need to use our GPU nodes. Access to our GPU nodes is granted on a case by case basis, please request access to them through the help desk or submit a ticket at [[http://help.mst.edu]].\\ \\ Our login nodes don't have the CUDA toolkit installed so to compile your code you will need to start an interactive job on these nodes to do your compilation. sinteractive -p cuda --time=01:00:00 --gres=gpu:1 This interactive session will start on a cuda node and give you access to one of the GPUs on the node, once started you may compile your code and do whatever testing you need to do inside this interactive session. To submit a job for batch processing please see this example submission file below. #!/bin/bash #SBATCH -J Cuda_Job #SBATCH -p cuda #SBATCH -o Forge-%j.out #SBATCH --nodes=1 #SBATCH --ntasks=1 #SBATCH --gres=gpu:1 #SBATCH --time=01:00:00 ./a.out This file requests 1 cpu and 1 gpu on 1 node for 1 hour, to request more cpus or more gpus you will need to modify the values related to ntasks and gres=gpu. It is recommended that you at least have 1 cpu for each gpu you intend to use, we currently only have 2 gpus available per node. Once we incorporate the remainder of the GPU nodes we will have 7 gpus available in one chassis. ==== Espresso ==== * Espresso is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials. * Default version - 5.2.1 * http://www.quantum-espresso.org/ * http://www.quantum-espresso.org/pseudopotentials/ All of the available pseudopotential files **.UPF** as of 12/14/2015, are copied into the pseudo install folder. Before running an Espresso job, run: module load espresso Example: The **sbatch** file is the job file you will run on the Forge HPC.\\ The **.pw.in** is the input file you create, that you will reference in the jobfile.\\ In the job file, change the path to your input file.\\ In the .pw.in file, change the paths of **outdir**, **wfcdir** to a location in your Forge home directory and change the **pseudo_dir** path as in the example below. #!/bin/bash #SBATCH --job-name=FeGeomBU #SBATCH --ntasks=20 #SBATCH --time=0-04:00:00 #SBATCH --mail-type=begin,end,fail,requeue #SBATCH --out=forge-%j.out mpirun pw.x -procs 20 -i /path/to/your/inputfile/FeGeom2BU.pw.in &CONTROL title = 'Na3FePO4CO3 geom+U' , calculation = 'vc-relax' , outdir = '/path/to/your/inputfile' , wfcdir = '/path/to/your/inputfile' , pseudo_dir = '/share/apps/espresso/espresso-5.2.1/pseudo' , prefix = 'Na3FePO4CO3Geom' , verbosity = 'high' , / &SYSTEM ibrav = 12, A = 8.997 , B = 5.163 , C = 6.741 , cosAB = -0.0027925231 , nat = 11, ntyp = 5, ecutwfc = 40 , ecutrho = 400 , lda_plus_u = .true. , lda_plus_u_kind = 0 , Hubbard_U(1) = 6.39, space_group = 11 , uniqueb = .false. 
, / &ELECTRONS conv_thr = 1d-6 , startingpot = 'file' , startingwfc = 'atomic' , / &IONS ion_dynamics = 'bfgs' , / &CELL cell_dynamics = 'bfgs' , cell_dofree = 'all' , / ATOMIC_SPECIES Fe 55.93300 Fe.pbe-sp-van_mit.UPF P 30.97400 P.pbe-van_ak.UPF C 12.01100 C.pbe-van_ak.UPF O 15.99900 O.pbe-van_ak.UPF Na 22.99000 Na.pbe-sp-van_ak.UPF ATOMIC_POSITIONS crystal_sg Fe 0.362400000 0.279900000 0.250000000 P 0.587500000 0.798900000 0.250000000 Na 0.740900000 0.251300000 0.002200000 Na 0.917800000 0.738500000 0.250000000 C 0.060500000 0.234500000 0.250000000 O 0.123300000 0.461600000 0.250000000 O 0.143000000 0.031700000 0.250000000 O 0.919500000 0.215700000 0.250000000 O 0.321000000 0.281700000 0.565500000 O 0.431600000 0.674000000 0.250000000 O 0.569400000 0.096500000 0.250000000 K_POINTS automatic 6 12 10 1 1 1 ====Lammps==== To use lammps you will need to load the lammps module module load lammps and use a submission file similar to the following #SBATCH -J Zn-ZnO-rapid #SBATCH --ntasks=4 #SBATCH --mail-type=FAIL #SBATCH --mail-type=BEGIN #SBATCH --mail-type=END #SBATCH --time=02:00:00 #SBATCH --export=ALL module load lammps/23Oct2017 mpirun lmp_forge < in.heating Other options may be configured. please refer to the [[http://lammps.sandia.gov/doc/Manual.html | lammps documentation]] . Version 23Oct2017 has support for the following packages asphere body class2 colloid compress coreshell dipole granular kspace manybody mc meam misc molecule mpiio opt peri python qeq replica rigid shock snap srd user-cgdna user-cgsdk user-diffraction user-dpd user-drude user-eff user-fep user-lb user-manifold user-meamc user-meso user-mgpt user-misc user-molfile user-netcdf user-omp user-phonon user-qtb user-reaxc user-smtbq user-sph user-tally user-uef ==== Maple ==== Default version = 2015 Other installed versions: 2016, 2017 Example:\\ This is a simple polynomial and the goal is to find the roots of the polynomial.\\ Open Maple on your Windows or Linux workstation and start a new project:\\ #Solve a simple polynomial equation\\ f(x):=(x^9-x^8+x^7-x^6+x^5-x^4+x^3-x^2+x^4+9*x^3-2.4^2+.5*x)/(x-1);\\ myroots:=solve(f(x)=148596);\\ #Save the results into a Maple input file for later post-processing. save myroots, "myroots.mpl"; When this runs, it will save the output to the myroots.mpl file.\\ As seen in Maple:\\ {{:pub:maple_main-500x379.png?500|}} Now, you need to SSH into the [[pub:forge|Forge]] with your MST userID/password.\\ Once connected, you need to copy the **findroots.mpl** file from your local system to your home directory on the cluster.\\ You can build your folder structure on the [[pub:forge|Forge]] how you wish, just use something that is meaningful.\\ Once copied, you can return to your terminal on [[pub:forge|Forge]] and create your jobfile to run your findroots.mpl project.\\ Open vi to create your job file.\\ Since this example is simple, it is assigned **1 node**, with **2 processors** and **15 minutes** wall time.\\ The job will email when it begins, ends or fails. 
First:

  module load maple/2015

Then write your jobfile:\\

  #!/bin/bash
  ##Example Maple script
  #SBATCH --job-name=maple_example
  #SBATCH --nodes=1
  #SBATCH --ntasks=2
  #SBATCH --mem=2000
  #SBATCH --time=00:15:00
  #SBATCH --mail-type=BEGIN
  #SBATCH --mail-type=END
  #SBATCH --mail-type=FAIL
  #SBATCH --mail-user=rlhaffer@mst.edu

  maple -q findroots.mpl

Then run:

  sbatch maple.sbatch

When the job finishes, you will have a slurm-jobID.out file.\\
This contains the output of the finished job.\\

==== Matlab ====
**IMPORTANT NOTE** Currently campus has 100 Matlab seat licenses to be shared between the Forge and research desktops. There are certain times of the year when Matlab usage is quite high. License check out is on a first come, first served basis. If you are not able to get a Matlab license, you might consider using GNU Octave. This is available on the Forge and will do much of what Matlab will do.

Matlab is available to run in batch form or interactively on the cluster.
  * Default version = 2014a
  * Other installed versions: 2012, 2013a, 2016a, 2017a, 2017b, 2018a, 2018b

=== Interactive Matlab ===
The simplest way to get up and running with Matlab on the cluster is to simply run matlab from the login node. This will start an interactive job on the backend nodes, load the 2014a module, and open Matlab. If you have connected with X forwarding you will get the full Matlab GUI to use however you would like. This method, however, limits you to 1 core for a maximum of 4 hours on one of our compute nodes. To use more than 1 core, or to run for longer than 4 hours, you will need to submit your job as a batch submission and use Matlab's Distributive Computing Engine.

=== Batch Submit ===
If you want to use batch submissions for Matlab you will need to create a submission script similar to the ones above in the quick start, but you will want to limit your job to 1 node; please see the sample submission script below.

  #!/bin/bash
  #SBATCH --nodes=1
  #SBATCH --ntasks=12
  #SBATCH -J Matlab_job
  #SBATCH -o Forge-%j.out
  #SBATCH --time=01:00:00

  module load matlab
  matlab < helloworld.m

This submission asks for 12 processors on 1 node for an hour; the maximum per node we currently have is 64. If you load any version older than 2014a, Matlab can't use more than 12 processors, as that is what MathWorks limited you to on one machine; MathWorks lifted this limitation to 512 processors in 2014a.

=== Distributive Computing Engine ===
MathWorks has developed a parallel computing product that the campus has purchased licenses for. We have developed a parallel cluster profile that must be loaded on a user to user basis. To begin using the distributive computing profile you must open an interactive Matlab session with X forwarding so you get the Matlab GUI.

  [blspcy@login-44-0 ~]$ matlab

Once the GUI is open you need to open the parallel computing options menu, labeled simply Parallel.

{{:pub:matlab-parallel.png?900|}}

Now open the parallel profile manager and import the Forge.settings profile from the root directory of the version of Matlab you are using, e.g. for 2014a: /share/apps/matlab/matlab-2014/

{{:pub:matlab-profilemanager.png?900|}}

You will now have a Forge_Import profile in your profile list; select this profile and set it as default.

{{:pub:matlab-defaultset.png?900|}}

If you would like, you may validate the imported profile and see that it is indeed working.
{{:pub:matlab-validate.png?900|}} The validation will run a series of five tests, all of which it should pass; if for any reason the profile fails a test, please contact the help desk or create a ticket at [[http://help.mst.edu]]. To modify how many workers the pool will try to use when you call matlabpool open **in your Matlab script**, you may edit the profile and set the NumWorkersRange value to the number you wish to use, or simply specify the size when you open the pool: matlabpool open 24 will open a pool of 24 workers. You may now use matlabpool open in your code and it will use the Forge cluster profile configured above. These steps must be repeated for every version of Matlab you wish to use before calling the pool. The parallel profiles are only built for 2014a and newer versions. To make use of this newfound power, add matlabpool open calls to your Matlab script, then run it using either the interactive method or the batch submit method listed above; make sure to close the pool when you are finished with your work with matlabpool close Keep in mind that the University only has 160 of these licenses, and your job may fail if someone else is using the licenses when your job runs. ====METIS==== **Under Construction** METIS and parMETIS are installed as modules; to use the software, please load the appropriate module with either module load metis or module load parmetis Once loaded you may submit jobs using the METIS or parMETIS binaries as described in the METIS [[http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/manual.pdf | user manual]] or the parMETIS [[http://glaros.dtc.umn.edu/gkhome/fetch/sw/parmetis/manual.pdf | user manual]]. ==== MSC ==== ===Nastran=== MSC Nastran 2014.1 is available through the module msc/nastran/2014 module load msc/nastran/2014 Here is an example job file. #!/bin/bash #SBATCH -J Nastran #SBATCH -o Forge-%j.out #SBATCH --ntasks=1 #SBATCH --time=01:00:00 nast20141 test.bdf #you should also be able to use .dat files with Nastran this way. Submit it as normal. ====NAMD==== This article is under development. Things referenced below are written as we make progress and some instructions may not work as intended. Below is an example file for NAMD; versions prior to 2.12b have not been tested yet and may not function in the same way as 2.12b. #!/bin/bash #SBATCH -J NAMD_EXAMPLE #SBATCH --ntasks=16 #SBATCH --time=01:00:00 #SBATCH -o Forge-%J.out module load namd/2.12b charmrun +p16 namd2 [options] You will need to give namd2 whatever options you would like to run; user documentation for what those may be is available from the developers at [[http://www.ks.uiuc.edu/Research/namd/2.12b1/ug/]]. ====Gamess==== To use GAMESS you will need to load the gamess module module load gamess and use a submission file similar to the following #!/bin/bash #SBATCH -J Gamess_test #SBATCH --time=01:00:00 #SBATCH --ntasks=2 #SBATCH -o Forge-%J.out rungms -i ./test.inp We have customized the rungms script extensively for our system and have introduced several useful switch options. | Switch | Function | | -i | defines the path to the input file; no default value | | -d | defines the path to the data folder; default value is the submission directory | | -s | defines the path to the scratch folder; default value is ~/gamess/scr-$input-$date, and this folder gets removed once the job completes.
| | -b | defines the binary that will be used; default is /share/apps/gamess/gamess.00.x. Useful if you've compiled your own version of GAMESS. | | -a | defines the auxdata folder; default is /share/apps/gamess/auxdata. Useful if you've compiled your own version of GAMESS. | ====OpenFOAM==== First load the module module load OpenFOAM then source the bash file, source $FOAMBASH or the csh file if you are using csh as your shell source $FOAMCSH Then you may use OpenFOAM to work on your data files; anything that needs to be generated interactively needs to be done in an interactive job sinteractive You can create a job file to submit from the data directory to do anything that needs more processing; please see the example below. #!/bin/bash #SBATCH -J FOAMtest #SBATCH --nodes=1 #SBATCH --ntasks=2 #SBATCH --time=01:00:00 blockMesh This will execute blockMesh on the input files in the submission directory on 1 node with 2 CPUs for 1 hour. ====Python==== ===Python 2=== Python modules are available through system modules as well. To see what Python modules have been installed, issue module avail python These are the site-installed Python modules and can be used just like any other module: module load python will load all Python modules, while module load python/matplotlib will load the matplotlib module and any module it depends on. ===Python 3=== There are many modules available for Python 3. However, unlike Python 2.7, all of the Python packages are installed within the Python installation itself rather than being provided as separate environment modules. Currently, the newest Python 3 version available on the Forge is Python 3.6.4. To see a current list of available Python versions, run the command module avail python To see a list of all Python packages available for a particular Python version, load that Python version and run pip list Users are welcome to install additional packages on their account using pip install --user **NOTE:** The default versions of Python on the Forge do **NOT** have pip installed. You will have to load a Python module to get pip functionality. ====QMCPack==== Start by loading the module. module load qmcpack Then create your submission file in the directory where your input files are; see the example below. #!/bin/bash #SBATCH -J qmcpack #SBATCH -o Forge-%j.out #SBATCH --ntasks=8 #SBATCH --time=1:00:00 #SBATCH --constraint=intel mpirun qmcapp input.xml **Note:** QMCPack is compiled to run only on Intel processors; you need to add --constraint=intel to make sure that your job doesn't run on a node with AMD processors. This should run QMCPack on 8 CPUs for 1 hour on your input file. If you want to run multiple instances in parallel in the same job, you may pass an input.list file, containing all the input XML files you wish to run, in place of input.xml. For more information on creating the input files please see QMCPack's [[http://docs.qmcpack.org/ug/index.html|user guide]]. ====SAS 9.4==== Notes:\\ SAS can be run from the login node, with an X session, or from the command line.\\ SAS is node-locked to forge.mst.edu only, so it won't work in a jobfile; you must run it interactively on forge.mst.edu.\\ For both methods, you need to load the SAS module.
module load SAS/9.4 ===Interactive:=== Steps:\\ - Start a local X server such as X-Win32 - SSH to forge.mst.edu - On our Linux builds, type the command "ssh -XC forge.mst.edu" - Once logged in to the Forge, run sinteractive -p=free - the **sinteractive** command will open a new SSH tunnel to a compute node with free resources.\\ - On this new command line, run sas_en and it will open the interface on the compute node that **sinteractive** has selected. - Remember to close all of your windows when you're finished. This series of windows will appear.\\ {{:pub:sas_mainwindow.png?500|}} ===Jobfile=== #!/bin/bash #SBATCH --job-name=job_name #SBATCH --ntasks=1 #SBATCH --time=23:59:59 #SBATCH --mem=31000 #SBATCH --mail-type=begin,end,fail,requeue #SBATCH --export=all #SBATCH --output=Forge-%j.out module load SAS/9.4 sas_en -nodms sas_script.sas ===Command line:=== sas_en -nodms runs the command line version of SAS (type endsas; to get out). [rlhaffer@login-44-0 apps]$ sas_en -nodms NOTE: Copyright (c) 2002-2012 by SAS Institute Inc., Cary, NC, USA. NOTE: SAS (r) Proprietary Software 9.4 (TS1M3) Licensed to THE CURATORS OF THE UNIV OF MISSOURI - T&R, Site 70084282. NOTE: This session is executing on the Linux 2.6.32-431.11.2.el6.x86_64 (LIN X64) platform. SAS/QC 14.1 NOTE: Additional host information: Linux LIN X64 2.6.32-431.11.2.el6.x86_64 #1 SMP Tue Mar 25 19:59:55 UTC 2014 x86_64 CentOS release 6.5 (Final) You are running SAS 9. Some SAS 8 files will be automatically converted by the V9 engine; others are incompatible. Please see http://support.sas.com/rnd/migration/planning/platform/64bit.html PROC MIGRATE will preserve current SAS file attributes and is recommended for converting all your SAS libraries from any SAS 8 release to SAS 9. For details and examples, please see http://support.sas.com/rnd/migration/index.html This message is contained in the SAS news file, and is presented upon initialization. Edit the file "news" in the "misc/base" directory to display site-specific news and information in the program log. The command line option "-nonews" will prevent this display. NOTE: SAS initialization used: real time 0.11 seconds cpu time 0.04 seconds 1? ===PRINTING:=== * Select File | Print. * This will create a PostScript file, "sasprt.ps", in the directory you started SAS from. You can download it to your computer and use ps2pdf to convert it for printing. ====StarCCM+==== Engineering Simulation Software\\ Default version = 10.04.011\\ Other working versions: * 13.04 * 12.06 * 11.04 * 11.02.009 * 10.06.010 * 8.04.010 * 7.04.011 * 6.04.016 Job Submission Information Copy your .sim file from the workstation to your cluster home directory. The [[pub:forge|Forge]] documentation explains how to do this.\\ Once copied, create your job file. First load the module: module load starccm/10.04 Example job file: #!/bin/bash #SBATCH --job-name=starccm_example #SBATCH --nodes=1 #SBATCH --ntasks=12 #SBATCH --partition=requeue #SBATCH --time=08:00:00 #SBATCH --mail-type=FAIL #SBATCH --mail-type=BEGIN #SBATCH --mail-type=END #SBATCH --mail-user=joeminer@mst.edu module load starccm/10.04 time starccm+ -power -batch -np 12 -licpath 1999@flex.cd-adapco.com -podkey AABBCCDDee1122334455 /path/to/your/starccm/simulation/example.sim **It's preferred that you keep --ntasks and -np set to the same processor count.**\\ Breakdown of the script:\\ This job will use **1** node, asking for **12** processors, for a total wall time of **8 hours**, and will email you when the job starts, finishes, or fails. A short submission sketch follows; the individual StarCCM+ options are described in the table after it.
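Submitting and checking on the job uses the same SLURM commands as any other job on the Forge. A minimal sketch, assuming you saved the job file above as starccm_example.sbatch (the file name is just an illustrative choice):
<code bash>
# Submit the job file from the directory that contains it.
sbatch starccm_example.sbatch
# List your queued and running jobs to confirm it was accepted.
squeue -u $USER
</code>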
The StarCCM+ options:\\ |-power| use the Power Session and Power-On-Demand key| |-batch| run the simulation in batch mode, without the GUI| |-np| number of processors to allocate| |-licpath| 1999@flex.cd-adapco.com; since users of StarCCM+ at S&T must have Power-On-Demand keys, we use CD-adapco's license server| |-podkey| enter your PoD key, e.g. AABBCCDDee1122334455; this is given to you when you sign up for StarCCM+, or someone from the FSAE team provides it| |/path/to/your/starccm/simulation/example.sim| use the true path to your .sim file in your cluster home directory| ====Vasp==== To use our site installation of VASP you must first prove that you have a license to use it by emailing us your VASP license confirmation. Once you have been granted access to VASP you may load the vasp module module load vasp and create a VASP job file, in the directory where your input files are, that will look similar to the one below. #!/bin/bash #SBATCH -J Vasp #SBATCH -o Forge-%j.out #SBATCH --time=1:00:00 #SBATCH --ntasks=8 mpirun vasp This example will run the standard VASP build on 8 CPUs for 1 hour. \\ If you need the gamma-point-only version of VASP, use mpirun vasp_gam in your submission file. \\ If you need the non-collinear version of VASP, use mpirun vasp_ncl in your submission file.\\ Some pseudopotentials are available globally; the module sets the environment variable $POTENDIR to the global pseudopotential directory.
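As a quick illustration of how $POTENDIR can be used (a minimal sketch; the internal layout of the pseudopotential directory is not documented here, so browse it before relying on any particular file path):
<code bash>
# Load the module so $POTENDIR is defined, then look at what it points to.
module load vasp
echo "$POTENDIR"   # prints the global pseudopotential directory
ls "$POTENDIR"     # browse the available pseudopotential files
</code>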