I will focus on the use of Perl on the Linux operating system in this class. Almost everything that we talk about is applicable to Mac OSX, and most of it applies to MS Windows as well. However, most of your computational needs in this environment will be served by Linux servers. First, we need to make sure that everyone has a common understanding of how to work on the Linux command line.
- How to get to the servers: ssh.
- a. Windows: xwin32 (You can get it from the OIT download site).
- b. Mac OSX, Linux: ssh
- c. From Within Duke (note, this is true for both xwin32 and ssh, but only ssh is demonstrated):
shell> ssh user@host.duhs.duke.edu
- d. From Outside Duke (note, this is true for both xwin32 and ssh, but only ssh is demonstrated):
shell> ssh user@jumper.genome.duke.edu
jumper_shell> ssh user@host.duhs.duke.edu
- playWithSSH
- Shell environment: There are three main shell families: csh/tcsh, sh/bash, and ksh/zsh. Each has a slightly different syntax, and some have a little extra functionality. I will use bash in this class, because it is the default shell in most people's profiles. I will also focus only on the things that you can do in any of the shells, although the commands to do them may vary slightly from bash syntax. I strongly recommend you find a good shell book or online reference. There is a lot you can do with the shell by itself, and even more you can do with the shell and perl. Some extremely useful tools that I won't cover are awk, sed, and grep (but note, perl can do anything that any of these can do, and more).
- what happens at login
- Environment variables: These are variables that are shared with every child process that gets spawned from a given shell login. These are not shared across logins (unless you place them in one of the config files mentioned in the initialization process section). There are a variety of environment variables that are common to almost all programs and programming languages that run in the *NIX environment, and each programming language has its own special set of environment variables. The way to set an environment variable is:
shell> export VARNAME='....'
shell> echo $VARNAME
shell> echo "${VARNAME} is cool"
shell> env
shell> perl -le 'print $ENV{VARNAME};'
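As a minimal sketch of the sharing rule described above, the script below sets one plain shell variable and one exported variable, then asks a child process what it can see. The names DEMO_LOCAL and DEMO_SHARED are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# DEMO_LOCAL and DEMO_SHARED are made-up names used only for this demo.
unset DEMO_LOCAL DEMO_SHARED

DEMO_LOCAL='set but not exported'   # plain shell variable
export DEMO_SHARED='exported'       # environment variable

# A child process only receives the exported variable.
child_view=$(sh -c 'echo "local=[${DEMO_LOCAL}] shared=[${DEMO_SHARED}]"')
echo "$child_view"

# The child gets a copy: a change made in the child does not flow back.
sh -c 'DEMO_SHARED=changed-in-child'
after_child=$DEMO_SHARED
echo "$after_child"
```

The first echo shows an empty value for the variable that was never exported, and the second shows that the parent's value is untouched by the child's assignment.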
- backticks: you can run any command within backticks '`' within the context of another command, and the output of that command (if any) will be interpolated into the outer command as text.
echo "HELLO I AM "`id -un`
export ME=`id -un`
echo "HELLO I AM ${ME}"
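The backtick examples above can also be written with the $(...) form, which bash and the other POSIX-style shells accept as an equivalent and which is easier to nest. The sketch below compares the two forms; the path /tmp/example/file.txt is just a string used for illustration and does not need to exist.

```shell
#!/usr/bin/env bash
# Both forms capture the output of `id -un` as text.
me_backticks=`id -un`
me_dollar=$(id -un)
echo "HELLO I AM ${me_dollar}"

# $(...) nests without extra escaping: take the name of the directory
# that contains a file. dirname and basename only manipulate the string.
parent_name=$(basename "$(dirname /tmp/example/file.txt)")
echo "$parent_name"
```

Nesting backticks inside backticks requires escaping the inner pair, which is the main reason many people prefer $(...) for anything beyond a one-liner.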
- man: the man program makes pages and pages of documentation about the various programs on a Linux system available to you. Not every program has a man page, but the vast majority do. Increasingly, perl packages have man pages available as well as perldoc (more on perldoc later).
man id
- PATH: PATH is the environment variable used by the shell itself to determine where a program is located, if the program has been called without a path (see below). It is a string which consists of a colon-separated list of directories where executable programs may be.
shell> echo $PATH
If PATH is the empty string, you will only be able to run shell built-in commands (export, cd, echo, etc) or programs called by their full path. You will not be able to run cp, mv, or even ls by name! One mistake people make is to blow away the PATH set in /etc/bashrc with their own PATH. You can avoid this by either appending your additions to the existing PATH
shell> export PATH="${PATH}:/path/to/my/own/stuff"
or prepending them
shell> export PATH="/path/to/my/own/stuff:${PATH}"
The shell searches the directories in PATH from left to right and runs the first match, so the difference will only be noticed if you have two different commands with the same name (or two versions of the same command) occurring at different places in your PATH. Also, note that sometimes it is easier to include '.' in your PATH to allow executables in the current working directory to be run without needing to type ./ explicitly. Put it at the end, so that a file in the current directory cannot shadow a standard command
shell> export PATH="${PATH}:."
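To make the effect of ordering concrete, here is a small sketch that creates two scratch directories, each holding a command with the same name, and runs that command with the directories placed in PATH in both orders. The command name hello_demo and the directory names are hypothetical, and the sketch assumes the temporary directory allows execution.

```shell
#!/usr/bin/env bash
# Scratch directories with two different commands named hello_demo.
demo=$(mktemp -d)
mkdir "$demo/first" "$demo/second"
printf '#!/bin/sh\necho "from first"\n'  > "$demo/first/hello_demo"
printf '#!/bin/sh\necho "from second"\n' > "$demo/second/hello_demo"
chmod u+x "$demo/first/hello_demo" "$demo/second/hello_demo"

original_path=$PATH

# The leftmost directory that contains the command wins.
PATH="$demo/first:$demo/second:$original_path"
first_wins=$(hello_demo)
echo "$first_wins"

# Swap the order and the other copy is found instead.
PATH="$demo/second:$demo/first:$original_path"
hash -r   # forget any remembered command locations
second_wins=$(hello_demo)
echo "$second_wins"

# Restore the original PATH and clean up.
PATH=$original_path
rm -rf "$demo"
```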
- PERL5LIB: This is an environment variable that (obviously) is important only to Perl. It is probably undefined by default (e.g., it is not usually set in /etc/bashrc). It is only necessary if you have perl modules in non-standard directories (meaning directories outside of those perl knows about by default, usually places like /usr/lib/perl, /usr/lib/perl5, etc). Like PATH, it is a colon-separated list of directories to search for perl packages (more about these later). Unlike PATH, you don't lose the ability to use standard perl modules when you accidentally set PERL5LIB to the empty string, because its directories are searched in addition to the default ones.
- relative vs absolute paths: in *NIX files are referred to with their path. This can be a path relative to the current working directory
shell> ls path/to/some/relative/directoryOrFile
or the absolute path from 'root' /
ls /absolute/path/to/some/relative/directoryOrFile
Another special path is ~/, which the shell expands to the absolute path of your homedir (or ~username/, which expands to username's homedir)
ls ~/path/relative/to/my/homedir
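The difference between the two kinds of path shows up as soon as the current working directory changes. The following sketch builds a scratch directory tree (the names project, data, and sample.txt are made up) and lists the same directory by relative and by absolute path from two different locations.

```shell
#!/usr/bin/env bash
# Scratch tree; directory and file names are made up for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/project/data"
touch "$demo/project/data/sample.txt"
start_dir=$PWD

# From inside project/, the relative path "data" resolves.
cd "$demo/project"
from_relative=$(ls data)

# The absolute path works from any directory; here we try from /.
cd /
from_absolute=$(ls "$demo/project/data")

# The same relative path now points somewhere else (assuming there is
# no /data/sample.txt on this machine).
if [ -e data/sample.txt ]; then
    relative_from_root=found
else
    relative_from_root=missing
fi
echo "$from_relative $from_absolute $relative_from_root"

# ~ is expanded by the shell to your home directory.
echo ~

cd "$start_dir"
rm -rf "$demo"
```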
- running a program: Executable commands can be run by typing their absolute or relative path. A command in the current working directory '.' is a special case: typing its bare name only works if '.' is in your PATH; otherwise you must give its path relative to the './' directory:
./someCommand
After you have created a new script (perl, shell, python, etc), you will need to make it executable before it can be executed by the shell. You do this with chmod. You can add execute permission for yourself as the file's owner (chmod u+x /path/to/program), for users in the file's group (chmod g+x /path/to/program), for everyone else (chmod o+x /path/to/program), or for all users at once (chmod a+x /path/to/program). Type 'man chmod' for details.
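The create, chmod, run sequence can be sketched as follows. The script name hello.sh is hypothetical, and the sketch assumes a typical umask under which newly created files are not executable and a temporary directory that allows execution.

```shell
#!/usr/bin/env bash
# Write a tiny script into a scratch directory.
demo=$(mktemp -d)
script="$demo/hello.sh"
printf '#!/bin/sh\necho "hello from script"\n' > "$script"

# A freshly created file normally has no execute permission.
if [ -x "$script" ]; then
    before=executable
else
    before=not-executable
fi
echo "before chmod: $before"

# Add execute permission for the file's owner, then run it by path.
chmod u+x "$script"
output=$("$script")
echo "$output"

rm -rf "$demo"
```

`ls -l` on the script before and after the chmod shows the permission bits changing from rw- to rwx for the owner.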
- which: This is a very handy program for determining the absolute path to any executable in your PATH. It can be used to determine if the 'right' version of a program is being used, or, in the case of perl, it can be used to find a perl script to edit or replace it.
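A short sketch of looking up commands by name follows. It uses `command -v`, the shell's built-in counterpart to which, so that it also works on systems where the which program is not installed; the choice of sh and perl as the commands to look up is only for illustration.

```shell
#!/usr/bin/env bash
# Ask the shell where it would find sh when searching PATH.
sh_path=$(command -v sh)
echo "sh resolves to: ${sh_path}"

# The external which program, if present, performs the same search.
if command -v which >/dev/null 2>&1; then
    which sh
fi

# Check which perl would run, if any is in PATH.
command -v perl || echo "perl not found in PATH"
```

If the path printed for perl is not the version you expected, the ordering of directories in PATH is the first thing to check.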
- editors
- Linux: Emacs, vi.
- Mac OSX: same as Linux, but also SubEthaEdit, TextWrangler
- Windows: jEdit is a free one. Most cost money.
- Eclipse: runs on any workstation (Linux, Mac OSX, Windows). Powerful, full featured, a little bloated. Has a perl plugin, with some functionality for interactive debugging, autocompletion, etc. Great for Java.
- playWithShell