Learning the Unix Programming Environment
This is a series of practical tutorials on Unix - mainly GNU/Linux and OSX - in the context of modern programming practice. It is an up-to-date learning supplement to the great UPE book, "The Unix Programming Environment" by Kernighan and Ritchie. In addition to summarizing the outstanding classic and adapting it for current environments, LUPE also gathers a number of valuable practical tips and pointers from experienced programmers.
This Page is Under Development
- 1 First: read UPE
- 2 Background
- 3 Basics
- 4 Setting up the modern unix programming environment
- 5 Understanding the Bourne shell
- 6 Processes and Jobs
- 7 Finding stuff
- 8 Networking
- 9 Links
- 10 Further hints
First: read UPE
Even though LUPE is reasonably self-contained and covers what is most immediately useful to you, reading the original masterpiece - UPE - is highly recommended. Specially the exercises. Reading UPE is the top way of learning Unix at a level most useful to programmers. Raeding it will be the single thing that really makes you understand the tools, the idiosyncratic syntax, and why Unix's highly programmable interface works the way it does. Really understanding the practical tools and the power user side of the OS is what makes a highly efficent programmer and user. With the LUPE supplements you'll leverage that powerful knowledge into the broader context of today's practice of programming and advanced usage.
Open a terminal (by launching the Terminal app). Play some commands and constructs of the Shell (sh) environment language:
List files in the current directory/foder:
A common shortcut for ls is d, though it is not always available. Try it anyways:
In the aliases section we'll set this useful shortcut up in case you don't have it by default.
Also try these
ls -a # "all" - shows entries starting with '.', usually hidden ls -l ls -la
For which a useful short is
Don't worry if you don't have ll - we'll set it up later.
Tell the computer to print "Hello, World":
echo hello, world
Then tell it to print this 10 times using a for loop in the bash language (just type it for now - we'll help you understand the syntax idiosyncrasies later):
for i in `seq 1 10`; do echo hello, world; done
Each Unix command is an executable program, which is usually written in C or can be internally modeled with C's abstraction. As in C, each command has a return value and default/standard input and output handles (stdout/stdin).
The standard input and output can be redirected. Create a text file containing hello, world using the redirect operator
echo hello, world > a.txt
Print out the contents of the file
You can use the redirect operator on any command construct that has an output, even on the for construct above:
for i in `seq 1 10`; do echo hello, world; done > a.txt
Count the number of words in the file
wc -l a.txt
The UPE book masterfully explains why you should prefer that over (try it)
cat a.txt | wc -l
You'll understand what the pipe `|' means later, but it basically connects the default output of a program to the standard input of another.
Gives a numbered list of your previous commands. You can ask for previous command number 17 using
Instead of pressing UP multiple times. Another way is to search history by keyword as you type. Lets say you're trying to reverse search the above for statement. You can do this:
- type Control-R
- type any substring of the desired command, say for
- a command matching that substring is displayed
- press Enter to execute it, or keep pressing Control-R to search for other options
You can also refer to the previous command's 3rd argument on your current command:
wc -l a.txt cat !!:2
Your history is located in your home directory, which can be referred to by '~'
On many setups this will contain the history up to a previous session. The current history is usually not stored in ~/.history until the session is over.
Pressing Control-C usually aborts any command that is running interactively through the shell. Other commands that are not blocking the shell can be ended using kill and killall:
Or kill, which requires a process ID associated to a running app:
ps -A|grep firefox 553 kill 553
If it doesnt go away, use -9 or -KILL
killall -9 firefox kill -KILL 553
A reference to any single command is provided by man:
The standard sections of the manual include:
1 User Commands 2 System Calls 3 C Library Functions 4 Devices and Special Files 5 File Formats and Conventions 6 Games et. Al. 7 Miscellanea 8 System Administration tools and Deamons
You can ask man to look up an entry on a determined manual. If you want the C function, you can skip shell commands by using
man 3 printf
To search where a given keyword, eg exec, appears in all the manuals
The shell is a full-fledged programming language which derives its power from being able to very easily control and combine commands to work together inside the OS environment. You can combine the above commands into a shell script using any text editor. Or you can just type
echo 'ls | wc -l echo thats the number of files in the current folder.' > script.sh
Check whats inside the script.sh file
Run the command
Usually you won't edit using echo, but using more convenient editors. There are even variables you can use. Lets say we put the following into script.sh
#!/bin/sh num_files=`ls | wc -l` echo $num_files is the number of files in the current folder.
You can also make it executable and even remove the extension so it becomes more like a command itself.
mv script.sh script chmod a+x script ./script
More useful and fun stuff to try and understand later
Warning: The following commands may require the appropriate software packages to be installed to your system.
# Convert images convert image.png image.jpg # Convert all images to a PDF convert *.jpg document.pdf # Play videos mplayer movie.mkv # Extract pages from a pdf pdftk old.pdf cat 1-9 26-end output new.pdf # Extract all frames of a video mplayer video.mp4 -vo png -frames 5 # Make a GIF animation from images convert -adjoin -loop 0 -delay 5 *.gif animation.gif
More useful commands are available at Utils-Macambira.
Setting up the modern unix programming environment
Useful script collections
Our personal scripts in current use and actual config files used for editing, programming, and daily use is available at Utils-Macambira.
A nifty collection of little shell utilities, having a portuguese slant to them. They can be found at funcoeszz.net. I personally use:
- zzarrumanome - convert file names to lowercase and spaces to underscore, useful for shell scripting
zzarrumanome * RAMONES - I Don't Care.mp3 -> ramones-i_dont_care.mp3 Toy Dolls - Wakey, Wakey!.mp3 -> toy_dolls-wakey_wakey.mp3
zztradutor pt-de livro # Buch
Configuring Bash for programming
The following are most useful BASH configuration files
~/.bashrc ~/.bash_profile ~/.bash_aliases
The .bash_profile shell script is meant to be executed once per work session - upon your login to the system or upon starting a login shell as in bash -ls. The .bashrc script is executed during each new shell that is started, as when typing bash at the prompt. The .bash_aliases shell script is executed by a suitably configured .bashrc and by convention contains only alias/shortcut definitions. For instance,
alias fox="firefox" # you can now just type fox to start firefox
What should you put in ~/.bashrc and ~/.bash_profile? You should take a look at the existing ones in your system, and copy from more experienced friends. The ones I actually use on a daily basis are available at Utils.
Create your own ~/bin folder to install the personal scripts that you'll be using often:
Move the desired script, say myscript, there:
mv myscript ~/bin
Now change permissions to execute
chmod a+x ~/bin/myscript
Set the path in your ~/.bash_profile
Reload the environment for changes to take effect
bash -ls # start as login shell, reloading .profile # XOR . ~/.bash_profile
You can now run your script without the ./:
You should use the cx shortcut given in UPE for making things more simple.
You can get our actual working versions of cx and other utilities suggested by UPE at Utils (do peek at the source code!).
Understanding the Bourne shell
Think about the internal C implementation model for commands. The commands's main() function sees its arguments through argv - the entire argument line broken into a number of strings. It is the shell that breaks the entire command string as in
"echo hello, world" --- sh ---> "echo" "hello," "world"
into distinct components. Understanding this process is key to understanding how the shell works, specially how to use the different quotes `'" and the escape character '\' .
The program echo can only see two distict "hello" and "world" strings in its argv for the above example. For it to see "hello world" as a unit, quotes can be employed:
echo "hello, world"
The behavior looks identical to that of the original unquoted command, but you can be sure that the actual echo program is now seeing hello world as a single entity. For instance, to print a file that has spaces in the filename, you can do
cat "unix programming.txt" #OR cat 'unix programming.txt'
You can also escape the space, since the shell breaks the argument at blanks.
cat unix\ programming.txt
What's the difference?
- ': the strongest quote, preventing the shell to parse, expand, modify nor split whats inside
- ": allows the shell to expand expressions inside before generating the single, final string
- \: not as convenient and is not guaranteed to generate a single final string, e.g. if shell expansions expand to space
These shell expansions are most used with variables. Lets test this behavior:
a="power nix" # single quotes ' could be used echo "hello, $a world" echo 'hello, $a world' echo hello,\ $a\ world # test: what is argc here?
This illustrates $a which expands the variable a - you can think of $ as an S for "substitute in here the contents of what follows".
And what is that weird backward ` quote for? It executes the string that is inside as a shell command and stores the ouput as a single string. You can think of it as "the output of", and it is extremely useful. Try it out:
seq 1 10 numbers=`seq 1 10` echo $numbers # numbers == "1 2 3 4 5 6 7 8 9 10" numbers="seq 1 10" # numbers == "seq 1 10" literally echo $numbers numbers=$(seq 1 10) # you can use $( ) instead of ` ` echo $numbers
Let us take a look at one of the commands in the intro section to print hello world, say, 3 times, written more nicely:
for i in `seq 1 3`; do echo hello, world done
The for loop simply loops the variable i over each token obtained by the shell when parsing the string after in. For instance,
for i in a b c; do # or for i in 1 2 3 etc echo hello, world done
Would print hello, world 3 times, We're plugging in an independent command, seq, in order to generate the for loop string. By itself,
seq 1 3
simply prints 1 to 3. We can use seq's output as a string by enclosing it in back quotes:
mystring=`seq 1 4` #OR mystring=$(seq 1 4) echo $mystring
Since we know that the shell will break this string at blanks, in case its content were used as part of a commandline. Thus, we just give it to the for loop.
Supose you have a collection of Beethoven's music in a folder, and you want to play a random piano sonata album but no concerto:
cd 'music/classical/Ludwig van Beethoven - Complete Works [Brilliant Classics 100 CD Box]' ls CD 001 - Symphonies Nos.1&3 CD 053 - Piano Sonatas Op.109, Op.110, Op.111 CD 002 - Symphonies Nos.2&7 CD 054 - Piano Variations I ... CD 031 - Violin Sonatas II CD 083 - Scottish Songs WoO 156 & 157, complete CD 032 - Violin Sonatas III CD 084 - Scottish Songs Op. 108 & WoO 158-1 CD 033 - String Trios I CD 085 - Folksongs WoO 158 a b c CD 034 - String Trios II CD 086 - Symphony No. 9 in D minor Op.125 (Furtwagler 1951 EMI) CD 035 - String Quartets Op.18 Nos.1&2 CD 087 - Symphony No. 3, Leonore Overtures - Otto Klemperer CD 036 - String Quartets Op.18 Nos.3&4 CD 088 - Symphonies Nos.5 & 7 - Herbert von KARAJAN CD 037 - String Quartets Op.18 Nos.5&6; Op.95 Serioso CD 089 - Piano Concerto No.5 - Piano Sonatas Nos. 8&23 (Edwin Fischer, Furtwangler) CD 038 - String Quartets Op.59 Nos.1&2 CD 090 - Violin Concerto Op.61, Romances for Violin & Orchestra Nos.1&2 CD 039 - String Quartets Op.74 & Op.131 CD 091 - Piano Sonatas Op.13, 27, 57 - Yves NAT ... # play a random sonata album mplayer "`ls | grep -i Sonatas | grep -i Piano | grep -v Concerto | shuf -n 1`"/* & # vlc is a good alternative to mplayer
Try breaking down the above command to see what its made of. Here, shuf (or gshuf in OSX) just randomly shuffles the lines in its input and ouputs 1 requested line. Note how we had to use both " and ` quotes to get this to work with filenames having spaces in them.
Processes and Jobs
Running a program interactively normally blocks further input
command # input is blocked
Type Control-C to stop the process in case it is still running. If the process is waiting for input, type Control-D to signal end of input (end of file) and exit.
Run a command in the background ("in parallel")
command & # the ampersand & is like "run command & whatever I type next"
Bring it back to foreground
# type Control-Z bg
Run multiple commands in the background ("in parallel")
command1 & command2 & command3 &
List the running jobs spawned by the current shell
Each job has a job id and each process has a process id. Make job number 2 foreground, then background again:
fg 2 bg
Stop command2 by name
Look into all running processes and respective resources
List all running processes with pids
List all running processes named command3
ps -A | grep command3
List jobs by pid
jobs -p # outputs PID 2143
Wait for all jobs to finish
Run an app with low priority
nice firefox &
Keep running program on the computer even after you log out:
nohup command & #OR nohup ls -lR / &
When you log back in (say, via ssh), there will be a file called nohup.out with the output of the process.
You can also use GNU screen for persistent terminal sessions.
Basics / Overview / Practicalities
There are ways to search for a file with a specific name, or a file having a specific content. We will be dealing with the latter mostly for text files. Lets take a look at some useful commands and try to understand them.
Case 1: search for a file myfilename with a specific name.
find . -name myfilename #OR locate myfilename
The locate version simply uses indexed seach which is faster than a find. To index your hard drive for locate, you usualy perform the following command:
or something similar, depending on the specific OS. We'll be dealing mostly with GNU locate, which is more feature-rich (albeit a bit slower) than BSD locate. You can use pattern matching (explained in the next section) to search for substrings of the name, etc. While find normally uses file regex or wildcards, locate is easy to use with full regular expression matching (see next section).
find . -iname '*widget*.java' # -iname means case insensitive # Alternative form using locate locate -ri 'widget.*\.java$' # you'll understand the weird .* and $ later # Search for everything locate -ri 'jddl.*/widget.*\.java$' # Search only in a folder having your username as a substring locate -ri 'username.*/.*widget.*java$'
Case 2: search for a file containing a specific string UpDirection
# Find all .cpp files containing the string UpDir inside them (and not a superstring like UpDirection) find . -name '*.cpp' | xargs grep '\<UpDirection\>' # Alternative form using locate locate -r '\.cpp$' | xargs grep '\<UpDirection\>'
I use an lsc script from Utils which includes all the above inside it
lsc -r # lists all source files recursively (*.c *.java *.cxx *.cpp *.php..) lsc -r | grep scilab | xargs grep --color '\<k\>' # lists all source files with # scilab in the filename having the variable / token k
One of the best options for grep in the context of code searching is '--color' as used above. You can also set it by default by including the following in your ~/.bashrc
export GREP_OPTIONS='--color=auto' # inside your .bashrc
More about grep in the following subsection.
Grep, sed, awk, regular expressions
- Convert decimal to hex
bc # start the calculator language interpreter ibase=10 obase=16 40 # output: 28
See also .
Log onto another machine in your local network:
Execute arbitrary commands in it interactively
ls uname -v
Execute one command on the remote machine non-interactively
ssh firstname.lastname@example.org who # tells you who is logged in the remote machine
Copy a file onto the machine
echo testing >a.txt # creates the file scp a.txt 10.0.0.60: # scp /local_path/file username@remote:/remote_path
Automated transfer session with sftp
sftp usrname@hostname << END <password> put filename1 put filename2 get filename3 bye END
Recursively copy an entire file tree to the remote host, incrementally
rsync -avz folder 10.0.0.60:
If this transfer is aborted, you can restart it and it will transfer only the remaining (the differences). The "z" parameter is to compress prior to transferring. You can omit it for fast networks. Rsync very useful for backups.
Open the remote computer's browser on your own local screen
ssh -X email@example.com firefox # or any other program that needs a GUI
PS: you might have to configure X support, ask a local technicial or Google.
Control the remote screen by opening the browser in its screen, which usually referred to with an ID of :0.
export DISPLAY=:0 # on the remote computer firefox
For fun: you can also control the remote terminal, in case there is any one open
ls /dev/pts # you get a list of numbers. test in each of them: echo hello /dev/pts/1 echo hello /dev/pts/2 echo hello /dev/pts/3 # bingo! the user on this remote host is using terminal 3 and received our message
You will understand how this works later, but UPE is the best place to get a hold of whats going on.
Public key (passwordless) authentication
Generate a public key
ssh-keygen -t rsa -C "firstname.lastname@example.org"
Add the key to your ssh-agent so you don't have to type the password all the time
# start the ssh-agent in the background eval "$(ssh-agent -s)" ssh-add ~/.ssh/id_rsa
Test it by communicatibg between two machines: machine 1 (eg, the one where you generated the key, the 'local' machine) ssh-ing into machine 2 (a 'remote' machine).
scp ~/.ssh/id_rsa.pub maquina2:.ssh/
See also the Github article on ssh keys .
Discover other hosts on your LAN
ping -b 10.0.0.0 # broadcast Ping on your local network address
- Download the sourcecode of this tutorial at github.com/rfabbri/lupe
- Lista de Comandos Básicos GNU/Linux
- Patterns to match all files except one (or a pattern)
- Use A |& B instead of A | B To filter both stdout and stderr of program A into stdin of B