Learning the Unix Programming Environment

From Pontão Nós Digitais
Revision as of 20:07, 2 October 2014


This is a series of practical tutorials on Unix - mainly GNU/Linux and OS X - in the context of modern programming practice. It is an up-to-date learning supplement to the great UPE book, "The Unix Programming Environment" by Kernighan and Pike. In addition to summarizing the outstanding classic and adapting it for current environments, LUPE also gathers a number of valuable practical tips and pointers from experienced programmers.

This Page is Under Development


First: read UPE

Even though LUPE is reasonably self-contained and covers what is most immediately useful to you, reading the original masterpiece - UPE - is highly recommended, especially the exercises. Reading UPE is the top way of learning Unix at a level most useful to programmers. Reading it will be the single thing that really makes you understand the tools, the idiosyncratic syntax, and why Unix's highly programmable interface works the way it does. Really understanding the practical tools and the power-user side of the OS is what makes a highly efficient programmer and user. With the LUPE supplements you'll leverage that powerful knowledge into the broader context of today's practice of programming and advanced usage.


Background

Basics

Open a terminal (by launching the Terminal app). Try out some commands and constructs of the Shell (sh) environment language:

List files in the current directory/folder:

ls

A common shortcut for ls is d, though it is not always available. Try it anyways:

d

In the aliases section we'll set this useful shortcut up in case you don't have it by default.

Also try these

ls -a                                 # "all" - shows entries starting with '.', usually hidden
ls -l
ls -la

A useful shortcut for the last one is

ll

Don't worry if you don't have ll - we'll set it up later.

Tell the computer to print "Hello, World":

echo hello, world

Then tell it to print this 10 times using a for loop in the bash language (just type it for now - we'll help you understand the syntax idiosyncrasies later):

for i in `seq 1 10`; do echo hello, world; done 

Each Unix command is an executable program, which is usually written in C or can be internally modeled with C's abstraction. As in C, each command has a return value and standard input and output handles (stdin/stdout).
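
The return value mentioned above is available in the shell as the special variable $?, right after a command finishes. A minimal sketch, using the standard true and false commands (which do nothing except succeed or fail); the variable names are only illustrative:

```shell
# By convention, 0 means success and any other value means failure.
true
status_true=$?
echo "true returned $status_true"

# 'false' always fails; '||' runs its right-hand side only after a failure.
false || status_false=$?
echo "false returned $status_false"
```

The same $? check works after any command, which is what lets scripts react to errors.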

The standard input and output can be redirected. Create a text file containing hello, world using the redirect operator

echo hello, world > a.txt

Print out the contents of the file

cat a.txt

You can use the redirect operator on any command construct that has an output, even on the for construct above:

for i in `seq 1 10`; do echo hello, world; done > a.txt

Count the number of lines in the file

wc -l a.txt

The UPE book masterfully explains why you should prefer that over (try it)

cat a.txt | wc -l 

You'll understand what the pipe `|' means later, but it basically connects the standard output of a program to the standard input of another.
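
As a small sketch of the redirection and pipe operators working together (the file name lines.txt and the temporary directory are made up for this example, and the input redirect `<' is the counterpart of `>' shown above):

```shell
# Work in a throwaway directory so no existing file gets overwritten.
workdir=$(mktemp -d)

# '>' sends the loop's standard output to a file instead of the terminal.
for i in 1 2 3; do echo hello, world; done > "$workdir/lines.txt"

# '<' feeds the file to wc's standard input; wc then prints only the count.
from_redirect=$(wc -l < "$workdir/lines.txt")

# '|' connects cat's standard output to wc's standard input.
from_pipe=$(cat "$workdir/lines.txt" | wc -l)

echo redirect: $from_redirect, pipe: $from_pipe
rm -r "$workdir"
```

Both forms give wc the same data on its standard input; the pipe version simply starts an extra cat process to do it.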

Command History

history

Gives a numbered list of your previous commands. You can ask for previous command number 17 using

!17

Instead of pressing UP multiple times. Another way is to search history by keyword as you type. Let's say you're trying to reverse search the above for statement. You can do this:

  1. type Control-R
  2. type any substring of the desired command, say for
  3. a command matching that substring is displayed
  4. press Enter to execute it, or keep pressing Control-R to search for other options

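You can also refer to a word of the previous command on your current command. Words are counted from 0 (the command name), so !!:2 is the previous command's second argument, here a.txt: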

wc -l a.txt
cat !!:2

Bash stores your history in your home directory, which can be referred to by '~'. The default file is ~/.bash_history (the exact location is given by the HISTFILE variable):

cat ~/.bash_history

On many setups this will contain the history up to a previous session. The current history is usually not written to the file until the session is over.

Further Basics

Pressing Control-C usually aborts any command that is running interactively through the shell. Other commands that are not blocking the shell can be ended using kill and killall:

killall firefox

Or kill, which requires a process ID associated to a running app:

ps -A|grep firefox
   553
kill 553

If it doesn't go away, use -9 or -KILL

killall -9 firefox
kill -KILL 553
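
A self-contained way to try kill without touching a real application is to start a harmless sleep in the background and terminate it. This is only a sketch; $! (the process ID of the last background command) and wait are standard shell features that the text above does not cover:

```shell
# Start a long-running command in the background; '&' returns control at once.
sleep 60 &
pid=$!                      # $! holds the process ID of that background job
echo "started sleep with PID $pid"

# Ask the process to terminate (kill sends the TERM signal by default).
kill "$pid"

# wait returns the process's exit status; for a process ended by a signal
# this is non-zero, so the right-hand side records it.
wait "$pid" || status=$?
echo "sleep ended with status $status"
```

Using $! avoids having to look the process ID up with ps and grep.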

A reference to any single command is provided by man:

man bc

The standard sections of the manual include:

 1      User Commands
 2      System Calls
 3      C Library Functions
 4      Devices and Special Files
 5      File Formats and Conventions
 6      Games et. Al.
 7      Miscellanea
 8      System Administration tools and Deamons

You can ask man to look up an entry in a specific section of the manual. If you want the C function, you can skip the shell command of the same name by using

man 3 printf

To search where a given keyword, e.g. exec, appears in all the manuals

apropos exec


Scripting

The shell is a full-fledged programming language which derives its power from being able to very easily control and combine commands to work together inside the OS environment. You can combine the above commands into a shell script using any text editor. Or you can just type

echo 'ls | wc -l
      echo thats the number of files in the current folder.' > script.sh

Check what's inside the script.sh file

 cat script.sh

Run the script

 sh script.sh

Usually you won't edit using echo, but using more convenient editors. There are even variables you can use. Let's say we put the following into script.sh <syntaxhighlight lang="bash">

#!/bin/sh
num_files=`ls | wc -l`
echo $num_files is the number of files in the current folder.

</syntaxhighlight>

You can also make it executable and even remove the extension so it becomes more like a command itself.

mv script.sh script
chmod a+x script
./script
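
The same steps can be sketched end to end in a temporary directory, so nothing in your current folder is touched. The script name countfiles and the three sample files are made up for this illustration, and the here-document (<<'EOF') is just one convenient way to write a multi-line file from the shell:

```shell
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.txt"

# Quoting 'EOF' keeps $num_files and the back quotes from being
# expanded while the file is being written.
cat > "$workdir/countfiles" <<'EOF'
#!/bin/sh
num_files=`ls | wc -l`
echo $num_files is the number of files in the current folder.
EOF

chmod a+x "$workdir/countfiles"

# Run it from inside the directory; the parentheses keep the cd local.
(cd "$workdir" && ./countfiles)
```

Note that the script counts itself too, since it lives in the folder it is listing.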

More useful and fun stuff to try and understand later

Warning: The following commands may require the appropriate software packages to be installed on your system. <syntaxhighlight lang="bash">

# Convert images
convert image.png image.jpg
# Convert all images to a PDF
convert *.jpg document.pdf
# Play videos
mplayer movie.mkv
# Extract pages from a pdf
pdftk old.pdf cat 1-9 26-end output new.pdf

# Extract the first 5 frames of a video as PNG images
mplayer video.mp4 -vo png -frames 5

# Make a GIF animation from images
convert  -adjoin  -loop 0  -delay 5  *.gif  animation.gif

</syntaxhighlight> More useful commands are available at Utils-Macambira.


Setting up the modern unix programming environment

Useful script collections

Our Utilities

Our personal scripts in current use and the actual config files we use for editing, programming, and daily work are available at Utils-Macambira.


Funcoeszz

A nifty collection of little shell utilities with a Portuguese slant to them. They can be found at funcoeszz.net. I personally use:

  • zzarrumanome - convert file names to lowercase and spaces to underscore, useful for shell scripting
zzarrumanome *
  RAMONES - I Don't Care.mp3 -> ramones-i_dont_care.mp3
  Toy Dolls - Wakey, Wakey!.mp3 -> toy_dolls-wakey_wakey.mp3
  • zztradutor
zztradutor pt-de livro                # Buch


Configuring Bash for Programming

The most useful Bash configuration files are the following:

 ~/.bashrc
 ~/.bash_profile
 ~/.bash_aliases

The .bash_profile shell script is meant to be executed once per work session: upon your login to the system or upon starting a login shell as in bash -ls. The .bashrc script is executed by each new interactive non-login shell, as when typing bash at the prompt. The .bash_aliases shell script is executed by a suitably configured .bashrc and by convention contains only alias/shortcut definitions. For instance,

alias fox="firefox"              # you can now just type fox to start firefox
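
As a sketch of how these files might fit together (the exact contents are an assumption, not the configuration from Utils), ~/.bash_aliases could hold the d and ll shortcuts used in the Basics section, and ~/.bashrc could load that file when it exists:

```shell
# --- lines one might keep in ~/.bash_aliases (aliases only, by convention) ---
alias d='ls'
alias ll='ls -la'

# --- lines a "suitably configured" ~/.bashrc could use to load that file ---
if [ -f "$HOME/.bash_aliases" ]; then
    . "$HOME/.bash_aliases"
fi
```

Typing alias with no arguments lists every shortcut currently defined, which is a quick way to confirm the file was read.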

What should you put in ~/.bashrc and ~/.bash_profile? You should take a look at the existing ones in your system, and copy from more experienced friends. The ones I actually use on a daily basis are available at Utils.

~/bin

Create your own ~/bin folder to install the personal scripts that you'll be using often:

mkdir ~/bin

Move the desired script, say myscript, there:

mv myscript ~/bin

Now make it executable

chmod a+x ~/bin/myscript

Set the path in your ~/.bash_profile

export PATH=$HOME/bin:$PATH

Reload the environment for changes to take effect

bash -ls                 # start a login shell, which reads .bash_profile
                   # OR
. ~/.bash_profile        # re-read the file in the current shell

You can now run your script without the ./:

myscript

You should use the cx shortcut given in UPE to make this simpler.

cx ~/bin/myscript

You can get our actual working versions of cx and other utilities suggested by UPE at Utils (do peek at the source code!).
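
For reference, a cx-like helper can be as small as a single chmod call. The shell function below is a minimal sketch written for this page, not the version from Utils; the demo file is a throwaway created just for the example:

```shell
# cx: make every file given as an argument executable.
# "$@" passes each file name on intact, even if it contains spaces.
cx() {
    chmod +x "$@"
}

# Try it on a throwaway file.
demo=$(mktemp)
cx "$demo"
ls -l "$demo"
```

Saved as a one-line script in ~/bin instead of a function, the same chmod call would be available in every shell without touching .bashrc.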

Understanding the Bourne Shell

Think about the internal C implementation model for commands. A command's main() function sees its arguments through argv: the entire argument line broken into a number of strings. It is the shell that breaks the entire command string as in

 "echo hello, world"      --- sh --->      "echo" "hello," "world"

into distinct components. Understanding this process is key to understanding how the shell works, especially how to use the different quotes `'" and the escape character '\' .

The program echo sees two distinct argument strings, "hello," and "world", in its argv for the above example. For it to see "hello, world" as a unit, quotes can be employed:

echo "hello, world"

The behavior looks identical to that of the original unquoted command, but you can be sure that the actual echo program is now seeing hello, world as a single entity. For instance, to print a file that has spaces in its name, you can do

cat "unix programming.txt"
                                 #OR
cat 'unix programming.txt'

You can also escape the space, since the shell breaks the argument at blanks.

cat unix\ programming.txt

What's the difference?

  • ': the strongest quote; it prevents the shell from parsing, expanding, modifying, or splitting what's inside
  • ": allows the shell to expand expressions inside before generating the single, final string
  • \: escapes only the character that follows it; it is less convenient and is not guaranteed to produce a single final string, e.g. if an unquoted shell expansion contains spaces

These shell expansions are most used with variables. Let's test this behavior:

a="power nix"                        # single quotes ' could be used
echo "hello, $a world"
echo 'hello, $a world'
echo hello,\ $a\ world               # test: what is argc here?

This illustrates $a which expands the variable a - you can think of $ as an S for "substitute in here the contents of what follows".
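
To check how many arguments a command actually receives, a tiny helper function can report its own argument count. count_args is a hypothetical name introduced here; $# is the shell's counterpart of argc, not counting the command name itself:

```shell
# Print the number of arguments the shell passed in.
count_args() {
    echo "$#"
}

a="power nix"
count_args "hello, $a world"     # double quotes: expansion happens, no splitting
count_args 'hello, $a world'     # single quotes: no expansion at all
count_args hello,\ $a\ world     # escaped blanks, but $a itself is unquoted
count_args hello, $a world       # nothing quoted
```

Comparing the four counts shows where the shell splits: only blanks that are neither quoted nor escaped, including those that come out of an unquoted expansion, separate arguments.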

And what is that weird backward ` quote for? It executes the string that is inside as a shell command and stores the output as a single string. You can think of it as "the output of", and it is extremely useful. Try it out:

 seq 1 10
 numbers=`seq 1 10`
 echo $numbers         # numbers == "1 2 3 4 5 6 7 8 9 10"
 numbers="seq 1 10"    # numbers == "seq 1 10"   literally
 echo $numbers

Let us take a look at one of the commands in the intro section to print hello world, say, 3 times, written more nicely: <syntaxhighlight lang="bash">

for i in `seq 1 3`; do 
   echo hello, world
done 

</syntaxhighlight> The for loop simply loops the variable i over each token obtained by the shell when parsing the string after in. For instance, <syntaxhighlight lang="bash">

for i in a b c; do              # or for i in 1 2 3   etc
   echo hello, world
done 

</syntaxhighlight> This prints hello, world 3 times. We're plugging in an independent command, seq, in order to generate the for loop string. By itself,

seq 1 3

simply prints 1 to 3. We can use seq's output as a string by enclosing it in back quotes:

mystring=`seq 1 4`
                                #OR
mystring=$(seq 1 4) 
echo $mystring

We know that the shell will break this string at blanks whenever its content is used as part of a command line, so we can simply give it to the for loop.
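
So far the loop variable i has not been used inside the loop body. A short sketch that does use it, with the $(...) form of command substitution; the variable name greetings and the message text are made up for this example:

```shell
# The inner $(seq 1 3) supplies the tokens for the loop; the outer $(...)
# captures everything the loop prints so it can be inspected afterwards.
greetings=$(for i in $(seq 1 3); do echo "hello, world number $i"; done)

# Quoting "$greetings" keeps the line breaks; unquoted, the shell would
# split the text at blanks and echo would join it into one line.
echo "$greetings"
```

Unlike back quotes, the $(...) form nests without any escaping, which is why it is used here.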


Links