
Your terminal is not a terminal: An Introduction to Streams

 5 years ago
source link: https://www.tuicool.com/articles/hit/NfqAZrF

I love streams because I don’t like software.

I always try to build less software. Less software means you have to spend less time updating it, less time fixing it, and less time thinking about it. The only thing better than “less software” is no software at all.

Streams help us write less software because they allow programs to communicate with each other.

If programs cannot communicate, each one must pack in too many features to satisfy its users’ needs, which means more software. By enabling inter-process communication, streams encourage software to be smaller and can sometimes even prevent software from being written.

Learning about streams helps you understand UNIX systems a bit better and keep your development environment simple.

What streams are

Streams are just that: streams. In the same way that a river has a stream of water, programs have streams of data. Moreover, just like you can use steel pipes to carry water from one place to another, you can use UNIX pipes to carry data from one program to another. This was the very analogy that inspired the design of streams:

We should have some ways of connecting programs like a garden hose — screw in another segment when it becomes necessary to massage data in another way. This is the way of I/O also. — Douglas McIlroy

Streams can be used to pass data into programs and to get data out of them.

program-input-output-streams.png

In UNIX, programs get some streams attached to them by default, both for input and output. We call these standard streams.

There are three different standard streams:

  • stdin or standard input is the stream which feeds your program with data
  • stdout or standard output is the stream your program writes its main output to
  • stderr or standard error is the stream your program writes its error messages to

The program fortune, for example, writes some pieces of wisdom to the stdout stream.

$ fortune
It is simplicity that is difficult to make
-- Bertold Brecht

When fortune ran, it got stdin, stdout and stderr attached to it. Since it didn’t produce any errors and didn’t get any external input, it just wrote its output to stdout.

fortune-streams.png

cowsay is another program that writes to stdout. cowsay takes a string and shows a cow saying it.

$ cowsay "Brazil has a decent president"
 _______________________________
< Brazil has a decent president >
 -------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Unlike fortune, cowsay doesn’t necessarily say smart things, as we’ve just seen. Thankfully, we can feed cowsay through the stdin stream attached to it.

All we need to do to make cowsay smarter and repeat fortune’s quotes is to use something we call a pipe, represented by |, to attach fortune’s stdout to cowsay’s stdin.

$ fortune | cowsay
 _________________________________________
/ A language that doesn't have everything \
| is actually easier to program in than   |
| some that do.                           |
|                                         |
\ -- Dennis M. Ritchie                    /
 -----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

We use pipes for connecting the output stream of a program to the input stream of another.
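
The same idea works with any programs that read stdin and write stdout. As a small sketch using only standard tools (the fruit names are made-up sample data), each | below attaches one program's stdout to the next program's stdin:

```shell
# printf writes three lines to stdout; sort reads them on stdin and writes
# them back in order; head keeps only the first line it receives.
first=$(printf 'banana\napple\ncherry\n' | sort | head -n 1)
echo "$first"   # prints "apple"
```

None of these three programs knows about the others. They only read and write their own standard streams.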

fortune-and-cowsay-streams.png

You can see the output of cowsay on your screen because, by default, your terminal gets the stdin, stdout and stderr standard streams attached to it.

Data goes in through stdout and stderr and comes out at the other end: your monitor. Similarly, your keyboard input goes through stdin to a program.

standard-streams-devices.png

Source: Wikipedia

The cat program, for example, uses stdin to receive input from your keyboard and stdout to send it back out:

$ cat
Everything I write before pressing Enter
Everything I write before pressing Enter
Gets logged right after
Gets logged right after

cat-keyboard-and-display.png

We can make it more elaborate by using sed to replace the first occurrence of I on each line with We every time we press Enter:

$ cat | sed -E "s/I/We/"
I think streams are quite cool.
We think streams are quite cool.

sed-cat-and-keyboard-and-monitor.png

Also, in case you didn’t know, sed stands for stream editor.
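
One detail worth knowing about sed's s command: without the g flag it replaces only the first match on each line. A quick sketch with a made-up sentence that contains two matches shows the difference:

```shell
# Without g, only the first I on the line is replaced.
first_only=$(printf 'I think I like streams.\n' | sed 's/I/We/')
# With g, every I on the line is replaced.
every=$(printf 'I think I like streams.\n' | sed 's/I/We/g')

echo "$first_only"   # prints "We think I like streams."
echo "$every"        # prints "We think We like streams."
```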

How streams talk to your “terminal”

Many websites linked to the last blog post I’ve written. In the comments section of one of them, someone pointed out that I wasn’t really using a terminal.

They were completely right in their not-pedant-at-all comment. However, here is a picture of me in 1978 — a bit before I was born — using an HP 2647A serial terminal:

old-serial-terminal.jpg

If you are not a hardcore time-traveller like me, what you use is just a terminal emulator. Who could’ve guessed, right?

Terminal emulators are software simulations of “real” terminals. These emulators provide you with an interface to interact with the Linux TTY driver. The TTY driver is responsible for handling the data to and from programs.

terminal-interaction-diagram.png

Each TTY has its own stdin, stdout, and stderr streams connected to it. These are the streams provided to programs for them to read from (stdin) and write to (stdout and stderr).

Here is a more accurate version of what happened when you ran cat | sed -E "s/I/We/" in the last example:

tty-and-processes.png

Like everything in UNIX, the tty is a file. Each instance of a terminal emulator has a different tty file associated with it. Because each emulator reads from and writes to a different file, you don’t see the output of the programs you run in all the windows you have open.

To find out which tty is associated with a terminal window you can use the tty command.

multiple-terminals-multiple-ttys.png

When you open a new terminal window, this is what its streams point to:

empty-terminal-streams.png

In the image above, /dev/ttys005 is just an example. It could’ve been any other file, as there will be a new one for each tty instance.
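
A program can check for itself whether one of its streams points at a tty. The sketch below uses the standard test -t check on descriptor 1 (stdout). The function name describe_stdout is made up for illustration:

```shell
# Report what kind of file stdout is attached to.
describe_stdout() {
  if [ -t 1 ]; then
    echo "terminal"
  else
    echo "not a terminal"
  fi
}

# Called directly, this prints "terminal" when stdout is still your tty.
describe_stdout

# Inside $(...), stdout is a pipe back to the shell, so the answer changes.
captured=$(describe_stdout)
echo "inside command substitution: $captured"
```

This is how some programs decide to print colours in a terminal window but plain text when their output goes somewhere else.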

Redirection

To write the output of a program to a file instead of the tty, you can direct the stdout stream somewhere else.

In the example below, we write the contents of the / directory to the file root_content.txt in the /tmp folder. We do this by using the > operator, prefixed here with 1 to name the stdout stream explicitly.

$ ls / 1> /tmp/root_content.txt

To check what is inside /tmp/root_content.txt you can now use cat:

$ cat /tmp/root_content.txt
Applications
Library
Network
System
Users
Volumes
bin
cores
dev
etc
home
net
private
sbin
themes
tmp
usr
var

Unlike what it would usually do if you had just used ls /, the ls command didn’t write anything to your terminal. Instead of writing to the /dev/tty file that your terminal emulator reads from, it has written to /tmp/root_content.txt.

stream-redirecting-stdout-to-file.png

We can achieve the same redirection effect by using > instead of 1>.

$ ls / > /tmp/root_content.txt

Omitting the prefixed number works because > redirects stdout by default. The number in front of > indicates which stream we want to redirect, and 1 is the file descriptor for stdout.

Since the tty is just a file, you can also redirect a stdout stream from one terminal to another.

cowsay-from-one-to-another.png

If we wanted to redirect the stderr stream, we could prefix its file descriptor, which is 2, to >.

$ cat /this/path/does/not/exist 2> /tmp/cat_error.txt

stream-redirecting-stderr-to-file.png

Now /tmp/cat_error.txt contains whatever cat has written to stderr.

$ cat /tmp/cat_error.txt
cat: /this/path/does/not/exist: No such file or directory
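
Both output streams can be redirected in the same command, each to its own file. The sketch below uses a made-up function called report, which writes one line to each stream, and a temporary directory so it does not touch any real files:

```shell
# Hypothetical helper that writes one line to each output stream.
report() {
  echo "all good"              # goes to stdout (descriptor 1)
  echo "something broke" >&2   # goes to stderr (descriptor 2)
}

dir=$(mktemp -d)
report > "$dir/out.txt" 2> "$dir/err.txt"

cat "$dir/out.txt"   # prints "all good"
cat "$dir/err.txt"   # prints "something broke"
```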

For redirecting both stdout and stderr we can use &>.

$ cat /does/not/exist /tmp/root_content.txt &> /tmp/two_streams.txt

Now /tmp/two_streams.txt will contain what has been written to both stdout and stderr.

$ cat /tmp/two_streams.txt
cat: /does/not/exist: No such file or directory
Applications
Library
Network
System
Users
Volumes
bin
cores
dev
etc
home
installer.failurerequests
net
private
sbin
themes
tmp
usr
var

redirecting-stdout-and-stderr.png
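
The &> shorthand is available in shells such as bash and zsh. A form that also works in a plain POSIX sh is to redirect stdout to the file and then make stderr a copy of stdout with 2>&1. A minimal sketch, again with a made-up report function:

```shell
# Hypothetical helper that writes one line to each output stream.
report() { echo "to stdout"; echo "to stderr" >&2; }

both=$(mktemp)
# First point stdout at the file, then make stderr a copy of stdout.
report > "$both" 2>&1

cat "$both"
```

The order matters here. Writing 2>&1 before the > would copy stdout while it still points at the terminal, so only stdout would end up in the file.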

You must be careful when writing to a file with >. Using a single > overwrites the contents of a file.

$ printf "Look, I have something inside" > /tmp/careful.txt

$ cat /tmp/careful.txt
Look, I have something inside

$ printf "Now I have something else" > /tmp/careful.txt

$ cat /tmp/careful.txt
Now I have something else

To append to a file instead of overwriting its contents, you must use >>.

$ printf "Look, I have something inside" > /tmp/careful.txt

$ cat /tmp/careful.txt
Look, I have something inside

$ printf "\nNow I have one more thing" >> /tmp/careful.txt

$ cat /tmp/careful.txt
Look, I have something inside
Now I have one more thing
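
If you would rather have the shell stop you from overwriting a file by accident, the standard noclobber option (set -C) makes > fail when the target already exists. This is an optional safeguard, sketched here on a temporary file:

```shell
careful=$(mktemp)
printf 'Look, I have something inside\n' > "$careful"

# set -C (noclobber) makes > refuse to overwrite an existing regular file.
# It runs in a subshell so the option does not leak into the rest of
# the session.
if ( set -C; printf 'Now I have something else\n' > "$careful" ) 2>/dev/null; then
  echo "file was overwritten"
else
  echo "overwrite refused"
fi

cat "$careful"   # still prints "Look, I have something inside"
```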

For reading from stdin, we can use the < operator.

The following command uses the stdin stream to feed sed with the contents of /usr/share/dict/words. sed then selects a random line and writes it to stdout.

$ sed -n "${RANDOM}p" < /usr/share/dict/words
alloestropha

Since stdin’s file descriptor is 0, we can achieve the same effect by prefixing it to <.

$ sed -n "${RANDOM}p" 0< /usr/share/dict/words
pentameter

It’s also important to notice the difference between using redirection operators and pipes. When using pipes, we attach a program’s stdout to another program’s stdin. When using redirection, we change the location a specific stream points to when starting a program.
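
To make the difference concrete, here is a small sketch that feeds the same data to sort in both ways. The file is a temporary one filled with three sample words:

```shell
words=$(mktemp)
printf 'pentameter\ndactylic\nalloestropha\n' > "$words"

# Redirection: the shell opens the file and hands it to sort as stdin.
via_redirect=$(sort < "$words")

# Pipe: cat's stdout is attached to sort's stdin, so a second program
# is involved and sort never touches the file itself.
via_pipe=$(cat "$words" | sort)

[ "$via_redirect" = "$via_pipe" ] && echo "same output"
```

The output is identical, but the redirection starts one process while the pipe starts two.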

Since streams are just file descriptors, we can create more of them whenever we need to, up to the limit the system imposes. For that, we can use exec to open files on specific file descriptors.

In the example below, we open /usr/share/dict/words for reading on the descriptor 3.

$ exec 3< /usr/share/dict/words

extra-file-descriptor.png

Now we can use this descriptor as the stdin for a program by using <&.

$ sed -n "${RANDOM}p" 0<&3
dactylic

What the <& operator does in the above example is duplicate the file descriptor 3 and make 0 (stdin) a copy of it.

copying-extra-file-descriptor-to-stdin.png

Once you have opened a file descriptor for reading, you can only “consume” it once: the descriptor remembers how far it has been read, and sed has already read it to the end. That is why trying to use 3 again won’t work:

$ grep dactylic 0<&3

To close a file descriptor we can use - as if we were copying it to the file descriptor we want to close.

$ exec 3<&-
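
The whole life cycle of an extra descriptor can be sketched on a small temporary file. The file contents and variable names are made up for illustration. Each read picks up where the previous one stopped, because the descriptor keeps a single position in the file:

```shell
f=$(mktemp)
printf 'one\ntwo\nthree\n' > "$f"

exec 3< "$f"          # open the file for reading on descriptor 3

read -r first <&3     # consumes the first line only
rest=$(cat <&3)       # consumes everything that is left
again=$(cat <&3)      # nothing left: the descriptor is at the end

exec 3<&-             # close descriptor 3

echo "first: $first"  # prints "first: one"
echo "rest: $rest"    # prints "rest: two" and then "three"
echo "again: [$again]" # prints "again: []"
```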

Just like we can use < to open a file for reading, we can use > to open a file for writing.

In the example below we create a file called /tmp/output.txt and open it for writing on descriptor 4:

$ touch /tmp/output.txt
$ exec 4> /tmp/output.txt

extra-writing-descriptor.png

Now if we want cowsay to write to the /tmp/output.txt file, we can duplicate descriptor 4 and copy it to 1 (stdout):

$ cowsay "Does this work?" 1>&4

$ cat /tmp/output.txt
 _________________
< Does this work? >
 -----------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

cowsay-writing-to-duplicate-descriptor.png
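
A descriptor opened for writing stays open across commands, so several commands can write through it one after another. A minimal sketch, using a temporary file and echo in place of cowsay:

```shell
out=$(mktemp)

exec 4> "$out"          # open the file for writing on descriptor 4
echo "first line" >&4   # stdout of this command becomes a copy of 4
echo "second line" >&4  # continues where the previous write stopped
exec 4>&-               # close descriptor 4

cat "$out"
```

Because both commands share the descriptor and its position in the file, the second write lands after the first instead of overwriting it.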

As you might guess, to open a file for both reading and writing you can use <>. First let’s create a file called /tmp/lines.txt and open a read/write descriptor for it on 5.

$ touch /tmp/lines.txt
$ exec 5<> /tmp/lines.txt

extra-read-and-write-descriptor.png

In the example below, we copy the first 3 lines of /usr/share/dict/propernames to /tmp/lines.txt.

$ head -n 3 /usr/share/dict/propernames 1>&5
$ cat /tmp/lines.txt
Aaron
Adam
Adlai

Notice that if we tried to read from 5 with cat we would not get any output: writing advanced the descriptor’s position in the file, so 5 is now at its end.

$ cat 0<&5

We can solve this by closing 5 and reopening it.

$ exec 5<&-
$ exec 5<> /tmp/lines.txt
$ cat 0<&5
Aaron
Adam
Adlai
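
The same sequence can be reproduced without the dictionary file. This sketch writes two sample names through a read/write descriptor on a temporary file and captures what each read returns:

```shell
f=$(mktemp)

exec 5<> "$f"                  # open for reading and writing on descriptor 5
printf 'Aaron\nAdam\n' >&5     # writing moves the position to the end
at_end=$(cat <&5)              # reads nothing: we are already at the end

exec 5<&-                      # close descriptor 5...
exec 5<> "$f"                  # ...and reopen it, back at the start
from_start=$(cat <&5)          # now the whole file is read
exec 5<&-

echo "before reopening: [$at_end]"   # prints "before reopening: []"
echo "$from_start"                   # prints "Aaron" and then "Adam"
```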

Postscriptum

On generating random numbers

In the examples above, I have used $RANDOM to generate random numbers and pass them to sed in order to select random lines from the /usr/share/dict/words file.

You might have noticed that this usually gives you words starting with a, b or c. That’s because RANDOM is a 15-bit number and therefore can only go from 0 to 32,767.

The /usr/share/dict/words file has 235,886 lines.

$ wc -l /usr/share/dict/words
235886 /usr/share/dict/words

Since the biggest possible number generated by RANDOM is approximately 7 times smaller than the number of lines in /usr/share/dict/words, it is not suitable for selecting random words from it. In this post, it was used merely for the sake of simplicity.
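
If you do want every line to have the same chance of being picked, one alternative (not what the examples above use) is to let awk choose while it reads the stream. The sketch below keeps line number NR with probability 1/NR, a technique known as reservoir sampling. It runs on a small temporary file of sample words:

```shell
words=$(mktemp)
printf 'alloestropha\npentameter\ndactylic\nzymurgy\n' > "$words"

# srand() seeds awk's generator from the clock. For each line, rand() * NR < 1
# is true with probability 1/NR, so every line ends up equally likely.
picked=$(awk 'BEGIN { srand() } rand() * NR < 1 { line = $0 } END { print line }' < "$words")

echo "$picked"
```

Since awk reads from stdin, this works on any stream, whatever its length.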

On the TTY and I/O devices

I intentionally omitted some details when explaining that the TTY and the terminal emulator stand between the I/O devices and the processes.

You can find a much more complete and in-depth explanation of all the components involved in this communication process in this extraordinary post by Linus Åkesson called “The TTY Demystified”.
