Working with pipes (|) on Linux
No, no, nooo... not the PVC one! This command-line pipe is just as useful, though, and it holds a ton of value for system administrators. The | operator is called a pipe. It is a form of redirection in Linux that connects the STDOUT of one command to the STDIN of a second command.
In other words, it transfers the standard output of the command on its left into the standard input of the command on its right. You can stack piped commands as many times as you want, or until you run out of output or file descriptors. Filtering is one of the primary uses of piping: you can filter the contents of a large file, for example, to find a specific string or word.
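As a minimal sketch of stacking, the snippet below chains three commands. The four input words are invented sample data, and the variable name result is only there to hold the final count.

```shell
# Sample words are made up for illustration.
# printf writes four lines to STDOUT; each | hands that output
# to the STDIN of the command on its right.
result=$(printf 'pear\napple\npear\nfig\n' | sort | uniq | wc -l)

# sort groups the duplicate "pear" lines, uniq drops the repeat,
# and wc -l counts the lines that remain.
echo "$result"
```

With GNU coreutils this prints 3, because only three distinct words are left after the duplicate is removed.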
Let's look at some real-world examples of how piping works.
Counting files in a given directory
The easiest way to count files in a directory on Linux is to use the ls command and pipe its output into the wc -l command.
$ ls | wc -l
The wc command prints byte, character, word, or newline counts. With the -l option it counts lines, and since ls writes one name per line when its output goes to a pipe, the line count is the number of files in the directory.
For instance, let's say that you want to count the number of files present in the "/etc" directory.
We will run the ls command on the "/etc" directory and pipe it into the wc command.
$ ls /etc | wc -l
or
$ ls -1 /etc |wc -l
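The -1 option lists one file per line. ls already does this when its output goes to a pipe, so both commands return the same count.
OUTPUT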
211
Remarks
When using the wc command, an important thing to remember is that it counts the number of newlines in the output of a given command.
As a result, there is a significant difference between these two commands.
$ ls /etc | wc -l
211
$ ls -l /etc |wc -l
212
With ls -l, the output starts with an extra "total" line, so you are counting a line that is not a file, which increases the final result by one.
P.S.: why does ls -l count more files than I do?
$ ls -l
total 1116
The 1116 you see here is not the number of files; it is the number of disk blocks consumed by the listed files.
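The off-by-one can be reproduced in a throwaway directory. This is a sketch that assumes GNU coreutils; the directory comes from mktemp, and the three file names are arbitrary placeholders.

```shell
# Build a temporary directory holding exactly three files.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

# One line per file.
plain=$(ls "$dir" | wc -l)

# One extra line at the top: "total N".
long=$(ls -l "$dir" | wc -l)

# tail -n +2 starts printing at line 2, which drops the "total N" line.
fixed=$(ls -l "$dir" | tail -n +2 | wc -l)

echo "$plain $long $fixed"
rm -r "$dir"
```

On a system with GNU ls this prints 3 4 3: the long listing is one line longer until the "total" line is skipped.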
Count Files Recursively using find
To count files recursively on Linux, combine the find and wc commands: the output of find is passed through the | operator to wc, which counts the lines.
For example, to recursively count all files (hidden ones included), and then only the hidden ones, in the "/etc" directory:
$ sudo find /etc -type f | wc -l
2054
HIDDEN
$ sudo find /etc -name ".*" | wc -l
14
or
$ sudo find /etc -type f -name ".*" | wc -l
14
The find command, by default, does not stop at the first depth of the directory: it explores every subdirectory, making the file search recursive.
Note the -type f option, as we are targeting only regular files.
Below, we count every directory, and then only the hidden ones, in the "/etc" directory.
$ sudo find /etc -type d | wc -l
370
$ sudo find /etc -type d -name ".*" | wc -l
0
Note the -type d option, as we are targeting only directories.
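The find filters above can be tried safely on a small temporary tree instead of "/etc". Everything in this sketch is made up for the demonstration: the tree layout, the file names, and the variable names. The ! -name test, which negates the pattern, is an addition that the commands above do not use.

```shell
# Throwaway tree: two visible files, one hidden file,
# and one file inside a hidden directory.
root=$(mktemp -d)
mkdir -p "$root/sub/.cache"
touch "$root/a.conf" "$root/.hidden" "$root/sub/b.conf" "$root/sub/.cache/c"

# Every regular file, hidden or not.
files=$(find "$root" -type f | wc -l)

# Regular files whose own name starts with a dot.
hidden_files=$(find "$root" -type f -name '.*' | wc -l)

# Regular files whose own name does not start with a dot.
visible_files=$(find "$root" -type f ! -name '.*' | wc -l)

# Directories; the starting directory itself is included in the count.
dirs=$(find "$root" -type d | wc -l)
hidden_dirs=$(find "$root" -type d -name '.*' | wc -l)

echo "$files $hidden_files $visible_files $dirs $hidden_dirs"
rm -r "$root"
```

This should print 4 1 3 3 1. Note that -name only tests the last path component, so the file c inside .cache counts as visible, and that find counts the starting directory as one of the directories.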
Fetch particular data with pipes
In this example, we narrow the output of df down to the file systems whose sizes are reported in gigabytes.
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 889M 0 889M 0% /dev
tmpfs 600M 4.0M 597M 1% /run
/dev/mmcblk0p2 129G 11G 118G 17% /
tmpfs 935M 0 935M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 935M 0 935M 0% /sys/fs/cgroup
/dev/loop1 85M 85M 0 100% /snap/core/11321
/dev/loop2 46M 46M 0 100% /snap/core18/1948
/dev/loop0 85M 85M 0 100% /snap/core/11425
/dev/loop4 27M 27M 0 100% /snap/snapd/12705
/dev/loop3 46M 46M 0 100% /snap/core18/2072
/dev/loop5 53M 53M 0 100% /snap/lxd/21031
/dev/loop6 27M 27M 0 100% /snap/snapd/12396
/dev/loop7 51M 51M 0 100% /snap/lxd/20329
/dev/sda 30G 11G 18G 37% /mnt/SRYUJHIU76883
The df command (short for disk free) displays the total and available space on file systems.
The -h option stands for human-readable: it prints sizes in powers of 1024, with unit suffixes such as K, M, and G.
Now, we will use the | operator to keep only the lines that contain the character 'G' (for gigabytes) or the word 'Filesystem' (the header line).
$ df -h | grep -E 'G|Filesystem'
Filesystem Size Used Avail Use% Mounted on
/dev/mmcblk0p2 129G 11G 118G 17% /
/dev/sda 30G 11G 18G 37% /mnt/SRYUJHIU76883
The grep command searches its input for lines that match a pattern; here the input is the output of df, received through the pipe.
The -E option enables extended regular expressions, in which | means "or", and grep displays the matching lines.
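One caveat with matching a bare 'G' is that it can appear anywhere on the line, not only in a size column. The sketch below illustrates this on a few lines copied from the df output above, plus one invented row (mounted on the hypothetical path /mnt/BIG). The awk variant is only one possible stricter filter, not part of the original command.

```shell
# Lines copied from the df -h output above; the last row is invented
# to show a false positive (a "G" in the mount point, not in a size).
sample='Filesystem Size Used Avail Use% Mounted on
udev 889M 0 889M 0% /dev
/dev/mmcblk0p2 129G 11G 118G 17% /
/dev/loop1 85M 85M 0 100% /snap/core/11321
/dev/sda 30G 11G 18G 37% /mnt/SRYUJHIU76883
tmpfs 5.0M 0 5.0M 0% /mnt/BIG'

# Same filter as above: any line containing "G" or "Filesystem".
loose=$(printf '%s\n' "$sample" | grep -E 'G|Filesystem')

# Stricter sketch: keep the header (line 1) and rows whose
# second column (Size) ends in G.
strict=$(printf '%s\n' "$sample" | awk 'NR == 1 || $2 ~ /G$/')

printf '%s\n' "$loose"
echo "---"
printf '%s\n' "$strict"
```

The grep version keeps four lines, including the invented /mnt/BIG row, while the awk version keeps only the header and the two gigabyte-sized file systems.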
Use multiple pipes in a single command
In this example, unlike the earlier ones, we will use more than one | in a single command to show what it can do.
We have a file named fire.txt that contains 14344418 lines.
$ cat fire.txt | wc -l
14344418
Now, we want to find the lines that contain the word given in the command (here, monkey), remove the duplicates, and count what is left.
The cat (short for "concatenate") command reads a file and writes its contents to standard output.
The sort command sorts the lines of its input in a particular order. The -r option sorts in reverse order.
The uniq command filters out adjacent repeated lines, which is why it runs after sort.
The tee command reads from standard input and writes to both standard output and one or more files at the same time. It is mostly used in combination with other commands through piping.
Here, it saves the result in another file, named fire-2.txt.
$ cat fire.txt | grep monkey | sort -r | uniq | tee fire-2.txt | wc -l
6723
As you can see, 6723 distinct lines contain the word monkey.
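The same pipeline can be traced on a file small enough to check by hand. This is a sketch on invented data: sample.txt and sample-2.txt are hypothetical stand-ins for fire.txt and fire-2.txt, and grep reads the file directly here instead of going through cat.

```shell
# Seven invented lines; five contain "monkey", two of those are duplicates.
work=$(mktemp -d)
printf '%s\n' monkey1 banana monkey2 monkey1 codemonkey apple monkey2 > "$work/sample.txt"

# Same stages as the pipeline above: filter, sort in reverse,
# drop adjacent duplicates, save a copy, count.
unique=$(grep monkey "$work/sample.txt" | sort -r | uniq | tee "$work/sample-2.txt" | wc -l)

# grep -c counts every matching line, duplicates included.
matches=$(grep -c monkey "$work/sample.txt")

# The file written by tee holds the same lines that wc counted.
saved=$(wc -l < "$work/sample-2.txt")

echo "$unique $matches $saved"
rm -r "$work"
```

This should print 3 5 3: five lines match, three of them are distinct, and tee saved exactly those three. The number reported by the pipeline is therefore a count of distinct matching lines, not of every occurrence.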
Let's double-check!
$ du -sch * | grep fire
80K fire-2.txt
134M fire.txt
The file fire-2.txt has been created, and at 80K it is dramatically smaller than the 134M original because it keeps only the unique lines that contain monkey.
Now, let's see its content.
$ cat fire-2.txt
...
jomonkey39
monkey130
lil.monkey
monkeyhands
codemonkey19
frozenmonkey
monkey1740
monkey1990
fnkymonkey
monkeys083
doublejandmonkey
...
Thank you for reading. As we saw, the | operator lets us combine two or more commands by passing the output of one process to another process that accepts it as input. The possibilities are limited only by your imagination.