sort has several optimizations for sorting based on datatypes. This command writes the sorted concatenation of all input files to standard output. However, be wary: complex sort operations on large files of a few gigabytes can degrade system performance.
When running a production server with limited CPU and/or memory availability, it is recommended to offload these larger files to a workstation for sorting operations during peak business hours.
Switch | Action |
-b | Ignore leading blanks |
-d | Dictionary order, consider only blanks and alphanumeric characters |
-f | Ignore case, folding lower and upper characters |
-g | General numeric sort |
-M | Month sort |
-h | Human readable numeric sort, e.g. 1K, 1M, 1G |
-R | Random sort |
-m | Merge already sorted files |
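As a quick illustration of a few of the switches above, the sketch below runs them on small inline samples. The sample values and variable names are made up for illustration, and the results assume GNU sort (coreutils), as shipped with CentOS.

```shell
# Human-readable sizes: -h understands the K, M and G suffixes.
sizes=$(printf '1G\n2K\n10M\n' | sort -h)
echo "$sizes"

# Month names: -M orders by calendar month, not alphabetically.
months=$(printf 'MAR\nJAN\nFEB\n' | LC_ALL=C sort -M)
echo "$months"

# Merge: -m combines inputs that are already sorted, without re-sorting them.
tmpdir=$(mktemp -d)
printf 'a\nc\ne\n' > "$tmpdir/one.txt"
printf 'b\nd\n' > "$tmpdir/two.txt"
merged=$(LC_ALL=C sort -m "$tmpdir/one.txt" "$tmpdir/two.txt")
echo "$merged"
rm -r "$tmpdir"
```

LC_ALL=C is set here only to keep the month names and the character ordering independent of the local language settings.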
Feel free to copy the tabular text below and follow along with our sort examples. Be sure each column is separated with a tab character.
first name | last name | office |
Ted | Daniel | 101 |
Jenny | Colon | 608 |
Dana | Maxwell | 602 |
Marian | Little | 903 |
Bobbie | Chapman | 403 |
Nicolas | Singleton | 203 |
Dale | Barton | 901 |
Aaron | Dennis | 305 |
Santos | Andrews | 504 |
Jacqueline | Neal | 102 |
Billy | Crawford | 301 |
Rosa | Summers | 405 |
Kellie | Curtis | 903 |
Matt | Davis | 305 |
Gina | Carr | 902 |
Francisco | Gilbert | 101 |
Sidney | Mack | 901 |
Heidi | Simmons | 204 |
Cristina | Torres | 206 |
Sonya | Weaver | 403 |
Donald | Evans | 403 |
Gwendolyn | Chambers | 108 |
Antonia | Lucas | 901 |
Blanche | Hayes | 603 |
Carrie | Todd | 201 |
Terence | Anderson | 501 |
Joan | Parsons | 102 |
Rose | Fisher | 304 |
Malcolm | Matthews | 702 |
Using sort in its most basic, default form −
[root@centosLocal centos]# sort ./Documents/names.txt
Aaron Dennis 305
Antonia Lucas 901
Billy Crawford 301
Blanche Hayes 603
Bobbie Chapman 403
Carrie Todd 201
Cristina Torres 206
Dale Barton 901
Dana Maxwell 602
Donald Evans 403
Francisco Gilbert 101
Gina Carr 902
Gwendolyn Chambers 108
Heidi Simmons 204
Jacqueline Neal 102
Jenny Colon 608
Joan Parsons 102
Kellie Curtis 903
Malcolm Matthews 702
Marian Little 903
Matt Davis 305
Nicolas Singleton 203
Rosa Summers 405
Rose Fisher 304
Santos Andrews 504
Sidney Mack 901
Sonya Weaver 403
Ted Daniel 101
Terence Anderson 501
[root@centosLocal centos]#
Sometimes, we will want to sort a file on a column other than the first. A sort can be applied to other columns with the -t and -k switches.
-t : define a file delimiter
-k : key count to sort by (think of this as a column number, counted using the delimiter)
-n : sort in numeric order
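A minimal sketch of these three switches working together, using a few made-up colon-separated records rather than names.txt. Writing the key as 3,3n limits the comparison to the third field only, whereas a bare 3 means "from field 3 to the end of the line".

```shell
# Three hypothetical records: first name, last name, office number.
records='Ted:Daniel:101
Jenny:Colon:608
Dale:Barton:90'

# -t sets the delimiter, -k picks field 3, and n compares it as a number.
by_office=$(printf '%s\n' "$records" | sort -t ':' -k 3,3n)
echo "$by_office"

# Without n the field is compared as text, so "90" sorts after "608".
as_text=$(printf '%s\n' "$records" | LC_ALL=C sort -t ':' -k 3,3)
echo "$as_text"
```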
Note − In some examples, we have used cat piped into grep. This was to demonstrate the concept of piping commands. Piping cat into grep starts an extra process and copies the data through a pipe, which adds avoidable system load with large files, especially when combined with complex sorting. This will make veteran Linux administrators cringe.
Now that we have a good idea of how the pipe character works, this poor practice will be avoided in the chapters to follow. The key to keeping the system resources low with commands like sort is learning to use them efficiently.
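The point about cat can be sketched as follows. Both commands print the same lines, but the second lets grep open the file itself instead of starting an extra cat process and a pipe. The file contents and the search pattern here are placeholders.

```shell
tmpfile=$(mktemp)
printf 'Ted:Daniel:101\nJenny:Colon:608\nGina:Carr:902\n' > "$tmpfile"

# Extra process: cat reads the file only to hand it to grep through a pipe.
piped=$(cat "$tmpfile" | grep ':9')

# Same result with grep reading the file directly.
direct=$(grep ':9' "$tmpfile")

echo "$direct"
rm "$tmpfile"
```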
[root@centosLocal centos]# sort -t '	' -k 3n ./Documents/names.txt
Ted Daniel 101
Francisco Gilbert 101
Jacqueline Neal 102
Joan Parsons 102
Gwendolyn Chambers 108
Carrie Todd 201
Nicolas Singleton 203
Heidi Simmons 204
Cristina Torres 206
Billy Crawford 301
Rose Fisher 304
Aaron Dennis 305
Matt Davis 305
Bobbie Chapman 403
Donald Evans 403
Sonya Weaver 403
Rosa Summers 405
Terence Anderson 501
Santos Andrews 504
Dana Maxwell 602
Blanche Hayes 603
Jenny Colon 608
Malcolm Matthews 702
Antonia Lucas 901
Dale Barton 901
Sidney Mack 901
Gina Carr 902
Kellie Curtis 903
Marian Little 903
[root@centosLocal centos]#
Now we have our list sorted by office number. The astute reader will notice something out of the ordinary after the -t switch: single quotes separated by what appears to be a few spaces. This was actually a literal Tab character sent to the shell. A literal Tab can be sent to the Bash shell by pressing Ctrl+V and then Tab.
Most shells interpret the Tab key as a command; in Bash, for example, it triggers auto-completion. The shell needs an escape sequence to recognize a literal Tab character. This is one reason why Tabs are not the best choice for delimiters with Linux. Generally speaking, it is best to avoid both spaces and tabs as delimiters, as they can cause issues when scripting a shell.
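One way to sidestep typing a literal Tab is to let printf produce it. The sketch below is an alternative offered as an assumption, not the method used in the session above: it stores a Tab in a hypothetical variable named tab and passes that to -t. In Bash, the ANSI-C quoted form $'\t' yields the same character.

```shell
# Build a two-line, Tab-separated sample (made-up data).
sample=$(printf 'Jenny\tColon\t608\nTed\tDaniel\t101\n')

# Command substitution keeps the Tab; only trailing newlines are stripped.
tab=$(printf '\t')

# Sort numerically on the third Tab-separated field.
sorted=$(printf '%s\n' "$sample" | sort -t "$tab" -k 3,3n)
echo "$sorted"
```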
Let us fix our names.txt file.
[root@centosLocal centos]# sed -i 's/\t/:/g' ./Documents/names.txt &&
cat ./Documents/names.txt
Ted:Daniel:101
Jenny:Colon:608
Dana:Maxwell:602
Marian:Little:903
Bobbie:Chapman:403
Nicolas:Singleton:203
Dale:Barton:901
Aaron:Dennis:305
Santos:Andrews:504
Jacqueline:Neal:102
Billy:Crawford:301
Rosa:Summers:405
Kellie:Curtis:903
Matt:Davis:305
Gina:Carr:902
Francisco:Gilbert:101
Sidney:Mack:901
Heidi:Simmons:204
Cristina:Torres:206
Sonya:Weaver:403
Donald:Evans:403
Gwendolyn:Chambers:108
Antonia:Lucas:901
Blanche:Hayes:603
Carrie:Todd:201
Terence:Anderson:501
Joan:Parsons:102
Rose:Fisher:304
Malcolm:Matthews:702
[root@centosLocal centos]#
Now, it will be much easier to work with the text file. If someone demands it be returned to Tab-delimited for another application (this is common), we can accomplish that task easily as −
sed -i 's/:/\t/g' ./Documents/names.txt
Common end-user applications work well with Tabs as a delimiter (an accountant does not want to see a colon separating data columns while working on spreadsheets). So learning to transform characters back and forth is a good practice; it comes up often.
Note − Office staff typically use word processors and spreadsheets with a graphical user interface, running on Windows. Hence, it is common for Linux administrators to become good at these transformations to accommodate office end users (most of the time, our boss will be an end user).
We have introduced a command called sed. sed is a stream editor and can be used as a non-interactive text editor for manipulating streams of text and files. We will learn more about sed later. For now, keep in mind that by using sed we avoided the need to pipe several filter commands together when changing our text file, making efficient use of the tools at hand.
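Because sed is a stream editor, the substitution can also be previewed on standard output before committing it with -i. A small sketch, assuming GNU sed (which understands \t as a Tab) and a throwaway temporary file standing in for names.txt:

```shell
tmpfile=$(mktemp)
printf 'Ted\tDaniel\t101\nJenny\tColon\t608\n' > "$tmpfile"

# Without -i, sed prints the result and leaves the file untouched.
preview=$(sed 's/\t/:/g' "$tmpfile")
echo "$preview"

# With -i, the file itself is rewritten in place.
sed -i 's/\t/:/g' "$tmpfile"
in_place=$(cat "$tmpfile")

# Reverse the change to get the Tab-delimited form back.
restored=$(sed 's/:/\t/g' "$tmpfile")
rm "$tmpfile"
```

Previewing first is a cheap safeguard, since -i overwrites the original file.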
We also introduced a Bash shell operator: &&. It runs the second command only if the first command completes with a successful exit status of 0.
[root@centosLocal centos]# ls /noDir && echo "You cannot see me"
ls: cannot access /noDir: No such file or directory
[root@centosLocal centos]# ls /noDir ; echo "You cannot see me"
ls: cannot access /noDir: No such file or directory
You cannot see me
[root@centosLocal centos]#
In the above code, note the difference between && and ;. The first runs the second command only when the first has completed successfully, while ; simply runs the commands one after another, regardless of the result. More on this when we get to scripting shell commands.
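The difference can be checked with the true and false commands, which do nothing except exit with status 0 and 1 respectively. This is a small sketch, not part of the session above; the variable names are illustrative.

```shell
# && runs the second command only after a zero (successful) exit status.
and_ok=$(true && echo "ran")
and_fail=$(false && echo "ran")

# ; runs the second command no matter how the first one exited.
seq_fail=$(false ; echo "ran")

echo "and_ok=[$and_ok] and_fail=[$and_fail] seq_fail=[$seq_fail]"
```

Only and_fail ends up empty, because echo was never started after false.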