What is Linux sort?
sort is a very useful command line utility used to sort the lines of a file or input stream. sort can be used to sort input by entire lines, single columns, or different column ranges in a variety of ways.
Sorting by Entire Lines
The default behavior of the sort command is to sort by entire lines. Here is an example:
#sort a file by entire lines
sort sample.txt
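To make the default behavior concrete, here is a minimal sketch that builds its own input. The file contents are hypothetical, and LC_ALL=C is set only so that the ordering does not depend on the local environment.

```shell
# Hypothetical sample data written to a temporary file.
tmpdir=$(mktemp -d)
printf 'pear\napple\ncherry\n' > "$tmpdir/sample.txt"

# LC_ALL=C pins byte-order comparison so the result does not vary by locale.
whole_line_sorted=$(LC_ALL=C sort "$tmpdir/sample.txt")
printf '%s\n' "$whole_line_sorted"
# prints:
# apple
# cherry
# pear

rm -r "$tmpdir"
```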
Sorting by Columns
It is possible to sort files/streams on a single column, multiple columns, or ranges of columns. To do this, use the -k option to indicate these columns. The default delimiter used to identify columns is white space (blank characters). If you need to use a different column delimiter, use the -t option.
Sorting by a Single Column
Sorting by single column requires the use of the -k
option. You must also specify the start column and end column to sort by. When sorting by a single column, these numbers will be the same. Here is an example of sorting a CSV (comma delimited) file by the second column.
#sorting a CSV by the values in column 2
sort -k2,2 -t ',' sample.txt
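The following sketch runs the same command on a few hypothetical CSV rows so the effect is visible, and then contrasts -k2,2 with -k2 on rows whose second column ties. The data and variable names are made up for illustration.

```shell
# Hypothetical CSV rows: id,fruit,colour
csv='1,pear,red
2,apple,green
3,cherry,blue'

# Sort on field 2 only; -t ',' sets the column delimiter.
by_col2=$(printf '%s\n' "$csv" | LC_ALL=C sort -t ',' -k2,2)
printf '%s\n' "$by_col2"
# prints:
# 2,apple,green
# 3,cherry,blue
# 1,pear,red

# With -k2 (no end column) the key runs from field 2 to the end of the line,
# so field 3 decides the order when field 2 ties.
ties='1,a,z
2,a,b'
bounded=$(printf '%s\n' "$ties" | LC_ALL=C sort -t ',' -k2,2)
open_ended=$(printf '%s\n' "$ties" | LC_ALL=C sort -t ',' -k2)
printf '%s\n' "$bounded"     # 1,a,z then 2,a,b (tie broken by the whole line)
printf '%s\n' "$open_ended"  # 2,a,b then 1,a,z (key "a,b" sorts before "a,z")
```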
Sorting by Multiple Columns
Sorting by multiple columns is similar to sorting by a single column. To sort on a range of columns, simply specify the start and end columns in the column range to use for sorting. Here is an example of sorting a CSV on columns 2 through 4.
#sorting a CSV by the values in columns 2-4
sort -k2,4 -t ',' sample.txt
To sort on a non-contiguous set of columns, you must use the -k option multiple times. Here is an example of sorting by column 2 then column 6:
#sorting a CSV by the values in column 2 then column 6
sort -k2,2 -k6,6 -t ',' sample.txt
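As a small worked case of repeated -k options, the sketch below sorts hypothetical three-column rows by column 3 and then by column 1. The columns differ from the example above only to keep the sample data short.

```shell
# Hypothetical CSV rows: name,code,group
rows='b,x,2
a,y,1
c,z,1
a,w,2'

# Primary key: column 3. Secondary key: column 1 (used only when column 3 ties).
by_group_then_name=$(printf '%s\n' "$rows" | LC_ALL=C sort -t ',' -k3,3 -k1,1)
printf '%s\n' "$by_group_then_name"
# prints:
# a,y,1
# c,z,1
# a,w,2
# b,x,2
```

Each -k adds one key, and keys are compared in the order they appear on the command line.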
Sorting in Reverse
It is easy to sort in reverse (descending) order with the Linux sort command. Simply use the -r option. Here is an example of sorting by entire lines in reverse order:
sort -r sample.txt
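A quick sketch with hypothetical input shows the descending order that -r produces. LC_ALL=C is set here only to keep the comparison independent of locale.

```shell
# Three hypothetical lines, sorted in descending byte order.
reversed=$(printf 'pear\napple\ncherry\n' | LC_ALL=C sort -r)
printf '%s\n' "$reversed"
# prints:
# pear
# cherry
# apple
```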
Sorting Numerically
By default the sort command will order lines based on text value, which doesn’t always make sense. For example, if we had the following numbers in a file:
1
9
10
If we sort based on text values we will get:
#sorting numbers based on text values
sort sample.txt
1
10
9
To sort numerically we need to use the -n option. Using this we will now get the correct numerical ordering:
#sorting numbers based on numeric values
sort -n sample.txt
1
9
10
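The same distinction applies when the numbers sit in a column. The sketch below, using hypothetical name/score rows, compares a text sort and a numeric sort on column 2; the n modifier is attached to the key itself.

```shell
# Hypothetical whitespace-delimited rows: name score
scores='alice 9
bob 10
carol 1'

# Text comparison on column 2: "10" sorts before "9".
text_order=$(printf '%s\n' "$scores" | LC_ALL=C sort -k2,2)
# Numeric comparison on column 2: 1 < 9 < 10.
numeric_order=$(printf '%s\n' "$scores" | LC_ALL=C sort -k2,2n)

printf '%s\n' "$text_order"     # carol 1, bob 10, alice 9
printf '%s\n' "$numeric_order"  # carol 1, alice 9, bob 10
```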
Complex Sorting
There are many ways that one might want to sort a text file. Here is a more complex example of sorting a file first on column 1 in numerical reverse order, and then on columns 2-4 using the default text ordering.
#sorting first on column 1 in numerical reverse order, and then on columns 2-4 using the default text ordering
sort -k1,1rn -k2,4 sample.txt
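To see what that command does, here is a sketch that applies it to four hypothetical rows. The r and n modifiers apply only to the first key; the second key falls back to plain text comparison.

```shell
# Hypothetical whitespace-delimited rows: count followed by three text columns.
data='2 b x y
10 a x y
2 a x y
10 b x y'

# Column 1: numeric, descending. Columns 2-4: default text order within ties.
complex_sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -k1,1rn -k2,4)
printf '%s\n' "$complex_sorted"
# prints:
# 10 a x y
# 10 b x y
# 2 a x y
# 2 b x y
```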
Random Sorting
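Occasionally you want to print/sort files in random order. To do this with sort, simply use its -R option. Note that with GNU sort, identical lines still end up next to each other in the output, because -R orders lines by a random hash of their contents.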
#randomly sorting lines
sort -R sample.txt
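Because the output order changes from run to run, the sketch below checks properties of the result instead of a fixed ordering. It assumes a sort implementation that supports -R (such as GNU coreutils), and the input lines are hypothetical.

```shell
# Hypothetical input with one duplicated line.
shuffled=$(printf 'a\nb\na\nc\n' | sort -R)

# Re-sorting the shuffled output recovers exactly the same set of lines.
resorted=$(printf '%s\n' "$shuffled" | LC_ALL=C sort)

# Identical lines stay adjacent, so uniq collapses them to three groups.
groups=$(printf '%s\n' "$shuffled" | uniq | wc -l | tr -d ' ')

printf '%s\n' "$shuffled"
```

If duplicates should be scattered as well, the separate shuf utility from GNU coreutils shuffles lines without grouping them.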
Sorting by Case
Depending on your system's locale settings, the default sort behavior may compare text in a case-sensitive or a case-insensitive way. To force case-sensitive (byte order) comparisons, simply run export LC_ALL=C before executing your sort command. To sort case-insensitively, use the -f option.
#sort case sensitive
export LC_ALL=C
sort sample.txt
#sort case insensitive
sort -f sample.txt
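The difference is easiest to see on mixed-case input. This sketch uses three hypothetical lines and sets LC_ALL=C per command, so it does not change the environment of the surrounding shell.

```shell
# Hypothetical mixed-case input.
words='apple
Banana
cherry'

# Byte order: uppercase letters sort before lowercase ones.
case_sensitive=$(printf '%s\n' "$words" | LC_ALL=C sort)
# -f folds lowercase to uppercase before comparing.
case_insensitive=$(printf '%s\n' "$words" | LC_ALL=C sort -f)

printf '%s\n' "$case_sensitive"    # Banana, apple, cherry
printf '%s\n' "$case_insensitive"  # apple, Banana, cherry
```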
Sorting Multiple Files Together
Often when working with data you will want the sorted output of multiple files. There are a couple of easy ways to do this. First you can pass multiple files on the command line to be sorted, or you can cat the content of multiple files and pipe the output to the sort command:
#sort multiple files
sort file1.txt file2.txt
#cat content of multiple files and sort the output
cat file1.txt file2.txt | sort
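Both forms should produce the same result. The sketch below checks this on two hypothetical files created in a temporary directory.

```shell
# Two hypothetical unsorted files.
tmpdir=$(mktemp -d)
printf 'delta\nalpha\n' > "$tmpdir/file1.txt"
printf 'charlie\nbravo\n' > "$tmpdir/file2.txt"

# Form 1: pass both files to sort directly.
direct=$(LC_ALL=C sort "$tmpdir/file1.txt" "$tmpdir/file2.txt")
# Form 2: concatenate first, then sort the combined stream.
piped=$(cat "$tmpdir/file1.txt" "$tmpdir/file2.txt" | LC_ALL=C sort)

printf '%s\n' "$direct"
# prints:
# alpha
# bravo
# charlie
# delta

rm -r "$tmpdir"
```

Passing the files directly avoids the extra cat process; the pipe form is handy when the input already comes from another command.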
Check if Input is Already Sorted
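Knowing if your data is sorted can be helpful when using other commands like uniq, and can save a lot of processing time. To check if a file is already sorted, use the -c option. If the input is sorted, sort -c prints nothing and exits with status 0; otherwise it reports the first out-of-order line and exits with a non-zero status.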
#check if file is already sorted
sort -c sample.txt
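In a script, the exit status of sort -c is usually what matters. Here is a minimal sketch; check_sorted is a hypothetical helper name and the two files are made-up samples.

```shell
# Two hypothetical files: one in order, one not.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/sorted.txt"
printf 'b\na\nc\n' > "$tmpdir/unsorted.txt"

# Hypothetical helper: report whether a file is already sorted.
# The diagnostic that sort -c writes to stderr is discarded.
check_sorted() {
    if LC_ALL=C sort -c "$1" 2>/dev/null; then
        echo "sorted"
    else
        echo "not sorted"
    fi
}

result_sorted=$(check_sorted "$tmpdir/sorted.txt")
result_unsorted=$(check_sorted "$tmpdir/unsorted.txt")
printf '%s\n' "$result_sorted"    # sorted
printf '%s\n' "$result_unsorted"  # not sorted

rm -r "$tmpdir"
```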
Learning More about Linux Sort
To learn more about how to use the Linux sort command, simply view the man page from your terminal:
#reading the sort manual
man sort
NAME
sort - sort lines of text files
SYNOPSIS
sort [OPTION]... [FILE]...
sort [OPTION]... --files0-from=F
DESCRIPTION
Write sorted concatenation of all FILE(s) to standard output.
With no FILE, or when FILE is -, read standard input.
Mandatory arguments to long options are mandatory for short options too. Ordering options:
-b, --ignore-leading-blanks
ignore leading blanks
-d, --dictionary-order
consider only blanks and alphanumeric characters
-f, --ignore-case
fold lower case to upper case characters
....