
`sort` Exercises

Exercises Mar 28, 2025

Subscribers get to work together on these in Workroom PlayTime 011 on 3 April, 3pm London time, Zoom and Miro.

Go to Workroom PlayTime 011 for login etc.

There is a (very draft) info page at https://www.workroom-productions.com/p/e0bf704b-5e17-4828-ace2-fde1f68306bd/

Exercises

Files

ex01 has one single-digit number per line, and is unsorted.

ex03 has data in comma-delimited columns. The first three columns are date, first name, surname.

ex04 contains some month names / abbreviations, in order as far as English months are concerned. ex04a contains variants.

ex05 – one number per line like ex01, and sometimes a letter.

ex06 contains randomly ordered numbers, with some duplicates, and ex06b contains a collection of numbers with some in 1E6 notation.

Go to https://envs.workroomprds.com, pick a user, drop through to VSCode in the browser. The files to sort are in ~/sort_exercises. We'll be working in the terminal, which you should see at the bottom of your window.

Exercise 1: Basic use

Type cat ex01 on the command line to see the contents (or look using the file browser).

  • Type sort ex01 to see the output on the command line.
  • Compare sort ex01 with sort -R ex01 and sort -r ex01
💡
The syntax is sort «option(s)» «file(s)»
Sort can reverse with -r ... and randomise with -R
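If you want to try the three forms side by side away from the exercise files, here is a minimal sketch. The file name sample01 and its three values are made up for illustration (a stand-in for ex01), and tmp is just a scratch directory.

```shell
# Hypothetical stand-in for ex01: single-digit numbers, one per line, unsorted.
tmp=$(mktemp -d)
printf '%s\n' 3 1 2 > "$tmp/sample01"

ascending=$(sort "$tmp/sample01")      # plain sort: lowest first
descending=$(sort -r "$tmp/sample01")  # -r reverses the order
shuffled=$(sort -R "$tmp/sample01")    # -R: the order can change from run to run

printf 'sort:\n%s\n\nsort -r:\n%s\n\nsort -R:\n%s\n' \
  "$ascending" "$descending" "$shuffled"
```

Run it a few times: the first two results stay put, while the -R result should move around.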

Exercise 2: Plumbing

  • Compare cat ex01 | sort with sort ex01
  • Use sort ex01 > output_of_ex02 to sort into a file called output_of_ex02
  • Use sort ex01 | less to open the output in the file reader less. Use q to exit less.
💡
sort is all set up to be used with other commands.
As a standalone tool, with real data, it is a bit unwieldy – it's best used with other tools.
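As a sketch of what "used with other tools" can look like, the pipeline below counts how often each line appears and lists the most frequent first. The file sample02 and its fruit names are invented for illustration. The sort | uniq -c | sort -rn chain is a common idiom rather than anything specific to these exercises.

```shell
# Hypothetical data: a few repeated words, one per line.
tmp=$(mktemp -d)
printf '%s\n' pear apple pear fig pear apple > "$tmp/sample02"

# Reading from a pipe and reading a named file give the same result.
from_pipe=$(cat "$tmp/sample02" | sort)
from_file=$(sort "$tmp/sample02")

# Redirect the sorted output into a file instead of the terminal.
sort "$tmp/sample02" > "$tmp/output_of_sample02"

# Count each distinct line, most frequent first.
# uniq -c only groups adjacent lines, so the input is sorted first;
# awk tidies the padding that uniq -c puts in front of each count.
counts=$(sort "$tmp/sample02" | uniq -c | sort -rn | awk '{print $1, $2}')
printf '%s\n' "$counts"
```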

Exercise 3: Columns

Testers need to work with complex data, and need a column sort.

Use sort -t, -k3,3 ex03 to sort it by surname

Use sort -t, -k2,2 ex03 to sort by first name.

Use sort -t, -k3,3 -k2,2 ex03 to sort by surname then first name, and compare with sort -t, -k3,3 -k2,2r ex03 which reverses the sort of the first name.

💡
Plain sort compares whole lines, character by character.

Columns need delimiters: by default sort splits columns where blank space follows non-blank text, and takes the -t option to change that. Specify -t, to use commas and -t$'\t' to use tabs (the $'\t' form works in bash and similar shells).

Use options twice to sort on two columns. Use modifiers to change the type of sort.
💡
Use -k2,2 to specify a sort on your data's second column.
Use -k2,4 to sort on the second, third and fourth columns.
If you specify -k2 you'll sort on the second column and everything to the right of it, through to the end of the line. It's weird, don't do it.
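Here is a small sketch of the column options on made-up data. The files sample03 and sample03b, and the names in them, are hypothetical and only mimic the date, first name, surname layout of ex03. LC_ALL=C is set so the comparison is byte by byte and the result does not depend on your locale.

```shell
tmp=$(mktemp -d)
cat > "$tmp/sample03" <<'EOF'
2025-03-01,Zoe,Adams
2025-03-02,Amy,Clark
2025-03-03,Bob,Adams
2025-03-04,Cat,Baker
EOF

# Surname first, then first name within each surname.
by_surname_first=$(LC_ALL=C sort -t, -k3,3 -k2,2 "$tmp/sample03")

# Same, but the r modifier reverses only the first-name key.
by_surname_first_rev=$(LC_ALL=C sort -t, -k3,3 -k2,2r "$tmp/sample03")

# A bounded key (-k2,2) against an open-ended one (-k2).
printf '%s\n' '2025-03-01,Amy,Young' '2025-03-02,Amy,Adams' > "$tmp/sample03b"
bounded=$(LC_ALL=C sort -t, -k2,2 "$tmp/sample03b")   # both keys are "Amy"; the tie falls back to the whole line
open_ended=$(LC_ALL=C sort -t, -k2 "$tmp/sample03b")  # keys are "Amy,Young" and "Amy,Adams"

printf '%s\n\n%s\n\n%s\n\n%s\n' \
  "$by_surname_first" "$by_surname_first_rev" "$bounded" "$open_ended"
```

In the last pair, the bounded key leaves the Young line first (the dates decide), while the open-ended key puts the Adams line first because the surname has become part of the key.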

Exercise 4: Checking

You can check if something is sorted with sort -c – which is handy if you're checking a sort for a test, or pre-qualifying some data.

Use sort -c on any of the earlier files – note the error shows the line and the content of the first non-sorted entry.

Use sort -c ex04 to see that a problem is on line 2.

Use sort -Mc ex04 to see that the check changes when sort is told to expect months: within a month sort, it accepts varied abbreviations and case.

💡
Use sort -c to check whether data is sorted, in various types of sort.
Options can stack

This exercise doesn't produce much output – here are the contents of ex04 for interest.

January
Feb
mar
April
dEcEmBeR
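Because sort -c says so little, it can help to look at its exit status, which is what a test script would use. The sketch below rebuilds the five lines listed above in a scratch file (ex04_copy is a made-up name) and assumes GNU sort with LC_ALL=C, where month names are the English ones.

```shell
tmp=$(mktemp -d)
printf '%s\n' January Feb mar April dEcEmBeR > "$tmp/ex04_copy"

# Plain check: a non-zero exit status means "not sorted";
# the message on stderr names the first out-of-order line.
if LC_ALL=C sort -c "$tmp/ex04_copy" 2> "$tmp/plain_message"; then
  plain_check=sorted
else
  plain_check=unsorted
fi

# Month check: the same data is in order when compared as months.
if LC_ALL=C sort -Mc "$tmp/ex04_copy" 2> "$tmp/month_message"; then
  month_check=sorted
else
  month_check=unsorted
fi

printf 'plain: %s\nmonth: %s\n' "$plain_check" "$month_check"
cat "$tmp/plain_message"
```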

Exercise 5: Reducing

Sort can throw away duplicates. This is handy to see what data is in use (e.g. if you want unique account numbers, or a list of this session's error messages), and is handier still with a column selection.

  • Compare sort ex05 and sort -u ex05 – what's thrown away?
  • Compare sort -k1,1 ex05 and sort -uk1,1 ex05 – what's lost now?
  • Weird one: Compare sort -M ex04a and sort -Mu ex04a – what month names are kept?
💡
option -u throws away duplicates
'duplicates' depends on the sort
u goes at the start, n at the end, column stuff in the middle...
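To see how 'duplicate' shifts with the sort, here is a sketch on invented data. The files sample05 and sample04a and their contents are hypothetical, and the number-space-letter layout is an assumption about what ex05 looks like. Which of several "equal" lines survives is left for you to observe rather than stated here.

```shell
tmp=$(mktemp -d)
printf '%s\n' '2 b' '1 a' '2 c' '1 a' '3' > "$tmp/sample05"
printf '%s\n' Jan January feb jan > "$tmp/sample04a"

# Whole-line duplicates only: the repeated "1 a" goes, "2 b" and "2 c" both stay.
unique_lines=$(LC_ALL=C sort -u "$tmp/sample05")

# Duplicates judged on the first column only: one line per distinct number.
unique_first_field=$(LC_ALL=C sort -u -k1,1 "$tmp/sample05")

# Duplicates judged as months: Jan, January and jan all count as the same month.
unique_months=$(LC_ALL=C sort -Mu "$tmp/sample04a")

printf '%s\n\n%s\n\n%s\n' "$unique_lines" "$unique_first_field" "$unique_months"
```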

Exercise 6: Problems and avoidances

Use sort ex06 to see a problem. Try sort -n ex06 to avoid it.

Try sort -g ex06b to see how that works...

💡
sort's default is to sort by character.
option -n sorts by value
💡
There are other options for other forms, including
* -d dictionary sort, which compares only letters, digits and blanks – good for names, e.g. O'Leary and New York.
* -f ignores case, e.g. a before B before c.
* -g general numeric, which understands scientific notation, e.g. 1E-2 is sorted as 0.01.
* -h human numeric sorts 1 before 1K before 1G.
* -M month sort on English month abbreviations, e.g. jan before feb.

Testing: look out for the 'wrong' sort: it may only be revealed by novel data. Other systems may break when the 'wrong' sort is corrected.
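A sketch of the 'wrong' sort in action, on made-up numbers. The files sample06, sample06b and sample_sizes are hypothetical stand-ins, and the -g and -h options are assumed to behave as in GNU sort.

```shell
tmp=$(mktemp -d)
printf '%s\n' 10 9 100 2 > "$tmp/sample06"
printf '%s\n' 500 1E6 2e3 > "$tmp/sample06b"
printf '%s\n' 1G 1 1K > "$tmp/sample_sizes"

by_character=$(LC_ALL=C sort "$tmp/sample06")         # 10 100 2 9: compared character by character
by_value=$(LC_ALL=C sort -n "$tmp/sample06")          # 2 9 10 100: compared as numbers

plain_numeric=$(LC_ALL=C sort -n "$tmp/sample06b")    # -n stops reading at the E, so 1E6 counts as 1
general_numeric=$(LC_ALL=C sort -g "$tmp/sample06b")  # -g reads 1E6 as a million

human_numeric=$(LC_ALL=C sort -h "$tmp/sample_sizes") # 1, then 1K, then 1G

printf '%s\n\n%s\n\n%s\n\n%s\n\n%s\n' \
  "$by_character" "$by_value" "$plain_numeric" "$general_numeric" "$human_numeric"
```

Note that sort -n on the second file gives no error: it quietly produces a plausible-looking order, which is exactly the kind of thing novel data reveals.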

Exercise 7: Sort and merge

Try sort -g ex06 ex01 ex06b
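Given several files, sort treats them as one combined input. If the files are already sorted, the -m option merges them without re-sorting each one. The sketch below shows both routes reaching the same result. The files part_a and part_b are invented for illustration.

```shell
tmp=$(mktemp -d)
printf '%s\n' 3 1 > "$tmp/part_a"
printf '%s\n' 4 2 > "$tmp/part_b"

# Route 1: hand sort both unsorted files at once.
sorted_together=$(sort -n "$tmp/part_a" "$tmp/part_b")

# Route 2: sort each file separately, then merge the sorted results.
sort -n "$tmp/part_a" > "$tmp/part_a.sorted"
sort -n "$tmp/part_b" > "$tmp/part_b.sorted"
merged=$(sort -m -n "$tmp/part_a.sorted" "$tmp/part_b.sorted")

printf '%s\n\n%s\n' "$sorted_together" "$merged"
```

The merge only gives a sorted result if its inputs were sorted the same way, so the -n is repeated on the -m line.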

Sources

Linux sort Command with Examples

Wikipedia sort (Unix)

Man pages

https://ss64.com/bash/sort.html

sort(1) - Linux manual page



James Lyndsay

Getting better at software testing. Singing in Bulgarian. Staying in. Going out. Listening. Talking. Writing. Making.