The Bash Shell
Exercise Answers
Files and Directories
Relative path resolution
- Incorrect, since backup exists in the Users parent directory, which is what the preceding .. refers to.
- Incorrect, since .. is used before backup to reference the Users parent directory first, so ls will run on /Users/backup.
- Incorrect as per 2, and for the directories to be shown with a trailing / we’d also need to use the -F flag.
- Correct.
ls reading comprehension
- This will attempt to list the contents of a file or directory called pwd, which does not exist.
- Partially correct, but 3 is also correct.
- Partially correct, but 2 is also correct.
- Correct.
Creating Things
Renaming files
- Incorrect, since this would copy the mistakenly named file to another file with the desired name, but the original mistakenly named file will still exist.
- Correct.
- Incorrect, since this would not rename the file.
- Incorrect, since mv is used to rename a file and cp is used to copy a file.
Moving and Copying
- Incorrect, since proteins-saved.dat was created in the directory above, because .. was used before its filename when copying recombine/proteins.dat.
- Correct.
- Incorrect, since proteins.dat was moved into the recombine directory.
- Incorrect, since the recombine directory was created in the current directory.
Organizing Directories and Files
mv fructose.dat sucrose.dat analyzed will move the two named .dat files into the analyzed directory. Unlike cp, mv leaves no copy behind in the original location.
Copy with Multiple Filenames
- With several filenames and a directory, cp will copy the given files into the given directory.
- When given three or more filenames, cp will return an error, since with more than two arguments cp expects the last argument to be a directory.
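Both behaviours can be tried out in a scratch directory. The following is a minimal sketch; the file names jamesm.txt, quotes.txt and quotations.txt and the backup directory are assumptions made up for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical files and target directory.
touch jamesm.txt quotes.txt
mkdir backup

# Several filenames followed by a directory: the files are copied into it.
cp jamesm.txt quotes.txt backup/
copied=$(ls backup)

# Three filenames where the last is not a directory: cp reports an error.
if cp jamesm.txt quotes.txt quotations.txt 2>/dev/null; then
    cp_status="succeeded"
else
    cp_status="failed"
fi

echo "files in backup:"
echo "$copied"
echo "cp with three filenames: $cp_status"
```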
Pipes and Filters
What does sort -n do?
sort -n will perform a sort interpreting numerical digits as proper numbers - a numerical sort. sort on its own will perform a sort assuming any numerical digits are just sequences of characters, e.g. 10 will come before 2 since 1 comes before 2.
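The difference is easy to see on a handful of numbers. This is a small sketch with made-up sample values; LC_ALL=C is set on the plain sort only to make the character-by-character ordering independent of the local language settings.

```shell
# Hypothetical sample data: one number per line.
numbers=$(printf '10\n2\n19\n6\n')

# Plain sort compares the lines character by character.
text_sorted=$(printf '%s\n' "$numbers" | LC_ALL=C sort)

# sort -n compares the lines by their numeric value.
numeric_sorted=$(printf '%s\n' "$numbers" | sort -n)

echo "sort:"
echo "$text_sorted"
echo "sort -n:"
echo "$numeric_sorted"
```

Here plain sort gives 10, 19, 2, 6, whereas sort -n gives 2, 6, 10, 19.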
What does >> mean?
> will redirect the output hello to the file testfile01.txt, replacing the contents of that file if it already exists. >> will redirect the output hello to the file testfile01.txt, appending to any existing content if that file already exists.
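A quick way to check this is to run each redirect twice and count the lines that end up in the file. The sketch below assumes a scratch directory and uses a second, hypothetical file testfile02.txt for the appending case.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo hello > testfile01.txt     # creates the file with one line
echo hello > testfile01.txt     # replaces the contents: still one line
overwrite_lines=$(wc -l < testfile01.txt)

echo hello >> testfile02.txt    # creates the file with one line
echo hello >> testfile02.txt    # appends to the file: now two lines
append_lines=$(wc -l < testfile02.txt)

echo "after > twice:  $overwrite_lines"
echo "after >> twice: $append_lines"
```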
Piping commands together
- Incorrect, since we cannot redirect to a command; we should use a | (pipe) instead.
- Incorrect, since 1-3 when passed to head is interpreted as a filename, not a range of line numbers.
- Incorrect, since we need to sort the line counts before extracting the top three (otherwise we get them in whatever order wc gives them to head).
- Correct.
Why does uniq only remove adjacent duplicates?
- For efficiency. If it were to work across non-adjacent lines it would need to keep the whole file in memory in some way to know whether it had already encountered a line. This would need considerable memory with very large files, and searching for duplicate lines would take much longer to run.
- You could use sort first in a pipe to sort the file contents so that duplicate lines are adjacent, e.g. sort salmon.txt | uniq
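To see the effect of sorting first, here is a small sketch that feeds the same lines to uniq with and without sort. The sample lines are assumptions standing in for the contents of salmon.txt.

```shell
# Hypothetical sample data in which the duplicates are not adjacent.
lines=$(printf 'coho\nsteelhead\ncoho\nsteelhead\n')

# uniq alone only collapses adjacent repeats, so nothing is removed here.
uniq_only=$(printf '%s\n' "$lines" | uniq)

# Sorting first brings the duplicates together so uniq can remove them.
sorted_uniq=$(printf '%s\n' "$lines" | sort | uniq)

echo "uniq alone:"
echo "$uniq_only"
echo "sort | uniq:"
echo "$sorted_uniq"
```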
Pipe reading comprehension
- cat animals.txt will output the contents of animals.txt.
- head -5 accepts the output from cat and outputs the first 5 lines of it.
- tail -3 accepts the 5 lines from head and outputs the last 3 of them.
- sort -r accepts the 3 lines from tail and outputs them in reverse sort order.
- > final.txt takes the output from sort and redirects it into a file called final.txt.
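The whole pipeline can be traced on a small input. In this sketch the contents of animals.txt are invented for illustration, and head -n 5 and tail -n 3 are used as the portable spellings of head -5 and tail -3.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical contents for animals.txt, one animal per line.
printf '%s\n' deer rabbit raccoon fox bear wolf owl > animals.txt

# First 5 lines, then the last 3 of those, then reverse sorted, into a file.
cat animals.txt | head -n 5 | tail -n 3 | sort -r > final.txt

final=$(cat final.txt)
echo "$final"
```

With this input, head keeps deer to bear, tail keeps raccoon, fox and bear, and sort -r leaves them in the order raccoon, fox, bear.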
Shell Scripts
Variables in shell scripts
- Incorrect, since -1 is passed to head in the script it will output the first line of each .pdb file, whilst the -1 passed to tail will output the last line of each .pdb file.
- Correct.
- Incorrect, since *.pdb is passed into the script and used by head and tail, so only .pdb files will be used.
- Incorrect, since the quotes only mean that *.pdb will be passed into the script without expansion.
Script reading comprehension
- Script 1 will output a list of files that match the *.* pattern, i.e. fructose.dat, glucose.dat, and sucrose.dat.
- Script 2 will take in three arguments on the command line and print out the contents of each of them.
- Script 3 will print out all arguments as passed to the script on a single line and append .dat to that output.
Loops
Variables in Loops
- The first loop will present “fructose.dat glucose.dat sucrose.dat” three times, since we are running ls *.dat three separate times - we’re not making use of the loop variable $datafile. The second loop will produce “fructose.dat”, “glucose.dat”, and “sucrose.dat” (each on a separate line) since we’re passing $datafile to ls.
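The two loops can be compared side by side in a scratch directory. This is a sketch assuming three empty files with the names used above; because the output is captured rather than printed to a terminal, ls writes one filename per line in both cases.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch fructose.dat glucose.dat sucrose.dat

# First loop: ignores the loop variable and lists every .dat file each time.
first=$(for datafile in *.dat; do ls *.dat; done)

# Second loop: lists only the file currently held in the loop variable.
second=$(for datafile in *.dat; do ls "$datafile"; done)

echo "first loop:"
echo "$first"
echo "second loop:"
echo "$second"
```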
Saving to a File in a Loop - Part One
- Correct.
- Incorrect, since we’re using the > redirect operator, which will overwrite any previous contents of xylose.dat.
- Incorrect, since the file xylose.dat would not have existed when *.dat was expanded.
- Incorrect.
Saving to a File in a Loop - Part Two
- Correct.
- Incorrect, since we’re looping through each of the other .dat files (fructose.dat and glucose.dat) whose contents would also be included.
- Incorrect, since maltose.txt has a .txt extension and not a .dat extension, so won’t match on *.dat and won’t be included in the loop.
- Incorrect, since the >> operator redirects all output to the sugar.dat file, so we won’t see any screen output.
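These points can be checked with a small experiment. The sketch below assumes a loop of the form cat $datafile >> sugar.dat over *.dat, and the one-line file contents are invented for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical one-line data files.
echo "fructose data" > fructose.dat
echo "glucose data" > glucose.dat
echo "sucrose data" > sucrose.dat
echo "maltose data" > maltose.txt

# *.dat is expanded once, before sugar.dat exists, so the loop visits
# only the three original .dat files. Anything the loop prints is captured.
screen_output=$(for datafile in *.dat; do cat "$datafile" >> sugar.dat; done)

sugar_lines=$(wc -l < sugar.dat)
echo "lines in sugar.dat: $sugar_lines"
echo "screen output: '$screen_output'"
```

The loop prints nothing itself, sugar.dat ends up with one line from each .dat file, and nothing from maltose.txt is included.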
Doing a dry run
- Version 2 is the one that successfully acts as a dry run. In version 1, since the > file redirect is not within quotes, the script will create three files analyzed-basilisk.dat, analyzed-minotaur.dat, and analyzed-unicorn.dat, which is not what we want.
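The effect of the quotes can be confirmed by counting files after each version. This sketch assumes the two versions differ only in whether the redirect sits inside the quoted echo text; analyze is a hypothetical command that is only echoed here and never run.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch basilisk.dat minotaur.dat unicorn.dat

# Version 2 style: the whole command is quoted, so echo only prints it.
dry_run=$(for file in *.dat; do echo "analyze $file > analyzed-$file"; done)
files_after_dry_run=$(ls | wc -l)

# Version 1 style: the redirect is outside the quotes, so the shell
# creates analyzed-*.dat files that hold the echoed text.
for file in *.dat; do echo analyze "$file" > "analyzed-$file"; done
files_after_v1=$(ls | wc -l)

echo "$dry_run"
echo "files after version 2: $files_after_dry_run"
echo "files after version 1: $files_after_v1"
```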
Finding Things
Using grep
- Incorrect, since it will find lines that contain of even where it is not a complete word, such as “Software is like that.”
- Incorrect. -E (which enables extended regular expressions in grep) won’t change the behaviour, since the given pattern uses no special regular expression syntax. So the results will be the same as 1.
- Correct, since we have supplied -w to indicate that we are looking for a complete word, hence only “and the presence of absence:” is found.
- Incorrect. -i indicates we wish to do a case insensitive search, which isn’t required. The results are the same as 1.
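The difference that -w makes can be reproduced on a few lines. In this sketch haiku.txt is a stand-in file built from the two lines quoted above plus one extra line; its exact contents are an assumption.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for haiku.txt (contents assumed for illustration).
printf '%s\n' 'and the presence of absence:' \
              'Software is like that.' \
              'Yesterday it worked' > haiku.txt

# Plain search: also matches the "of" inside "Software".
plain=$(grep "of" haiku.txt)

# Whole-word search: matches only the standalone word "of".
whole_word=$(grep -w "of" haiku.txt)

echo "grep of:"
echo "$plain"
echo "grep -w of:"
echo "$whole_word"
```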
find pipeline reading comprehension
- Find all files (in this directory and all subdirectories) that have a filename that ends in .dat, count the number of files found, and sort the result. Note that the sort here is unnecessary, since it is only sorting one number.
Matching ose.dat but not temp
- Incorrect, since the first grep will find all filenames that contain ose wherever it may occur, and also because using grep as a following pipe command will only match on the filenames output by find and not their contents.
- Incorrect, since it will only find those files that match ose.dat exactly, and also because using grep as a following pipe command will only match on the filenames output by find and not their contents.
- Correct. It first executes the find command to find those files matching the '*ose.dat' pattern, which matches exactly those whose names end in ose.dat. grep then searches the contents of those files for “temp” and reports only the lines that don’t contain it, since the -v flag inverts the match.
- Incorrect.
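A small experiment shows how find and grep -v combine. This sketch assumes the correct answer has the form grep -v "temp" $(find ... -name '*ose.dat'); the files and their contents are invented, and the unquoted $(find ...) relies on the filenames containing no spaces.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical files: only the first two have names ending in ose.dat.
printf 'temp 20\n' > glucose.dat
printf 'mass 5\n' > sucrose.dat
printf 'mass 7\n' > notes.txt

# find selects the files by name; grep -v then prints, per file,
# the lines that do not contain "temp".
result=$(grep -v "temp" $(find . -name '*ose.dat'))

echo "$result"
```

Only the line from sucrose.dat is printed, prefixed with its filename. notes.txt is never searched, and the only line in glucose.dat contains temp.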
Additional Exercises
Copying files with new filenames
- Assuming the output directory is named copied:
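# Today's date in day-month-year form
today_date=$(date +"%d-%m-%y")
for file in data/*.csv
do
    # Strip the leading directory to get just the filename
    base_file=$(basename "$file")
    # Quote the variables so filenames containing spaces still work
    cp "$file" "copied/$today_date-$base_file"
done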
Filtering our output
- The Max_temp_jul_F column is the fourth column in each data file.
- Assuming the input directory is named copied and the output directory is named filtered:
for file in copied/*.csv
do
    base_file=$(basename "$file")
    # Keep only the fourth comma-separated column; cut can read the file directly
    cut -d"," -f 4 "$file" > "filtered/$base_file"
done