{"id":235,"date":"2011-11-04T23:32:24","date_gmt":"2011-11-05T04:32:24","guid":{"rendered":"http:\/\/brainybehavior.com\/neuroimaging\/?p=235"},"modified":"2012-09-20T19:27:07","modified_gmt":"2012-09-21T00:27:07","slug":"awk-for-truncating-files","status":"publish","type":"post","link":"https:\/\/brainybehavior.com\/neuroimaging\/2011\/11\/awk-for-truncating-files\/","title":{"rendered":"Using awk to truncate text files"},"content":{"rendered":"<p>With neuroimaging analyses we deal with a lot of text files. These are often the result of programs or scripts running on the MRI data; they can be statistics, volumes, error logs, or any number of other outputs. Let&#8217;s say that we have a large text file with a lot of data in it. Let&#8217;s suppose that it is at least organized neatly into rows and columns, which means it could easily be imported into Excel or another spreadsheet application. FreeSurfer&#8217;s aseg.stats is a good example of this.<\/p>\n<p>For simplicity, let&#8217;s say that we have a 5&#215;5 matrix of data (it&#8217;s really a 6&#215;6 but we&#8217;ll just focus on the 25 numerical data values):<\/p>\n<p><code>\u00a0 \u00a0X1 X2 X3 X4 X5<br \/>\nY1 3000 5000 745 122 875<br \/>\nY2 942 400 263 558 991<br \/>\nY3 325 584 775 381 545<br \/>\nY4 654 336 272 883 235<br \/>\nY5 241 154 782 754 899<\/code><\/p>\n<p>Notice that the column headers do not always line up with the rest of the columns; this is nothing to worry about. If you have a file like this (or are creating one), it is helpful to have a blank new line at the end (i.e., a blank extra row). This can help with the display and processing of text files.<\/p>\n<p>Say you want to display the contents of the text file. An easy way to do this is with <code>cat {file}.txt<\/code>. This will quickly display the contents on the screen. 
If it&#8217;s a long text file, you should use <code>less {file}.txt<\/code> instead of <code>cat<\/code>.<\/p>\n<p>Here&#8217;s what this looks like with our sample file.<\/p>\n<p><a href=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/cat_screenshot_bash.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-237\" title=\"cat_screenshot_bash\" src=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/cat_screenshot_bash-300x249.png\" alt=\"\" width=\"300\" height=\"249\" srcset=\"https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/cat_screenshot_bash-300x249.png 300w, https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/cat_screenshot_bash.png 763w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a>If you want to number the lines, you can run <code>cat -n {file}.txt<\/code>, which results in a display like this:<\/p>\n<p><a href=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/cat_bash_numbered.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-238\" title=\"cat_bash_numbered\" src=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/cat_bash_numbered-300x249.png\" alt=\"\" width=\"300\" height=\"249\" srcset=\"https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/cat_bash_numbered-300x249.png 300w, https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/cat_bash_numbered.png 763w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a>Or, you can display the file with awk; this command prints a blank line and then every line of the file:\u00a0<code>awk 'FNR==1{print \"\"}1' {file}.txt<\/code><\/p>\n<p><a href=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_cat_bash.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-239\" title=\"awk_cat_bash\" 
src=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_cat_bash-300x249.png\" alt=\"\" width=\"300\" height=\"249\" srcset=\"https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_cat_bash-300x249.png 300w, https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_cat_bash.png 763w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a>Now, let&#8217;s say that instead of the entire file, you just want a row of values or a single value. Here&#8217;s a way to display just one row: <code>awk 'BEGIN { RS = \"\" ; FS = \"\\n\" } ; { print $2 }' {file}.txt<\/code><\/p>\n<p>Setting <code>RS = \"\"<\/code> makes awk read the file (up to the first blank line) as a single record, and <code>FS = \"\\n\"<\/code> makes each line of that record a field, so <code>{print $2}<\/code> prints the second row in the file. This could be changed to $3, $5, $8, or whatever you want, if it&#8217;s a valid row number in the file.<\/p>\n<p><a href=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_single_row.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-240\" title=\"awk_single_row\" src=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_single_row-300x249.png\" alt=\"\" width=\"300\" height=\"249\" srcset=\"https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_single_row-300x249.png 300w, https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_single_row.png 763w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a>You could also write that line to a new file with <code>&gt;<\/code> or send it to another command with the pipe <code>|<\/code>.<\/p>\n<p>Now, what if you want a column from the file?<\/p>\n<p><code>awk '{ print $3 }' {file}.txt<\/code><\/p>\n<p><a href=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_print_column.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-241\" title=\"awk_print_column\" 
src=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_print_column-300x249.png\" alt=\"\" width=\"300\" height=\"249\" srcset=\"https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_print_column-300x249.png 300w, https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_print_column.png 763w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a>Now, what if you want only a single value from the text file? I know there are shorter ways of doing this, but one way is to do a two-step awk command by combining the column and row commands I already demonstrated:<\/p>\n<p><code>awk 'BEGIN { RS = \"\" ; FS = \"\\n\" } ; { print $2 }' {file}.txt | tee -a file_tmp.txt;<br \/>\nawk '{ print $3; exit }' file_tmp.txt | tee -a subject_volume.txt;<br \/>\nrm file_tmp.txt<\/code><\/p>\n<p><a href=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_column_row_value.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-242\" title=\"awk_column_row_value\" src=\"http:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_column_row_value-300x249.png\" alt=\"\" width=\"300\" height=\"249\" srcset=\"https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_column_row_value-300x249.png 300w, https:\/\/brainybehavior.com\/neuroimaging\/wp-content\/uploads\/2011\/11\/awk_column_row_value.png 763w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>This displays the second row in the file and writes it to a temporary file; then it reads, displays, and saves the value from the 3rd column in the temp file. The &#8220;exit&#8221; part of the second command will limit the print command to a single value, should there be multiple values in the column (or row). Lastly, it deletes the temp file. 
In the screenshot above, I am showing the value in the second row, third column (counting the row and column labels). You could also run this with the column command first, if that&#8217;s easier.<\/p>\n<p>This can save you a lot of time if you have to pull out values from a lot of text files (e.g., aseg.stats) by using a for or while loop. I recently did this to pull values out of some text files. It would have taken hours to pull out the correct values from every participant by hand. With a script (series of &#8220;for&#8221; loops) it ran in seconds. Then, I had a file with all the values for all the participants, which was then imported into Excel.<\/p>\n<p>I am a novice user of awk so specific questions about it are best directed elsewhere. This is just what I&#8217;ve figured out by reading various guides online.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>With neuroimaging analyses we deal with a lot of text files. These are often the result of programs or scripts running on the MRI data; they can be statistics, volumes, error logs, or any number of other outputs. 
Let&#8217;s say &hellip; <a href=\"https:\/\/brainybehavior.com\/neuroimaging\/2011\/11\/awk-for-truncating-files\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":242,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8],"tags":[59,66,60],"_links":{"self":[{"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/posts\/235"}],"collection":[{"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/comments?post=235"}],"version-history":[{"count":9,"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/posts\/235\/revisions"}],"predecessor-version":[{"id":259,"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/posts\/235\/revisions\/259"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/media\/242"}],"wp:attachment":[{"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/media?parent=235"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/categories?post=235"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brainybehavior.com\/neuroimaging\/wp-json\/wp\/v2\/tags?post=235"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}