LINUX - awk - use on data to add up columnar data

HOW TO USE AWK TO BETTER YOUR LIFE
##################################

NOTE: This may look like an intimidating read because of its length, but most of that length comes from me pasting the same data over and over again so that you can see each step clearly.


Good examples:
http://www.thegeekstuff.com/2010/01/awk-introduction-tutorial-7-awk-print-examples/
http://www.gnu.org/software/gawk/manual/html_node/Print-Examples.html
http://www.kossboss.com/linux---awk---make-searched-lines-standout
http://www.kossboss.com/linux---delimit-stuff

Here are the steps I counted during the month of October for a race challenge. I want to add up all of the steps, but instead of doing it manually I want Linux to do it.

DATE::: STEPS
Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 2 - Wednesday::: 7000
Octo 3 - Thursday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer - heavy workout for three housr
Octo 5 - Saturday::: 9000 - heavy workout for three hours
Octo 6 - Sunday::: 4000 - Steps
Octo 7 - Monday::: 8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
Octo 8 - Tuesday::: 6025
Octo 9 - Wednesday::: 6608
octo 10 - Thursday::: 6120
octo 11 - Friday::: 6185
octo 12 - Saturday::: 3452
octo 13 - Sunday::: 1858
octo 14 - Monday::: 5675
octo 15 - Tuesday::: 2227
octo 16 - Wednesday::: 4444
octo 17 - Thursday::: 16607
octo 18 - Friday::: 7052
octo 19 - Saturday::: 6000
octo 20 - Sunday::: 5321
octo 21 - Monday::: 6205
octo 22 - Tuesday::: 8007
octo 23 - Wednesday::: 11234
octo 24 - Thursday::: 7502
octo 25 - Friday::: 7111
octo 26 - Saturday::: 6523
octo 27 - Sunday::: 4342
octo 28 - Monday::: 9542
octo 29 - Tuesday::: 3212
octo 30 - Wednesday::: 5600
octo 31 - Thursday::: 19324
nove 01 - Friday::: 22145
nove 02 - Saturday::: 20541
nove 03 - Sunday::: 6523
nove 04 - Monday::: 5785

I will just jump to the final command that does it all (select everything from the echo down to the last line; in the later examples that start with a # prompt, make sure not to include the #):

echo "Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 2 - Wednesday::: 7000
Octo 3 - Thursday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer - heavy workout for three housr
Octo 5 - Saturday::: 9000 - heavy workout for three hours
Octo 6 - Sunday::: 4000 - Steps
Octo 7 - Monday::: 8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
Octo 8 - Tuesday::: 6025
Octo 9 - Wednesday::: 6608
octo 10 - Thursday::: 6120
octo 11 - Friday::: 6185
octo 12 - Saturday::: 3452
octo 13 - Sunday::: 1858
octo 14 - Monday::: 5675
octo 15 - Tuesday::: 2227
octo 16 - Wednesday::: 4444
octo 17 - Thursday::: 16607
octo 18 - Friday::: 7052
octo 19 - Saturday::: 6000
octo 20 - Sunday::: 5321
octo 21 - Monday::: 6205
octo 22 - Tuesday::: 8007
octo 23 - Wednesday::: 11234
octo 24 - Thursday::: 7502
octo 25 - Friday::: 7111
octo 26 - Saturday::: 6523
octo 27 - Sunday::: 4342
octo 28 - Monday::: 9542
octo 29 - Tuesday::: 3212
octo 30 - Wednesday::: 5600
octo 31 - Thursday::: 19324
nove 01 - Friday::: 22145
nove 02 - Saturday::: 20541
nove 03 - Sunday::: 6523
nove 04 - Monday::: 5785" | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}' | awk '{total=total+$1;print $1, " - CUMULATIVE:", total;}'

OUTPUT:
########

8000  - CUMULATIVE: 8000
6000  - CUMULATIVE: 14000
7000  - CUMULATIVE: 21000
6000  - CUMULATIVE: 27000
9000  - CUMULATIVE: 36000
9000  - CUMULATIVE: 45000
4000  - CUMULATIVE: 49000
8500  - CUMULATIVE: 57500
6025  - CUMULATIVE: 63525
6608  - CUMULATIVE: 70133
6120  - CUMULATIVE: 76253
6185  - CUMULATIVE: 82438
3452  - CUMULATIVE: 85890
1858  - CUMULATIVE: 87748
5675  - CUMULATIVE: 93423
2227  - CUMULATIVE: 95650
4444  - CUMULATIVE: 100094
16607  - CUMULATIVE: 116701
7052  - CUMULATIVE: 123753
6000  - CUMULATIVE: 129753
5321  - CUMULATIVE: 135074
6205  - CUMULATIVE: 141279
8007  - CUMULATIVE: 149286
11234  - CUMULATIVE: 160520
7502  - CUMULATIVE: 168022
7111  - CUMULATIVE: 175133
6523  - CUMULATIVE: 181656
4342  - CUMULATIVE: 185998
9542  - CUMULATIVE: 195540
3212  - CUMULATIVE: 198752
5600  - CUMULATIVE: 204352
19324  - CUMULATIVE: 223676
22145  - CUMULATIVE: 245821
20541  - CUMULATIVE: 266362
6523  - CUMULATIVE: 272885
5785  - CUMULATIVE: 278670

So I have 278670 steps total

So the breakdown:

THE MISSION:
############

We need to get all of the numbers alone. We can do that with an echo into an awk statement (or 2 awk statements if there is useless data left over after the first awk). Awk can be used to separate data out kind of like 'grep' and 'cut' can, so in other words it can pick columns out of lines. Then, after we have the numbers alone, a final awk adds them up.

Here is the syntax

echo "text" | awk commands to select the numbers only | awk to add up the numbers

The reason for the echo is so that we can pump text into awk. Usually people use echo for that, or they use another command before the awk that outputs the data they need. We have the data in notepad, so we can copy it into an echo statement (more on this later in a sidenote, specifically on an easy way to echo a multiline data string such as this). So essentially all the echo does is prepare the text to be input into awk.
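If retyping a long echo gets tedious, the same data can also come from a file. Here is a minimal sketch of that idea, assuming a hypothetical file name steps.txt and only three of the rows from above (the temporary directory is just there so the sketch cleans up after itself):

```shell
# steps.txt is a hypothetical file name used only for this sketch.
tmpdir=$(mktemp -d)
cat > "$tmpdir/steps.txt" <<'EOF'
Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer
EOF

# awk can read a file directly, so no echo is needed.
from_file=$(awk -F ':::' '{ print $2 }' "$tmpdir/steps.txt")

# Or pipe the file in from another command, exactly as with echo.
from_pipe=$(cat "$tmpdir/steps.txt" | awk -F ':::' '{ print $2 }')

printf '%s\n' "$from_file"
rm -r "$tmpdir"
```

Both ways hand awk the same lines, so pick whichever is less typing for the data you have.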

BEGINNING THE AWESOMENESS (The examples begin):
###############################################

The first awk command

echo "TEXT" | awk -F ':::' '{ print $2; }'


The -F sets the field separator, which here splits every line on the ::: string. After that comes '{ print $2; }', and this is the power of awk. Note that without the -F, the default field separator is whitespace (spaces and tabs).

This is a little mini-awk program that tells awk what to do with each line. These little mini programs run once per line (not on a whole page or all of the text at once, just line by line, and that's why it's nice).

SIDENOTE: the awk mini programs like to be enclosed by single quotes

So the above will select everything to the right of the :::

Think of it like this. Everything gets split up by the ::: symbols with the -F option. Think of the split function in programming. print $0 would print the whole line, print $1 would print the first field, which is to the left of the :::, and what we need is to the right of the :::, so that's print $2

Here is how it would work on 4 columns separated by ::: (not our case, this is just for example's sake)

COL 1 ::: COL 2 ::: COL 3 ::: COL 4
print $1; ::: print $2; ::: print $3; ::: print $4;

So if you haven't got it yet, the dollar sign number refers to different column locations (separated by the field separator; remember that by default it's spaces/tabs), with the exception that $0 is the whole line. So $1 means the first column, etc.
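To see the column numbering in action, here is a small sketch run on that 4 column example line. The shell variable names are only for illustration, and NF is awk's built-in count of fields on the current line, which the walkthrough itself does not use:

```shell
line='COL 1 ::: COL 2 ::: COL 3 ::: COL 4'

whole=$(echo "$line" | awk -F ':::' '{ print $0 }')   # the whole line
first=$(echo "$line" | awk -F ':::' '{ print $1 }')   # "COL 1 "
third=$(echo "$line" | awk -F ':::' '{ print $3 }')   # " COL 3 "
count=$(echo "$line" | awk -F ':::' '{ print NF }')   # number of fields

echo "first=[$first] third=[$third] fields=$count"
```

Notice the spaces around ::: stay attached to the fields, because only the ::: itself is the separator.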
 
So after that the output is like this:

FIRST AWK:
##########

SIDENOTE: How did I construct a multiline command so perfectly? First type 'echo "', without the single quotes, then paste the text in with a right click in PuTTY, then end the echo command with ". You can test that it worked correctly by pressing Enter; you should see the same text thrown right back on the screen. Hit the up arrow to get the full command back.

# echo "Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 2 - Wednesday::: 7000
Octo 3 - Thursday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer - heavy workout for three housr
Octo 5 - Saturday::: 9000 - heavy workout for three hours
Octo 6 - Sunday::: 4000 - Steps
Octo 7 - Monday::: 8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
Octo 8 - Tuesday::: 6025
Octo 9 - Wednesday::: 6608
octo 10 - Thursday::: 6120
octo 11 - Friday::: 6185
octo 12 - Saturday::: 3452
octo 13 - Sunday::: 1858
octo 14 - Monday::: 5675
octo 15 - Tuesday::: 2227
octo 16 - Wednesday::: 4444
octo 17 - Thursday::: 16607
octo 18 - Friday::: 7052
octo 19 - Saturday::: 6000
octo 20 - Sunday::: 5321
octo 21 - Monday::: 6205
octo 22 - Tuesday::: 8007
octo 23 - Wednesday::: 11234
octo 24 - Thursday::: 7502
octo 25 - Friday::: 7111
octo 26 - Saturday::: 6523
octo 27 - Sunday::: 4342
octo 28 - Monday::: 9542
octo 29 - Tuesday::: 3212
octo 30 - Wednesday::: 5600
octo 31 - Thursday::: 19324
nove 01 - Friday::: 22145
nove 02 - Saturday::: 20541
nove 03 - Sunday::: 6523
nove 04 - Monday::: 5785" | awk -F ':::' '{ print $2; }'

OUTPUT OF FIRST AWK:
####################

8000
6000
7000
6000
9000 - Lost pedometer - heavy workout for three housr
9000 - heavy workout for three hours
4000 - Steps
8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
6025
6608
6120
6185
3452
1858
5675
2227
4444
16607
7052
6000
5321
6205
8007
11234
7502
7111
6523
4342
9542
3212
5600
19324
22145
20541
6523
5785

So notice we get all of the numbers but all of the comment text as well. Notice that the comments are separated from the numbers with -. Luckily, in real life most data is nicely separated, even if in weird ways, and usually it keeps the same separator throughout. In this case, though, I used ::: before the numbers and then - before the comments.

So now we need everything to the left of the -, so that's print $1;

So repeat the command from above and tack on another pipe into a new awk with the print $1 and the field separator being the -

So the second awk command is:

awk -F '-' '{print $1;}'

Notice it's not really picky about spacing inside of the mini program, because awk has a C/C++/Java-like syntax

SECOND AWK:
###########
 
# echo "Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 2 - Wednesday::: 7000
Octo 3 - Thursday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer - heavy workout for three housr
Octo 5 - Saturday::: 9000 - heavy workout for three hours
Octo 6 - Sunday::: 4000 - Steps
Octo 7 - Monday::: 8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
Octo 8 - Tuesday::: 6025
Octo 9 - Wednesday::: 6608
octo 10 - Thursday::: 6120
octo 11 - Friday::: 6185
octo 12 - Saturday::: 3452
octo 13 - Sunday::: 1858
octo 14 - Monday::: 5675
octo 15 - Tuesday::: 2227
octo 16 - Wednesday::: 4444
octo 17 - Thursday::: 16607
octo 18 - Friday::: 7052
octo 19 - Saturday::: 6000
octo 20 - Sunday::: 5321
octo 21 - Monday::: 6205
octo 22 - Tuesday::: 8007
octo 23 - Wednesday::: 11234
octo 24 - Thursday::: 7502
octo 25 - Friday::: 7111
octo 26 - Saturday::: 6523
octo 27 - Sunday::: 4342
octo 28 - Monday::: 9542
octo 29 - Tuesday::: 3212
octo 30 - Wednesday::: 5600
octo 31 - Thursday::: 19324
nove 01 - Friday::: 22145
nove 02 - Saturday::: 20541
nove 03 - Sunday::: 6523
nove 04 - Monday::: 5785" | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}'

OUTPUT OF SECOND AWK:
#####################

8000
6000
7000
6000
9000
9000
4000
8500
6025
6608
6120
6185
3452
1858
5675
2227
4444
16607
7052
6000
5321
6205
8007
11234
7502
7111
6523
4342
9542
3212
5600
19324
22145
20541
6523
5785

So there we go, we have all of the numbers, one per line. This was the first mission. Now we write a program that cumulatively adds them and displays the output.

Here is the awk statement for that:

awk '{total=total+$1;print $1, " - CUMULATIVE:", total;}'

$1 refers to column 1. We could have used $0, since by this point the whole line is just the number as well. Which brings up the point that we could have done the adding in the previous awk command, where $1 already selects the numbers without the comments. However, I wanted each step to have clean output without comments, so I'm doing it here.
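As a sketch of that idea, the three awk stages can be folded into one. This version swaps in awk's split() function to cut the comment off, which is not what the walkthrough above does; the array name parts and the variable steps are just illustrative choices. Shown on four sample rows:

```shell
data='Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer
Octo 6 - Sunday::: 4000 - Steps'

result=$(echo "$data" | awk -F ':::' '{
    split($2, parts, "-")      # parts[1] is everything left of the first "-"
    steps = parts[1] + 0       # adding 0 forces a number and drops the spaces
    total = total + steps
    print steps, " - CUMULATIVE:", total
}')

echo "$result"
```

The three separate awks are easier to follow while learning, so treat this as an alternative rather than a replacement.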

So the final command is like this

echo "TEXT" | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}' | awk '{total=total+$1;print $1, " - CUMULATIVE:", total;}'

Explanation of each one again:

The echo is for the awk to have some input, which we use our data to input into the awk | the first awk selects out the numbers | the second awk makes a clean final selection of the numbers so all we have left is numbers | the final awk adds everything up cumulatively

The final output looks like so
 
THIRD AWK:
##########
 
# echo "Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 2 - Wednesday::: 7000
Octo 3 - Thursday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer - heavy workout for three housr
Octo 5 - Saturday::: 9000 - heavy workout for three hours
Octo 6 - Sunday::: 4000 - Steps
Octo 7 - Monday::: 8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
Octo 8 - Tuesday::: 6025
Octo 9 - Wednesday::: 6608
octo 10 - Thursday::: 6120
octo 11 - Friday::: 6185
octo 12 - Saturday::: 3452
octo 13 - Sunday::: 1858
octo 14 - Monday::: 5675
octo 15 - Tuesday::: 2227
octo 16 - Wednesday::: 4444
octo 17 - Thursday::: 16607
octo 18 - Friday::: 7052
octo 19 - Saturday::: 6000
octo 20 - Sunday::: 5321
octo 21 - Monday::: 6205
octo 22 - Tuesday::: 8007
octo 23 - Wednesday::: 11234
octo 24 - Thursday::: 7502
octo 25 - Friday::: 7111
octo 26 - Saturday::: 6523
octo 27 - Sunday::: 4342
octo 28 - Monday::: 9542
octo 29 - Tuesday::: 3212
octo 30 - Wednesday::: 5600
octo 31 - Thursday::: 19324
nove 01 - Friday::: 22145
nove 02 - Saturday::: 20541
nove 03 - Sunday::: 6523
nove 04 - Monday::: 5785" | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}' | awk '{total=total+$1;print $1, " - CUMULATIVE:", total;}'

THIRD AWK OUTPUT:
#################

8000  - CUMULATIVE: 8000
6000  - CUMULATIVE: 14000
7000  - CUMULATIVE: 21000
6000  - CUMULATIVE: 27000
9000  - CUMULATIVE: 36000
9000  - CUMULATIVE: 45000
4000  - CUMULATIVE: 49000
8500  - CUMULATIVE: 57500
6025  - CUMULATIVE: 63525
6608  - CUMULATIVE: 70133
6120  - CUMULATIVE: 76253
6185  - CUMULATIVE: 82438
3452  - CUMULATIVE: 85890
1858  - CUMULATIVE: 87748
5675  - CUMULATIVE: 93423
2227  - CUMULATIVE: 95650
4444  - CUMULATIVE: 100094
16607  - CUMULATIVE: 116701
7052  - CUMULATIVE: 123753
6000  - CUMULATIVE: 129753
5321  - CUMULATIVE: 135074
6205  - CUMULATIVE: 141279
8007  - CUMULATIVE: 149286
11234  - CUMULATIVE: 160520
7502  - CUMULATIVE: 168022
7111  - CUMULATIVE: 175133
6523  - CUMULATIVE: 181656
4342  - CUMULATIVE: 185998
9542  - CUMULATIVE: 195540
3212  - CUMULATIVE: 198752
5600  - CUMULATIVE: 204352
19324  - CUMULATIVE: 223676
22145  - CUMULATIVE: 245821
20541  - CUMULATIVE: 266362
6523  - CUMULATIVE: 272885
5785  - CUMULATIVE: 278670

So I have 278670 steps total

So there is the answer...

Bonus... so in awk I explained how a miniprogram runs once for every line. That's why I needed the numbers to be separated out for the addition. Well, with awk you can also have several miniprograms running at different points of the sequence. For example, I can have a miniprogram for the beginning, before any of the text is processed (which we could have used to initialize the total value to 0, but luckily that isn't required in awk: a variable that has never been set counts as 0 in arithmetic, so total was ready the second we used it). Then I can have the regular miniprogram that is processed for each line. Then I can have a miniprogram that runs at the very end of the entire text (which we could have used to display the final total). Also, we can have a miniprogram run only when it finds certain lines, using a grep-like mechanism to find lines and run the miniprogram of choice when it does find that line.

AWK KEYWORDS FOR ACTIONS: 
==========================

BEGIN { Actions}
{ACTION} # Action for every line in a file (this is the regular action section)
/Find this/ {ACTION} # This is action for the line it finds - Case sensitive
END { Actions }

SIDENOTE: I will show examples later on of how to make this case insensitive
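Here is a tiny sketch with all four kinds of action in one awk program, so you can see when each one fires. The three data rows are a cut-down sample, and the variable name lines and the printed labels are made up for this illustration:

```shell
data='Octo 1 - Tuesday::: 6000
octo 10 - Thursday::: 6120
nove 01 - Friday::: 22145'

result=$(echo "$data" | awk -F ':::' '
BEGIN       { print "START" }                       # runs once, before any input
            { lines++ }                             # runs on every line
/octo/      { print "lowercase hit:", $1 }          # runs only on matching lines
END         { print "END after", lines, "lines" }   # runs once, after all input
')

echo "$result"
```

Only the octo 10 row triggers the /octo/ action, because the match is case sensitive and Octo 1 starts with a capital O.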

So for example in our case we could do this:

echo "TEXT"  | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total;} END {print "TOTAL NUMBER OF STEPS", total}'

Or here it is in final working status:

FINAL (just kidding there are a few extra bonus examples after this)
#####################################################################

# echo "Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 2 - Wednesday::: 7000
Octo 3 - Thursday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer - heavy workout for three housr
Octo 5 - Saturday::: 9000 - heavy workout for three hours
Octo 6 - Sunday::: 4000 - Steps
Octo 7 - Monday::: 8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
Octo 8 - Tuesday::: 6025
Octo 9 - Wednesday::: 6608
octo 10 - Thursday::: 6120
octo 11 - Friday::: 6185
octo 12 - Saturday::: 3452
octo 13 - Sunday::: 1858
octo 14 - Monday::: 5675
octo 15 - Tuesday::: 2227
octo 16 - Wednesday::: 4444
octo 17 - Thursday::: 16607
octo 18 - Friday::: 7052
octo 19 - Saturday::: 6000
octo 20 - Sunday::: 5321
octo 21 - Monday::: 6205
octo 22 - Tuesday::: 8007
octo 23 - Wednesday::: 11234
octo 24 - Thursday::: 7502
octo 25 - Friday::: 7111
octo 26 - Saturday::: 6523
octo 27 - Sunday::: 4342
octo 28 - Monday::: 9542
octo 29 - Tuesday::: 3212
octo 30 - Wednesday::: 5600
octo 31 - Thursday::: 19324
nove 01 - Friday::: 22145
nove 02 - Saturday::: 20541
nove 03 - Sunday::: 6523
nove 04 - Monday::: 5785"  | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total;} END {print "TOTAL NUMBER OF STEPS", total}'

OUTPUT:
########

COUNTING STEPS
8000  - CUMULATIVE: 8000
6000  - CUMULATIVE: 14000
7000  - CUMULATIVE: 21000
6000  - CUMULATIVE: 27000
9000  - CUMULATIVE: 36000
9000  - CUMULATIVE: 45000
4000  - CUMULATIVE: 49000
8500  - CUMULATIVE: 57500
6025  - CUMULATIVE: 63525
6608  - CUMULATIVE: 70133
6120  - CUMULATIVE: 76253
6185  - CUMULATIVE: 82438
3452  - CUMULATIVE: 85890
1858  - CUMULATIVE: 87748
5675  - CUMULATIVE: 93423
2227  - CUMULATIVE: 95650
4444  - CUMULATIVE: 100094
16607  - CUMULATIVE: 116701
7052  - CUMULATIVE: 123753
6000  - CUMULATIVE: 129753
5321  - CUMULATIVE: 135074
6205  - CUMULATIVE: 141279
8007  - CUMULATIVE: 149286
11234  - CUMULATIVE: 160520
7502  - CUMULATIVE: 168022
7111  - CUMULATIVE: 175133
6523  - CUMULATIVE: 181656
4342  - CUMULATIVE: 185998
9542  - CUMULATIVE: 195540
3212  - CUMULATIVE: 198752
5600  - CUMULATIVE: 204352
19324  - CUMULATIVE: 223676
22145  - CUMULATIVE: 245821
20541  - CUMULATIVE: 266362
6523  - CUMULATIVE: 272885
5785  - CUMULATIVE: 278670
TOTAL NUMBER OF STEPS 278670

LAST SIDE NOTE: Notice that the items in the print statement are separated with commas. Each comma makes awk put a space between the items in the output.
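A quick sketch of what the commas in print actually do. OFS is awk's built-in output field separator (a single space by default); the " | " value is just something I picked to make the effect visible:

```shell
with_commas=$(echo "8000" | awk '{ print $1, "steps" }')     # comma inserts OFS (a space)
no_commas=$(echo "8000" | awk '{ print $1 "steps" }')        # no comma: strings are joined
custom=$(echo "8000" | awk 'BEGIN { OFS = " | " } { print $1, "steps" }')

echo "$with_commas"   # 8000 steps
echo "$no_commas"     # 8000steps
echo "$custom"        # 8000 | steps
```

This is also why the cumulative output has two spaces after the number: one from the comma and one from the space at the start of the " - CUMULATIVE:" string.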

Let's use awk to also include more data, like average steps per day. That has to be calculated at the very end, so we will put that code in the END action. For an average you need a total (which we have) and a count of the events (number of days). I will count up the days as each line is processed in the REGULAR action section.

echo "TEXT" | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; count=0; avg=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total; count++;} END {print "TOTAL NUMBER OF STEPS:", total; avg=total/count; print "NUMBER OF DAYS:", count; print "STEPS PER DAY AVERAGE:", avg;}'

Note: only last awk changes
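One edge case worth guarding against: if no lines reach the last awk, count stays at 0 and total/count becomes a division by zero. Here is a hedged sketch of an END action that checks for that and uses printf to control the decimals. The shell function name average, the %.2f format and the NO DATA message are my own choices for illustration:

```shell
# Hypothetical helper: reads one number per line and prints the average.
average() {
    awk '{ total = total + $1; count++ }
         END {
             if (count > 0) {
                 printf "STEPS PER DAY AVERAGE: %.2f\n", total / count
             } else {
                 print "NO DATA"
             }
         }'
}

some=$(printf '8000\n6000\n7000\n' | average)
none=$(printf '' | average)

echo "$some"
echo "$none"
```

With plain print, awk picks its own number format for the average; printf lets you decide how many decimals you get.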

EXAMPLE WITH AVERAGE:
#####################

# echo "Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 2 - Wednesday::: 7000
Octo 3 - Thursday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer - heavy workout for three housr
Octo 5 - Saturday::: 9000 - heavy workout for three hours
Octo 6 - Sunday::: 4000 - Steps
Octo 7 - Monday::: 8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
Octo 8 - Tuesday::: 6025
Octo 9 - Wednesday::: 6608
octo 10 - Thursday::: 6120
octo 11 - Friday::: 6185
octo 12 - Saturday::: 3452
octo 13 - Sunday::: 1858
octo 14 - Monday::: 5675
octo 15 - Tuesday::: 2227
octo 16 - Wednesday::: 4444
octo 17 - Thursday::: 16607
octo 18 - Friday::: 7052
octo 19 - Saturday::: 6000
octo 20 - Sunday::: 5321
octo 21 - Monday::: 6205
octo 22 - Tuesday::: 8007
octo 23 - Wednesday::: 11234
octo 24 - Thursday::: 7502
octo 25 - Friday::: 7111
octo 26 - Saturday::: 6523
octo 27 - Sunday::: 4342
octo 28 - Monday::: 9542
octo 29 - Tuesday::: 3212
octo 30 - Wednesday::: 5600
octo 31 - Thursday::: 19324
nove 01 - Friday::: 22145
nove 02 - Saturday::: 20541
nove 03 - Sunday::: 6523
nove 04 - Monday::: 5785" | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; count=0; avg=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total; count++;} END {print "TOTAL NUMBER OF STEPS:", total; avg=total/count; print "NUMBER OF DAYS:", count; print "STEPS PER DAY AVERAGE:", avg;}'

OUTPUT:
#######

COUNTING STEPS
8000  - CUMULATIVE: 8000
6000  - CUMULATIVE: 14000
7000  - CUMULATIVE: 21000
6000  - CUMULATIVE: 27000
9000  - CUMULATIVE: 36000
9000  - CUMULATIVE: 45000
4000  - CUMULATIVE: 49000
8500  - CUMULATIVE: 57500
6025  - CUMULATIVE: 63525
6608  - CUMULATIVE: 70133
6120  - CUMULATIVE: 76253
6185  - CUMULATIVE: 82438
3452  - CUMULATIVE: 85890
1858  - CUMULATIVE: 87748
5675  - CUMULATIVE: 93423
2227  - CUMULATIVE: 95650
4444  - CUMULATIVE: 100094
16607  - CUMULATIVE: 116701
7052  - CUMULATIVE: 123753
6000  - CUMULATIVE: 129753
5321  - CUMULATIVE: 135074
6205  - CUMULATIVE: 141279
8007  - CUMULATIVE: 149286
11234  - CUMULATIVE: 160520
7502  - CUMULATIVE: 168022
7111  - CUMULATIVE: 175133
6523  - CUMULATIVE: 181656
4342  - CUMULATIVE: 185998
9542  - CUMULATIVE: 195540
3212  - CUMULATIVE: 198752
5600  - CUMULATIVE: 204352
19324  - CUMULATIVE: 223676
22145  - CUMULATIVE: 245821
20541  - CUMULATIVE: 266362
6523  - CUMULATIVE: 272885
5785  - CUMULATIVE: 278670
TOTAL NUMBER OF STEPS: 278670
NUMBER OF DAYS: 36
STEPS PER DAY AVERAGE: 7740.83

Finally, I know my stride is 2 feet long. So I can tell you that in 2013, between those listed days (Sept 30 to Nov 4), I walked a total of 278670 steps over 36 days, averaging about 7740 steps per day.

Using google to find out the miles 

7740 * 2 feet = ? miles

Or this formula

Feet / 5280 = Miles

Anyhow 15480 feet per day or 2.93 miles per day.

So for the whole time between those days I walked 105.557 Miles!!!

Interesting to note that if my stride were 1 inch longer, I would have walked about 4.4 miles farther over those 36 days:

278670 * 1 inches = ? miles
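The mile conversions can also be done in awk itself instead of Google, since a BEGIN action runs without needing any input. A minimal sketch using the totals from above; the variable names are mine, and the 2 foot stride is the figure stated earlier:

```shell
result=$(awk 'BEGIN {
    steps = 278670; days = 36; stride_ft = 2; ft_per_mile = 5280

    printf "miles total: %.3f\n", steps * stride_ft / ft_per_mile
    printf "miles per day: %.2f\n", (steps / days) * stride_ft / ft_per_mile
    printf "extra miles from 1 more inch: %.2f\n", (steps / 12) / ft_per_mile
}')

echo "$result"
```

The last line divides by 12 to turn one extra inch per step into feet before converting to miles.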

FINAL EXAMPLE:

Notice how I had days in November and September. What if I only wanted October? Well, we need to select out October as early as we can, which would be in the first awk. Now we can just do that with a grep at the very beginning, but I will show you awk... and grep...

ORIGINAL:
=========

echo "TEXT" | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; count=0; avg=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total; count++;} END {print "TOTAL NUMBER OF STEPS:", total; avg=total/count; print "NUMBER OF DAYS:", count; print "STEPS PER DAY AVERAGE:", avg;}'

ONLY OCTOBER WITH GREP:
======================

echo "TEXT" | grep -i 'oct' | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; count=0; avg=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total; count++;} END {print "TOTAL NUMBER OF STEPS:", total; avg=total/count; print "NUMBER OF DAYS:", count; print "STEPS PER DAY AVERAGE:", avg;}'

SIDENOTE: we use grep with -i (case insensitivity) because we have Octo and octo, and we want oct to find both. Also notice that with grep I used single quotes, but I could have used double quotes, or no quotes, in this example.

SIDENOTE: knowing when to use single quotes '' or double quotes "" in bash/linux is an interesting science all on its own. There are whole articles on it, but the main thing is to practice and try it either way; sometimes if you have an error, try switching. The main reason single quotes are used is that the shell doesn't expand variables inside them, so you can be certain whatever is inside single quotes will remain the same. That matters for awk, because inside double quotes the shell would try to expand $1 and $2 itself before awk ever sees them.
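Here is a small sketch of why the quoting matters for awk in particular. The shell variable month is hypothetical, and the -v option (which hands a shell value to awk as an awk variable) is not something the walkthrough above uses:

```shell
month='oct'   # hypothetical shell variable holding the text to search for

# Single quotes: the shell leaves $2 alone, so awk sees its own field reference.
# The shell value is handed over with -v instead of being expanded inside the program.
picked=$(echo 'Octo 8 - Tuesday::: 6025' | awk -F ':::' -v m="$month" 'tolower($0) ~ m { print $2 }')

# Double quotes: the shell expands $2 (empty here) before awk runs,
# so awk receives "{ print  }" and prints the whole line instead.
set --   # clear the positional parameters so the shell's $2 is empty
whole=$(echo 'Octo 8 - Tuesday::: 6025' | awk -F ':::' "{ print $2 }")

echo "$picked"
echo "$whole"
```

The second command does not fail, it just quietly does the wrong thing, which is exactly why these quoting mistakes are hard to spot.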

ONLY OCTOBER WITH AWK:
======================

echo "TEXT" | awk -F ':::' '/oct/{ print $2; }' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; count=0; avg=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total; count++;} END {print "TOTAL NUMBER OF STEPS:", total; avg=total/count; print "NUMBER OF DAYS:", count; print "STEPS PER DAY AVERAGE:", avg;}'

The above would be a problem because it would find only the octo and not the Octo entries... So we need to have it be case insensitive.

So this:

/oct/{ print $2; }

Is case sensitive and will only find the lowercase oct entries (so only the lines with octo, not the ones with Octo)

The solution is to convert each line to lower case first for the processing (its only for the processing not for the final output)

tolower($0) ~ /oct/ {print $2; }

This means: first convert the string (the current line, $0) to lower case with tolower($0), then match it with ~ against the pattern /oct/. The ~ operator is a regular expression match, not an exact comparison: it succeeds if oct appears anywhere in the line. An exact comparison with == would need the whole line to be just oct, but we have stuff following the oct.
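To make the difference between ~ and an exact compare concrete, here is a sketch on a single sample line. In awk a true comparison prints as 1 and a false one as 0; the shell variable names are only for illustration:

```shell
line='Octo 8 - Tuesday::: 6025'

# ~ succeeds if the pattern is found anywhere in the string (prints 1 for true).
match_anywhere=$(echo "$line" | awk '{ print (tolower($0) ~ /oct/) }')

# == compares the whole string, so it fails here (prints 0 for false).
exact=$(echo "$line" | awk '{ print (tolower($0) == "oct") }')

# Without tolower the capital O stops the lowercase pattern from matching.
case_sensitive=$(echo "$line" | awk '{ print ($0 ~ /oct/) }')

echo "$match_anywhere $exact $case_sensitive"   # 1 0 0
```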

With awk you can also do if statements using the above statement, like this:

{ if (tolower($0) ~ /oct/) {print $2;} }

Basically meaning: if the lowercase version of the current line has oct in it, then print the second column.

Here are the two working variations

Without if:

echo "TEXT" | awk -F ':::' 'tolower($0) ~ /oct/ {print $2; }' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; count=0; avg=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total; count++;} END {print "TOTAL NUMBER OF STEPS:", total; avg=total/count; print "NUMBER OF DAYS:", count; print "STEPS PER DAY AVERAGE:", avg;}'

With if:

echo "TEXT" | awk -F ':::' '{ if (tolower($0) ~ /oct/) {print $2;} } ' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; count=0; avg=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total; count++;} END {print "TOTAL NUMBER OF STEPS:", total; avg=total/count; print "NUMBER OF DAYS:", count; print "STEPS PER DAY AVERAGE:", avg;}'

Side note: Notice Im using that find action with the //

WITH GREP COMMAND:
##################

# echo "Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 2 - Wednesday::: 7000
Octo 3 - Thursday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer - heavy workout for three housr
Octo 5 - Saturday::: 9000 - heavy workout for three hours
Octo 6 - Sunday::: 4000 - Steps
Octo 7 - Monday::: 8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
Octo 8 - Tuesday::: 6025
Octo 9 - Wednesday::: 6608
octo 10 - Thursday::: 6120
octo 11 - Friday::: 6185
octo 12 - Saturday::: 3452
octo 13 - Sunday::: 1858
octo 14 - Monday::: 5675
octo 15 - Tuesday::: 2227
octo 16 - Wednesday::: 4444
octo 17 - Thursday::: 16607
octo 18 - Friday::: 7052
octo 19 - Saturday::: 6000
octo 20 - Sunday::: 5321
octo 21 - Monday::: 6205
octo 22 - Tuesday::: 8007
octo 23 - Wednesday::: 11234
octo 24 - Thursday::: 7502
octo 25 - Friday::: 7111
octo 26 - Saturday::: 6523
octo 27 - Sunday::: 4342
octo 28 - Monday::: 9542
octo 29 - Tuesday::: 3212
octo 30 - Wednesday::: 5600
octo 31 - Thursday::: 19324
nove 01 - Friday::: 22145
nove 02 - Saturday::: 20541
nove 03 - Sunday::: 6523
nove 04 - Monday::: 5785" | grep -i  'oct' | awk -F ':::' '{ print $2; }' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; count=0; avg=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total; count++;} END {print "TOTAL NUMBER OF STEPS:", total; avg=total/count; print "NUMBER OF DAYS:", count; print "STEPS PER DAY AVERAGE:", avg;}'

WITH GREP OUTPUT:
#################

See below, it's the same as the output of the awk variations.

WITH AWK COMMAND - VARIATION 1:
###############################

# echo "Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 2 - Wednesday::: 7000
Octo 3 - Thursday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer - heavy workout for three housr
Octo 5 - Saturday::: 9000 - heavy workout for three hours
Octo 6 - Sunday::: 4000 - Steps
Octo 7 - Monday::: 8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
Octo 8 - Tuesday::: 6025
Octo 9 - Wednesday::: 6608
octo 10 - Thursday::: 6120
octo 11 - Friday::: 6185
octo 12 - Saturday::: 3452
octo 13 - Sunday::: 1858
octo 14 - Monday::: 5675
octo 15 - Tuesday::: 2227
octo 16 - Wednesday::: 4444
octo 17 - Thursday::: 16607
octo 18 - Friday::: 7052
octo 19 - Saturday::: 6000
octo 20 - Sunday::: 5321
octo 21 - Monday::: 6205
octo 22 - Tuesday::: 8007
octo 23 - Wednesday::: 11234
octo 24 - Thursday::: 7502
octo 25 - Friday::: 7111
octo 26 - Saturday::: 6523
octo 27 - Sunday::: 4342
octo 28 - Monday::: 9542
octo 29 - Tuesday::: 3212
octo 30 - Wednesday::: 5600
octo 31 - Thursday::: 19324
nove 01 - Friday::: 22145
nove 02 - Saturday::: 20541
nove 03 - Sunday::: 6523
nove 04 - Monday::: 5785" | awk -F ':::' 'tolower($0) ~ /oct/ {print $2; }' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; count=0; avg=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total; count++;} END {print "TOTAL NUMBER OF STEPS:", total; avg=total/count; print "NUMBER OF DAYS:", count; print "STEPS PER DAY AVERAGE:", avg;}'

OUTPUT WITH AWK - VARIATION 1:
##############################

Same output as awk below.

WITH AWK COMMAND - VARIATION 2 (with if):
#########################################

# echo "Sept 30 - Monday::: 8000
Octo 1 - Tuesday::: 6000
Octo 2 - Wednesday::: 7000
Octo 3 - Thursday::: 6000
Octo 4 - Friday::: 9000 - Lost pedometer - heavy workout for three housr
Octo 5 - Saturday::: 9000 - heavy workout for three hours
Octo 6 - Sunday::: 4000 - Steps
Octo 7 - Monday::: 8500 - Got new Pedometer (2000 steps on old before 2:07, after 2:07 6500 steps)
Octo 8 - Tuesday::: 6025
Octo 9 - Wednesday::: 6608
octo 10 - Thursday::: 6120
octo 11 - Friday::: 6185
octo 12 - Saturday::: 3452
octo 13 - Sunday::: 1858
octo 14 - Monday::: 5675
octo 15 - Tuesday::: 2227
octo 16 - Wednesday::: 4444
octo 17 - Thursday::: 16607
octo 18 - Friday::: 7052
octo 19 - Saturday::: 6000
octo 20 - Sunday::: 5321
octo 21 - Monday::: 6205
octo 22 - Tuesday::: 8007
octo 23 - Wednesday::: 11234
octo 24 - Thursday::: 7502
octo 25 - Friday::: 7111
octo 26 - Saturday::: 6523
octo 27 - Sunday::: 4342
octo 28 - Monday::: 9542
octo 29 - Tuesday::: 3212
octo 30 - Wednesday::: 5600
octo 31 - Thursday::: 19324
nove 01 - Friday::: 22145
nove 02 - Saturday::: 20541
nove 03 - Sunday::: 6523
nove 04 - Monday::: 5785"  | awk -F ':::' '{ if (tolower($0) ~ /oct/) {print $2;} } ' | awk -F - '{print $1;}' | awk 'BEGIN {total=0; count=0; avg=0; print "COUNTING STEPS";}{total=total+$1;print $1, " - CUMULATIVE:", total; count++;} END {print "TOTAL NUMBER OF STEPS:", total; avg=total/count; print "NUMBER OF DAYS:", count; print "STEPS PER DAY AVERAGE:", avg;}'

OUTPUT WITH AWK - VARIATION 2 (with if):
########################################

COUNTING STEPS
6000  - CUMULATIVE: 6000
7000  - CUMULATIVE: 13000
6000  - CUMULATIVE: 19000
9000  - CUMULATIVE: 28000
9000  - CUMULATIVE: 37000
4000  - CUMULATIVE: 41000
8500  - CUMULATIVE: 49500
6025  - CUMULATIVE: 55525
6608  - CUMULATIVE: 62133
6120  - CUMULATIVE: 68253
6185  - CUMULATIVE: 74438
3452  - CUMULATIVE: 77890
1858  - CUMULATIVE: 79748
5675  - CUMULATIVE: 85423
2227  - CUMULATIVE: 87650
4444  - CUMULATIVE: 92094
16607  - CUMULATIVE: 108701
7052  - CUMULATIVE: 115753
6000  - CUMULATIVE: 121753
5321  - CUMULATIVE: 127074
6205  - CUMULATIVE: 133279
8007  - CUMULATIVE: 141286
11234  - CUMULATIVE: 152520
7502  - CUMULATIVE: 160022
7111  - CUMULATIVE: 167133
6523  - CUMULATIVE: 173656
4342  - CUMULATIVE: 177998
9542  - CUMULATIVE: 187540
3212  - CUMULATIVE: 190752
5600  - CUMULATIVE: 196352
19324  - CUMULATIVE: 215676
TOTAL NUMBER OF STEPS: 215676
NUMBER OF DAYS: 31
STEPS PER DAY AVERAGE: 6957.29


FINAL WORDS
###########

With just a little C knowledge, and knowing the -F option and the action syntax, you can use awk to mine some good juicy stuff out of data.

Here are those action keywords again (you use this to run the miniprograms at different times of the code - where the miniprogram goes in the action field)

AWK KEYWORDS FOR CONDITIONAL ACTIONS: 
=====================================

SIDENOTE: For all of the below notes everything after the # and including the # are just comments/explanations

BEGIN { Actions}
{ACTION} # Action for every line in a file (this is the regular action section)
/Find this/ {ACTION} # This is action for the line it finds - case sensitive
END { Actions }

TWO MORE CONDITIONAL ACTIONs:
------------------------------

You can also run a function on the line before the /find this/ match, or put the match inside an if.

Like this:

tolower($0) ~ /find this/ {ACTION} # put a lowercase term between the // to get a case insensitive regex search

Likewise, you can do that with an if:

{if (tolower($0) ~ /find this/) {ACTION}}
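As a quick check that the two forms really do the same thing, here is a sketch that runs both on three sample rows and keeps the results in shell variables (the variable names are just for illustration):

```shell
data='Octo 1 - Tuesday::: 6000
nove 01 - Friday::: 22145
octo 10 - Thursday::: 6120'

pattern_form=$(echo "$data" | awk -F ':::' 'tolower($0) ~ /oct/ { print $2 }')
if_form=$(echo "$data" | awk -F ':::' '{ if (tolower($0) ~ /oct/) { print $2 } }')

echo "$pattern_form"
```

Both pick out the 6000 and 6120 rows and skip the November one, so which form you use is mostly a matter of taste.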

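Of course, a lot more conditional actions exist than just these. Also, by the way, I'm not sure if these are called conditional actions; I just named them that. (The gawk manual calls the part before the braces a pattern and the whole pattern plus action a rule.)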