AWK has been used to create a full parser/tokenizer, and for other purposes that are arguably way more complex than its intended use. Using AWK you can also take advantage of pattern matching with regular expressions, and AWK has many other tools for text manipulation. But I think one of the most powerful aspects of AWK is using it as a complementary Unix tool. Use it together with other Unix command-line utilities in pipelines; not everything has to be done in AWK. For example, you can use AWK to parse out formatted words from a complex text file, and then pipe this data to be processed by another utility.
It's been a while since I've used it. First time I ever had to use it was in the early 2000s at a call centre. Their call detail records were gigabyte size and excel was struggling with it. AWK just crunched through the numbers and spit out the results in less than 5 minutes. Think I used SED initially, but AWK was the answer.
I can have a log file parsed with awk faster than that young'n with a spreadsheet. And reformatting an address file that has quotes drives Excel nuts. In awk you just change the FS variable. I might be a dinosaur but I would point out that the dinosaurs ruled the earth for 165m years and birds are still here.
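A minimal sketch of the FS trick mentioned above, assuming an address file in which every field is wrapped in double quotes. The sample record and the function name `swap_city_first` are made up for illustration, and the approach would not cope with fields that themselves contain `","`.

```shell
# Reorder the fields of a fully quoted address line.
# Assumed input shape (hypothetical): "name","street","city"
swap_city_first() {
  awk -F'","' '{
    gsub(/^"|"$/, "")          # drop the outer quotes; awk re-splits the record
    print $3 ": " $1 ", " $2   # the comma inside the street field survives
  }'
}

printf '"Ann Lee","12 Main St, Apt 4","Leeds"\n' | swap_city_first
```

Setting the separator to the three characters `","` is what keeps the comma inside the street address from being treated as a field boundary.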
I often use both Excel spreadsheets and awk in my work as a software engineer doing medical imaging research. Many times I will use these for organizing files / image selection where there are tens of thousands of files in a directory hierarchy and I need to create several hundred cases each containing a subset of the images. Sometimes cases have multiple readings where each reading adds a different imaging modality.
awk is useful as a very short/quick way to get at the Nth field in a text file, especially as (with the default field separator) it treats consecutive whitespace delimiters as one, which suits, for example, fixed-width input files. But where awk really shines is multi-levelled, line-delimited files, like old-style config files etc., as you don't need to write loops and keep flags about which section of the input file you're in.
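One way to read the config-file point, as a rough sketch: an INI-style file (the format, the sample data and the name `flatten_ini` are assumptions, not from the comment) flattened into `section.key=value` lines. The section name is still remembered in a variable, but the per-line loop and the matching are left to awk.

```shell
# Flatten an INI-style file into "section.key=value" lines.
flatten_ini() {
  awk -F'=' '
    # A [header] line: remember the section name and move on.
    /^\[.*\]$/ { section = substr($0, 2, length($0) - 2); next }
    # A key=value line: prefix it with the current section.
    NF == 2    { print section "." $1 "=" $2 }
  '
}

printf '[db]\nhost=localhost\nport=5432\n[web]\nport=8080\n' | flatten_ini
```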
I used to work for a telecoms company and at one point was involved in integrating our CMIP stack (millions of lines of C/C++ code) with network management solutions from vendors like IBM and HP etc. I remember the DEC solution to get their OSI network event logs into our stack was based around 5000 lines of AWK code. AWK is awesome.
OMG... after YEARS of programming I happen to learn such a trick to implement a round() function! ASTONISHING! Thanks! I used to use awk to only get a specific column, but that is such a nice tutorial on awk.
Same here! However, there is still an edge case (a not so uncommon one) that this implementation won't take into account: the x.5 case. Under round-half-to-even, when x is odd you are supposed to round up, and when x is even you are supposed to round down. This implementation, however, always rounds up.
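A sketch of the round-half-to-even rule this reply describes, written as an awk user-defined function. It is not the rounding code from the video; the names `round_even`, `f` and `r` are made up here, and it is only meant for values whose .5 fraction is exactly representable.

```shell
# Round each input number to the nearest integer, ties going to the even one.
round_even() {
  awk '
    function round_even(x,    f, r) {
      f = int(x)                       # int() truncates toward zero
      if (x < 0 && f != x) f--         # so step down to get a true floor
      r = x - f                        # fractional part, 0 <= r < 1
      if (r > 0.5) return f + 1
      if (r < 0.5) return f
      return (f % 2 == 0) ? f : f + 1  # exact .5: pick the even neighbour
    }
    { print round_even($1) }
  '
}

printf '0.5\n1.5\n2.5\n3.5\n2.4\n' | round_even
```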
Thanks for the info. Spreadsheets are more practical for me, and more practical for newbie experimentation, but AWK really bridges the gap between sheet logic and optimizing work. Plus, you get support from a lot more scripting capabilities. Great stuff!
@@paschikshehu7988 Systems engineer, but it's programmers who come to me for this, usually because they need simple parsing or data manipulation which doesn't take a lot of effort. Then, their programs run my AWK script. These programmers know Sed (which is even simpler than AWK), but their case is usually where Sed is too simplistic and using their programming language would be overkill.
Thanks Gary, that was simple and useful. I usually write small Python scripts for such data processing because I'm more fluent in it, but if it were CSVs or tab-delimited outputs (like from other shell programs) awk is just way simpler. Always wished I got some proper simple intro to it, now it's done :-) - perhaps you could make a follow-up episode or two? Thanks again!
I used awk in the early 1990’s as a developer. Not sure it is something that a non developer should really use. Good demo, I wrote complex code with Awk to parse flat files.
Not sure a non-developer should be using?! What's wrong with learning to use a tool? Should a non-sysadmin never open a command prompt because they're not an expert?! How do people become developers then? Any tool like this you can learn to use is a huge plus! Let's encourage experimentation and learning!
@@SteveJones172pilot - Totally agree with your sentiment on this. People will either be interested (or not) in doing this stuff. That will naturally weed out people that program (programmers/developers, etc.) from people that don't. I don't see any reason to have arbitrary mandates against writing AWK commands (or scripts) for people only because they are not *professional* developers. Anyhow, the 40+ year secret is out! 😏😆
Oddly, my very first AWK script was my most complex! Only a page and a half, but it replaced a 40-page SQR program that tried to parse CSV files (ugh) written by a hammer who thought every problem looked like a nail. After that I would use it in smaller piped sequences with several AWK commands like:
awk '{if ( FNR == 1 ){print "FILENAME=",FILENAME}; if (NF && $0 !~ /^--/){print}}' Database/*.sql \
| awk '/^FILENAME=/{files++}; $0!~/FILENAME=/ && NF {loc++}; END{avgloc=loc/files; printf "%d Database LoC in %d files at avg lines per file = %0.f\n", loc, files, avgloc}' > $countfile
(sorry about the look of the run-in line). For more complex problems, like ETL cases, etc., I just used Perl which was a natural progression from using Shell + AWK.
My ex-colleagues used to hate me writing awk scripts! Brilliant little language. One happy use was to take the output from informix commands to detail table descriptions and create an output shell script to recreate the database for disaster recovery purposes.
I love awk for text formatting and, arguably, informal reporting. Also admin scripts. Honestly, though, you can do all this and more with Perl, which I recommend.
Actually, grep was taken out of the line editor ed. The command in ed is g/RE/p (globally search for a regular expression and print the line). Hence "grep RE filename". nawk has more capabilities. BTW "awk" are the initials of Aho, Weinberger, and Kernighan, the developers who created awk.
I've been just doing this from any programming language I was learning when I get to the read and write files section of the documentation. Nice to see it can be done directly on the command line.
Just want you to know you saved my ass with this video. Procrastinated on an assignment for my CS class and this really helped me understand some stuff I'd missed and get the assignment done in time. Thanks a ton!
I won't go into the specifics but AWK holds a special place in my heart. I know that might sound a bit weird but it's true. Even though I've only ever had to use it in anger twice it was well worth learning just for them.
Thanks, I'm glad I clicked in. I never would have searched this out otherwise. I do use SED, GREP, and GVIM. The next time I have the opportunity, I'll have to try to apply these lessons.
awk one liners are great for ad hoc queries and I use it for that, but as soon as you go to scripting surely perl is the way to go? Or if you don't already know perl, then maybe Python which is more friendly for beginners?
Awk is great for awk-shaped problems (basically, report generation on files of simply-formatted ASCII data). If you have a different-shaped problem, don't use awk.
A long time ago when someone was telling me how wonderful Excel was, I simply said "ed, perl, tbl, troff" as in edit your data using ed (actually, I never use ed), process it with Perl (I don't know awk), and finally format it with troff using the tbl preprocessor.
Great video! I have always used grep to search strings in linux and never bothered to figure out what awk did.. This was a great introduction - Just what I need so that next time I have a use case I will remember this and figure out how to do it in awk!
I use awk/sed on a daily basis at work. I use AWK primarily to analyze Excel (exported to csv) or other data files for audits. That is on Windows! In either MINGW64 or WSL2 Linux.
Please can you make some more AWK videos Gary? I'm learning AWK at the moment and have spent a few days on it. It's hard to learn, but the payoff of knowing how to use it is worth the effort. This is a great video to get people into using it and seeing the power of it.
1) Learn a middling amount of 'C', K&R please, none of that C++/# crud. 2) Have a good understanding of regular expressions. 3) Realize that each input line is processed, in the order received, by the program statements that sit between BEGIN and END, and that those statements are tried from top to bottom. Process order can be important.
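Point 3 can be sketched roughly like this. The sample input and the name `count_lines` are made up; the idea is just to show BEGIN running once before any input, every rule being tried in order against each line, and END running once after the last line.

```shell
# Count input lines and how many of them contain the letter "a".
count_lines() {
  awk '
    BEGIN { print "start" }   # runs once, before any input is read
    /a/   { hits++ }          # rules are tried top to bottom for each line
          { lines++ }         # no pattern: runs for every line
    END   { printf "%d lines, %d with a\n", lines, hits }   # runs once at the end
  '
}

printf 'apple\nkiwi\nbanana\n' | count_lines
```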
Thanks, one of the better awk videos on youtube. I use awk scripts on files containing quite chaotic data that lacks the neat structure of csv and similar files. I feel the many comments here suggesting superiority of python, or even perl (which to some extent I agree with) for parsing file data might change once enlightened. To each his own, but my view is don't knock it til you try it. Being efficient with one scripting language does not preclude the possibility that you could be more proficient with another once mastered, especially one purpose-built to extract, manipulate, and reformat data.
I've used awk to extract useful information from pdf documents. The problem was that the information was awkwardly (pun intended) split into several tables throughout the document so I had to first process each table to collect up all the pieces for each element I wanted to output. The solution I came up with was pdftotext + awk to do the processing. The few hours I spent on that awk script have paid off nicely since I've had to reprocess new versions of the same document several times over the years. The alternatives would have been: a) Manually copy-paste all the information. One thing I've learned over the years is you *never* trust anything copy-pasted by a human (least of all myself)! It also would have been extremely tedious (which adds to the chance of making a mistake), and I would've had to repeat it whenever a new version of the document came out. b) Find some pdf library for my favorite programming language to extract more structured data from the document. Couldn't quickly find anything that worked and I didn't want to start debugging pdf libraries.
Another solution is to use a command-line pdf editing tool called pdftk. You can read out pdf files from the command line and even fill in pdf forms with it.
OMG @Gary, I'm from another country and I speak another language, but I understand and I'm learning a lot of things from your videos. Thanks so much for the explanation, and congratulations on your skills.
Thanks. Brought back some great memories of data manipulation of huge point cloud datasets on SGIs. We had to do very similar things before piping data into the OpenGL 3D engine for visualisation purposes. Awk is very flexible and fast and still has many use cases in today's system administration tasks.
I remember using Awk for extracting a column from a command result (using something like {print $1}), but I didn’t know that it could do much more than that.
GNU awk was the first scripting language I learned really well, and I wrote most of my early Bourne shell scripts as basically wrappers around huge chunks of awk code. Then I graduated to Perl, which is absolutely unmatched if you love regexes (I do!), and nowadays I write everything in Python if it's too much for a simple bash script. 😊 I still use awk and Perl daily for oneliners when I do data wrangling. The awk syntax is super comfortable for the things that it is good at. 👍🏻
@@peppigue you wouldn't call v a down arrow even though it's used that way sometimes < > can be less/greater than symbols or angle brackets or left/right arrows, but in a programming context you'd probably use the former ...except when it's a "shift left" operation in which case it'd make sense to call them arrows hm maybe left v and right v??
Very glad this video popped up on my feed! I've been working with data using sed, but after watching this I think awk is much more suited to me, especially knowing I can write my own functions that run faster than Bash can! Great video, thanks for the explanation!
You need to install and use gawk instead of awk, though, as you can then use its match function to match with regular expressions and then reference capture groups in the awk print command - this match is way better than just printing things that got separated into fields
Hi, I’m not a db admin but my feeling is spreadsheets are easier to use and they’re right in front of you. Databases need some kind of ui or they use the cli (inserts, selects). Please confirm/correct.
@@kencheng2929 hi Ken, you have a valid point. If you have a small set of data points to keep track of: then spreadsheets make sense. When you start to get into the 1000's+ then it's time to start looking into a database solution. Spreadsheets should be more for temporary data that has no long-term value. Like forecasting or basic customer metrics. =)
Currently assessing how to extract useful data from multiple differently formatted fuel receipts here. Found your lovely little primer video very helpful - thanks!
Just a question from a newbie: what can I do with that information? I came to your channel via ColdFusion and the graphene battery! Thanks in advance
You can manipulate information from files and extract what you want in the way you need. It is just pure formatted text being manipulated. No spreadsheets needed. Cheers!
You can redirect the formatted output of the awk script to another file. For example: I wanted to create a test file like the one used in the video but didn't know how to do it using only ls, so I used "ls -la /usr/bin > ls-output.txt", then used awk to select only the fields in the order I wanted with "awk '{print $9,$5}' ls-output.txt > ls-awk-output.txt". It's very handy for manipulating formatted text files like csv, config files, logs, program outputs, whatever you can imagine...
One example: Extract data from a not very user-friendly system, in a tab-delimited format. Convert it into SQL commands (using loads of "printf"). Run the generated SQL code to load the data in a database. AWK can be the glue between otherwise incompatible systems.
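A minimal sketch of that glue role, assuming a two-column tab-delimited export. The table name `users`, its columns and the function name `tsv_to_sql` are hypothetical, and the values are not escaped, so this would need more care with real data.

```shell
# Turn tab-separated "id<TAB>name" rows into SQL INSERT statements.
tsv_to_sql() {
  # \047 is the octal escape for a single quote inside an awk string.
  awk -F'\t' '{ printf "INSERT INTO users (id, name) VALUES (%d, \047%s\047);\n", $1, $2 }'
}

printf '1\tAda\n2\tBrian\n' | tsv_to_sql
```

The generated statements could then be piped straight into a database client instead of being printed.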
Hi Gary. Something I never considered before, thanks for the video. However, what does one do about non-computer text (i.e. words and documents)? And of course filenames in non-English characters like Thai or Chinese or Hebrew? It's one of the issues I still have with Excel. You'd expect it to be standard and old-hat (at least since the early 2000's because of things like Unicode, especially since multilingual computing and "plug-and-play" was more or less established by 1990) - but I still find I can't always save or create Excel files from within a script if the text is in some way not English (or "Western" / ANSI). I nearly always have to use third-party routines or non-standard commands to deal with simple things like "write to file" or "save file" - or process the output manually using Notepad++ to convert from one type of language or Unicode encoding to the same encoding that's then recognizable by "standard" software like Word or Notepad (or even back in Excel). Is there a version of AWK that has been updated to the post-2000 world of multiple languages and devices, that works on Windows or Mac? I noticed there's GAWK for Windows, but it looks like it's a kind of DOS version of Linux in a window. It seems like the people who use these tools are still living in a pre-graphical, pre-internet, pre-phone world, like those people lost in the jungle who still believe that we're at war with Japan. Or do you know of an alternative to AWK that processes modern text files, no matter how many bits or bytes or language encoding they're in?
At 8:02 Gary should have mentioned (as it appears to me) that it goes through each {} once instead of completing each one individually, since the result was that all the “a” conditions were met and completed first. I wonder if it would have completed the “path” {} if the single quotes hadn’t been removed. I’ll have to try this.
Kay, maybe if - for god knows what reason - I'm writing a super complex bash script. Even then, probably not. I generally just sub in a proper scripting language for that.
I gave up with AWK when NAWK came out :-) . AWK was named after the people who developed it, Aho, Weinberger and Kernighan, taking the first letter of their surnames. When NAWK came out some people thought that Aho had died or left the team (as in No Aho, Weinberger and Kernighan); the reality was that the N stands for New. As an old school programmer I’m always pretty amazed that modern programmers, particularly Python aficionados, eschew simple tools like AWK for something larger, less efficient and more difficult to understand when attempting the same job.
Thanks for this comment. So you mean that instead of learning Python we should go for AWK? If yes, I will proceed with it. Also, please let me know whether AWK is good as a query language for a data set file that is being updated every second.
In some systems, nawk is aliased (symlinked) to awk so they are the same thing there. That way AWK has been incorporating features subsequently added in nawk and gawk. Basically the same thing.
@@satishkmys2 - No. Different animals with different "sweet spots." If you are curious as to why AWK even exists, you might be interested in what Brian Kernighan has to say about it. He co-wrote "The C Programming Language" (C itself was designed by Dennis Ritchie) and is one of the authors of AWK. He even uses the word "Python" in the same sentence: th-cam.com/video/W5kr7X7EG4o/w-d-xo.html
GUIs are good for repeated, error-prone tasks. If you find yourself doing a task over and over again in which the task never changes, then build a GUI for it. But that is not likely, since the task can always improve and change. If you can isolate something so well that it can get its own GUI, then go for it. Nowadays, that is not easy to do.
GUI is nice if you need to see some visualized information or for entertainment. Terminal is nice for fast programs that have a specific task and work together with other programs. People who refuse one of them (GUI or Terminal) limit themselves.
There's nothing stopping a GUI program from having all the functionality of a command line program, or even having a command line entry area inside of it. The problem is more that most or practically all GUI programs don't do this for some stupid reason!
I've been using AWK now for more than 20 years. Great. Did amazing jobs with that tool as a freelancer in big companies. AWK saved them a lot of money and brought out answers very quickly. And yes, Perl users don't need AWK.
@@lacs83 - Perl is the next step if you want to solve problems more complex than AWK can handle. For example, with Perl you can read data from (or write it to) a database. Or build an entire web application with graphics. Or filter a file containing Japanese data and convert it into something readable. Perl has a fairly gentle learning curve, but a long one (if you want to get into very sophisticated things). For simple things, you can work at the level of AWK. For example, a *complete* Perl program that prints "Hello World!" would look like this: perl -E 'say "Hello World!"' Not very hard, is it? It's even more compact than the following: awk 'BEGIN{print "Hello World"}'
@@CARPB147 I spent a little time with Python and Ruby... But I've seen good comments about Perl and that it's the Swiss Army knife. The truth is there are various options... Haskell, Clojure and Crystal are tempting me, but maybe I'll go with Perl since it comes built into Linux out of the box.
I use AWK a lot. And while I'm a C++ dev, I'd still recommend Python as a replacement for Excel sheets for quick calculations. AWK has severe limitations, which make it a bit harder to use for anything more complex than basic arithmetic (or string manipulation, but even that is a bit difficult sometimes).
*Meanwhile*: *Dying in remorse for all the time I've wasted on learning how to use batch files syntax for Windows* what makes it even sadder is that I've always wanted to make use of what I've learned from Java especially when it comes to file management, bash scripts look a lot similar to Java, didn't expect Linux os to be this awesome, I've got bored from all the propaganda for Linux os but, now I understand. I'm woken at last 😂 btw, you did a brilliant job on the rounding function, so satisfying.🤩
In my work I do a fair amount of text processing, often from disparate files. Most programmers would rely on Perl for this but Perl is a gargantuan language and has a steep learning curve. Relying on trusty unix command line utilities like awk, egrep, sed, sort, find and a particularly useful xargs, pipelines and a few other commands I have always been able to accomplish whatever task I was presented with in a matter of minutes. I have never learned Perl and don't see any need to for what I do.
I used to use GREP, AWK and SED in the 80's while porting a CAD program from one operating system to another. But nowadays I tend to use Perl, and many times with Excel. You can do many things with Excel, but complex data manipulation tasks are much easier with Perl. One of the best concepts in data manipulation with Perl and AWK is associative arrays.
Thanks I finally found out how to run an awk script from a file. Also if you start your file with #! /usr/bin/awk -f and set the file to executable you can run the script with just ./script.awk
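A small sketch of that setup. The script body is a made-up one-rule example, and the shebang path `/usr/bin/awk` is an assumption about where awk is installed; the last line uses `awk -f`, which works wherever awk lives.

```shell
# Write a tiny awk script to a temporary file and mark it executable.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/awk -f
# Print the first field of every line.
{ print $1 }
EOF
chmod +x "$script"

# With awk at /usr/bin/awk the file could be run directly as "$script";
# awk -f is the equivalent that does not depend on the shebang path.
printf 'one two\nthree four\n' | awk -f "$script"
```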
AWK isn’t a number manipulation tool. It’s a text processing tool that can do math. And so much more. Explore GAWK, the GNU version. And use the tools you know, as best as you can to get the job done. And don’t stop learning.
The apostrophes surrounding the awk program in the command line aren't there specifically to mark it as the program. They are there because the shell processing the command doesn't change anything in a string inside apostrophes before passing the string to awk. In {print $1}, $1 is a variable according to sh, so without the apostrophes the shell (I assume it's compatible with sh) would substitute it with something else. Also the program contains a space; therefore, if it wasn't surrounded by quotes or apostrophes, the shell would split it into two strings.
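The difference can be seen with a short experiment. The value `POSITIONAL` is a made-up stand-in for whatever the shell's own `$1` happens to hold.

```shell
set -- "POSITIONAL"   # give the shell its own $1 so the difference is visible

# Single quotes: the shell passes $1 through untouched, so awk sees its field.
printf 'alpha beta\n' | awk '{print $1}'

# Double quotes: the shell expands $1 first, so awk receives {print POSITIONAL},
# which names an unset awk variable, and an empty line is printed.
printf 'alpha beta\n' | awk "{print $1}"
```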
OK that's interesting. Thanks for the info, now I understand why my awk commands always half break with double quotes. I think I won't forget that now.
Idk, the idea of learning awk has been rattling around in the back of my head for a while, I just don't feel like it's worth the overhead when I could do all this just as easily in Python.
That's cool and pandas is great, but it doesn't beat efficient command line scripting. That's one of the areas I think perl is actually preferable to python
@@MathieuDuponchelle I don't mean an interval of characters but an interval of consecutive records. REs are nice, but range patterns are nicer: with no ifs or mandatory indents, a pair of expressions matches a range of records, starting at the record that matches the first RE (or any expression) and ending at the record that matches the second. Python can do anything. (G)awk can't, but what it can do, it does with beautifully short but still understandable code.
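A range pattern of the kind described here looks roughly like this. The marker words `BEGIN_BLOCK` and `END_BLOCK` and the function name `extract_block` are made up for the sketch.

```shell
# Print every record from a line matching BEGIN_BLOCK
# through the next line matching END_BLOCK, inclusive.
extract_block() {
  awk '/BEGIN_BLOCK/,/END_BLOCK/'
}

printf 'x\nBEGIN_BLOCK\ny\nEND_BLOCK\nz\n' | extract_block
```

With no action given, awk falls back to printing the matching records, which is why the whole program is just the two patterns.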
Man I've watched so many awk videos and this has been the only one that has said "this is why you do it" rather than just doing the thing.
I don't know how people expect you to learn when they don't explain what you're doing.
I used awk to do data analysis of a large database that output data as a .csv. The company hired an "analyst" (really, he just ran DB scripts) to put the data into Excel, format it and run a few math functions on it. It took 3 hours to complete their way. The awk way took less than 15 minutes.
A success story in the use of AWK
Excel is good at many things, but doing what that analyst did was shstoopid... "A hammer thinks that every problem is a nail."
Was it accurate tho?
00:00 Intro
01:15 more show data in file
01:29 wc -l show number of lines in data
01:48 awk '{print}' prints file
02:24 awk '{print $0}' $0 prints every line, $1 prints first field etc.
03:10 awk '/gcc/ {print $1}' match gcc print file name
03:41 awk '/^w/ {print $1}' Lines start with a double-u
04:23 awk '/path/ {print $1,$2}' Lines containing path, print multiple fields
04:41 awk '/path/ {print $1,$2/1024}' can divide output
awk is one of my favourite tools, not because I'm any good at using it, mind, but just because of its history. It's from 1977! The basic apps of our unix/linux ecosystem have a rich history.
Used AWK in 70's and 80's. Had an accounting system written in AWK. Also, had an AWK to C compiler, for the real hardcore number crunchers.
Now that's a new level of awesomeness
Very nice!
i used AWK in 1941 in ww2 to decrypt enigma. was fun time
"forget spreadsheets and excel"
crowd: ooh?
"use command line!!"
crowd : oh...
yulp! LOL Python (heavy duty jobs) > Excel > AWK.
AWK is a legacy tool. there's a reason people don't use it. hahahah
@@1MinuteFlipDoc Except for those people who do use it. There's nothing legacy about it, it's just different from excel. And while that obviously doesn't include you (and that's fine), there are lots of people who prefer to write a short script on the command line over clicking through excel dialogs for 2 hours. For conditioning data to use it in data science and number crunching, awk gives you an amazing amount of productivity much quicker than excel does. If you still prefer excel that's fine though, your choice.
nickie banchou thanks, you saved me time.
1MinuteFlipDoc awk is very powerful, use it pretty much every day. Can’t use excel in pipe chains...
@@1MinuteFlipDoc no, people do use it all the time! Plus, it fits perfectly with the Unix philosophy of piping data from one small terminal program to the next. It's been around since the 1970s, but that certainly doesn't mean it's legacy - it's still an amazingly powerful tool that people still choose to use.
Awk and grep were the heart of many scripts I've written over the years.
Back when I was contracting for EDS one of the sys admins handed me a book on AWK and asked me if I could figure out a way to extract billing info from the processing logs. I was hooked. :-) One of the handy features was using strings as array subscripts, and having "sparse" arrays, where only the accessed elements existed. Eventually, I had most of my production automated with AWK scripts.
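A minimal sketch of the string-subscript arrays mentioned above. The input format (a customer name followed by an amount) and the function name `bill_totals` are made up for illustration and are not the EDS logs.

```shell
# Sum an amount per customer, using the customer name as the array subscript.
# Assumed input (hypothetical): "<customer> <amount>" per line.
bill_totals() {
  awk '
    { total[$1] += $2 }      # the element is created on first access
    END {
      for (name in total)    # only subscripts that were used exist
        printf "%s %d\n", name, total[name]
    }
  ' | sort                   # for-in order is unspecified, so sort the output
}

printf 'acme 10\nzenith 5\nacme 7\n' | bill_totals
```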
I've tried reading more than one awk intro and I've never made it very far. You've successfully taught me way more than all of them, thank you!
Glad I could help!
If you like using Awk to do stuff like this, there are a few other cool Unix tools you should have a look at. Split, join, uniq will allow you to do DB-like things on text files. Another is tr. Of course sed, cut, grep as well. I took a grad course in which we had to create software development tools only using Unix tools. That class was both illuminating and fascinating. Learned a lot that I still use to this day.
A lot of us learned to process tables of data using all these tools before spreadsheets were invented. Welcome to the club.
Oh yes. Think of AWK as SQL for text files. You can SELECT rows, and columns (words separated by whatever delimiter applies), and even declare "WHERE" (conditions) using regular expressions and/or boolean operators. Your input could be log files, emails, or whatever you have on text (like text stuff on chats). It could be source code of any programming language if you are interested in gathering quality metrics on them. Your imagination is the limit.
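A rough sketch of the SQL analogy, assuming a two-column input of name and size (the column layout and the function name select_where are hypothetical):

```shell
# Roughly: SELECT name, size FROM input WHERE name LIKE 'w%' AND size > 1000
select_where() {
    awk '$1 ~ /^w/ && $2 > 1000 { print $1, $2 }'
}

printf 'wc 2048\nwho 512\nls 4096\n' | select_where
```

The pattern before the braces plays the role of WHERE, and the fields named in print play the role of the SELECT list.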
CAN YOU share ur knowledge?
Yes, just like you said, "db-like". This is reinventing SQL with a bunch of half-assed incoherent unorthogonal tools barely cobbled together. It's simpler to just use a real DB like Sqlite.
As previously asked can you share your program you made? Or anything like it?
Nice video Gary. In case you (or anyone else watching) didn't know - you can simplify your "rounding up the file sizes" example using printf() and a format specifier.
For example to print the file sizes to one decimal place you could use:
{printf("%s %.1fKb
",$1, $2/1024)}
Or:
{printf("%s %.0fKb
",$1, $2/1024)}
To recreate your exmaple & round to the nearest integer…
It's 2:00AM and I'm watching Gary Explain awk... and it was amazing!
Thank you, Mr. Simms!
I have used it to generate a useful database of user information from emails after concatenating the emails for processing. It was not hard to learn and ended up being a very useful multi-purpose tool in addition to its primary mission success. Thanks for an excellent video!
This is the best introduction to awk I have encountered.
Awk and sed one of the most useful and powerful text manipulation and formatting tools I ever learned to use.
For small files there are more user-friendly tools, but awk really shines when you have some huge text file that you need to massage because it is just about as fast as you can get. Really important to know about in those cases, because you could easily be led down an unnecessarily slow or expensive "big data" path.
AWK has been used to create a full parser/tokenizer, and for other purposes that are arguably far more complex than its intended use. Using AWK you can also take advantage of pattern matching with regular expressions, and AWK has many other tools for text manipulation. But I think one of the most powerful aspects of AWK is using it as a complementary Unix tool. Use it together with other Unix command-line utilities in pipelines; not everything has to be done in AWK. For example, you can use AWK to parse out formatted words from a complex text file, and then pipe this data to be processed by another utility.
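A small sketch of that division of labour, assuming input lines of the form "user command" (a made-up format; top_commands is a hypothetical name). awk only extracts and reorders fields, while sort and uniq do the counting:

```shell
# Hypothetical input: one "<user> <command>" pair per line.
top_commands() {
    awk '{ print $2 }' |          # keep only the command column
        sort | uniq -c |          # count identical commands
        sort -rn |                # most frequent first
        awk '{ print $2, $1 }'    # reorder to "command count"
}

printf 'ann ls\nbob awk\nann awk\ncat awk\nbob ls\ndan sed\n' | top_commands
```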
I survived my PhD thanks to awk and sed!! Command line rules!
This is the best awk tutorial I've seen so far.. please make a video for SED
AWK? I thought I was the only dinosaur in this world which still uses AWK. Glad to know that I'm not the only one.
That guy (or gal) that they don't let out much whips out awk like an old trucker whipping out a snatch block
Use it every day.
It's been a while since I've used it. First time I ever had to use it was in the early 2000s at a call centre. Their call detail records were gigabyte size and excel was struggling with it. AWK just crunched through the numbers and spit out the results in less than 5 minutes. Think I used SED initially, but AWK was the answer.
Now if you two will mate, the dinosaurs will not go extinct.
I can have a log file parsed with awk faster than that young'n with a spreadsheet. And reformatting an address file that has quotes drives excel nuts. In awk you just manipulate the FS variable. I might be a dinosaur but I would point out that the dinosaur ruled the earth for 165m years and birds are still here.
I often use both Excel spreadsheets and awk in my work as a software engineer doing medical imaging research. Many times I will use these for organizing files / image selection where there are tens of thousands of files in a directory hierarchy and I need to create several hundred cases each containing a subset of the images. Sometimes cases have multiple readings where each reading adds a different imaging modality.
Gary, I love these introductions to Linux/unix commands/software.
Gary, awesome job giving me the basic understanding of awk. All my little failed projects have been revived since Your walk thru of the AWK!
WHOA! As the class went on, My eyes only widened. Thank you Gary! Much love
awk is useful as a very short/quick way to get at the Nth field in a text file, especially as it treats consecutive delimiters as one, for example in fixed-width input files.
But where awk really shines is multi-levelled, line-delimited files, like old-style config files etc., as you don't need to write loops and keep flags about which section of the input file you're in.
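To make the config-file case concrete, here is a sketch that assumes an INI-style layout with [section] headers and key=value lines (the layout and the name flatten_ini are illustrative assumptions). The per-line loop is implicit, so the only state is one variable holding the current section:

```shell
# Hypothetical INI-style input; prints "section.key=value" for every setting.
flatten_ini() {
    awk '/^\[.*\]$/ { section = substr($0, 2, length($0) - 2); next }   # remember the current section
         /=/        { print section "." $0 }'
}

printf '[db]\nhost=localhost\n[web]\nport=8080\n' | flatten_ini
```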
Awk is really amaz...
Syntax error: Missing ending '}'
% in vim will skip between matching parentheses
Java too
Missing bracket here
(insert bracket)
Extra bracket here
(smashes keyboard)
That'll do it.
I used to work for a Telecom's company and at one point was involved in integrating our CMIP stack (millions of lines of C/C++ code) with network management solutions from vendors like IBM and HP etc. I remember the DEC solution to get their OSI network event logs into our stack was based around 5000 lines of AWK code. AWK is awesome.
Used quite a bit of AWK in my 3rd year physics project. I had hundreds of experimental data files to process and it was a good choice.
OMG... after YEARS of programming I happen to learn such a trick to implement a round() function! ASTONISHING! Thanks! I used to use awk to only get a specific column, but that is such a nice tutorial on awk.
Same here!
However, there is still an edge case (not so uncommon) that this implementation doesn't take into account: the x.5 case. Under round-half-to-even (banker's rounding), you round up when x is odd and down when x is even. This implementation, however, always rounds up.
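For reference, a sketch of the add-0.5-and-truncate trick being discussed, written as an awk function. The name round_demo is made up, and this version is only meant for non-negative values:

```shell
round_demo() {
    awk 'function round(x) { return int(x + 0.5) }   # return is only legal inside a function body
         { print $1, round($1) }'
}

printf '2.1\n2.5\n3.5\n2.9\n' | round_demo
```

2.1 becomes 2 because int(2.6) truncates to 2, while 2.5 and 3.5 both go up (to 3 and 4), which is the always-round-half-up behaviour. If you want half-to-even instead, printf with %.0f hands the decision to the C library, which on many systems rounds halves to even, but that is platform-dependent so check it on yours.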
Thanks for the info. Spreadsheets are more practical for me, and more practical for newbie experimentation, but AWK really bridges the gap between sheet logic and optimizing work. Plus, you get support from a lot more scripting capabilities. Great stuff!
Awesome video man!. This was my first contact with awk command. Very very good explanation!
This is the only skill in my job and my previous job where people turn to no one other than me..:*( Getting sentimental here...
what do you do?
@@paschikshehu7988 Systems engineer, but it's programmers who come to me for this, usually because they need simple parsing or data manipulation which doesn't take a lot of effort. Then, their programs run my AWK script. These programmers know Sed (which is even simpler than AWK), but their case is usually where Sed is too simplistic and using their programming language would be overkill.
kaluq system engineer is so cool. I just found a good PID value for my motor speed control using Matlab earlier today.
omg, if you are working as a software engineer and they can't do it... change jobs. I doubt you are technically growing in this company hahahah
@@Frankx520 he said systems engineer, not Control Engineer.
Regex, SED, & AWK are awesome tools!
Thanks Gary, that was simple and useful. I usually write small Python scripts for such data processing because I'm more fluent in it, but if it were CSVs or tab-delimited outputs (like from other shell programs) awk is just way simpler. Always wished I got some proper simple intro to it, now it's done :-) - perhaps you could make a follow-up episode or two? Thanks again!
I love it! The best awk demo on YouTube. More awk vids in the future???
I used awk in the early 1990’s as a developer. Not sure it is something that a non developer should really use. Good demo, I wrote complex code with Awk to parse flat files.
That is what AWK is perfect for. Great for processing log files or other UNIX/Linux sysadmin stuff.
Not sure a non-developer should be using?! What's wrong with learning to use a tool? Should a non-sysadmin never open a command prompt because they're not an expert?! How do people become developers then? Any tool like this you can learn to use is a huge plus! Let's encourage experimentation and learning!
@@SteveJones172pilot - Totally agree with your sentiment on this. People will either be interested (or not) in doing this stuff. That will naturally weed out people that program (programmers/developers, etc.) from people that don't. I don't see any reason to have arbitrary mandates against writing AWK commands (or scripts) for people only because they are not *professional* developers. Anyhow, the 40+ year secret is out! 😏😆
Oddly, my very first AWK script was my most complex! Only a page and a half, but it replaced a 40-page SQR program that tried to parse CSV files (ugh) written by a hammer who thought every problem looked like a nail.
After that I would use it in smaller piped sequences with several AWK commands like:
awk '{if ( FNR == 1 ){print "FILENAME=",FILENAME}; if (NF && $0 !~ /^--/){print}}' Database/*.sql \
| awk '/^FILENAME=/{files++}; $0!~/FILENAME=/ && NF {loc++}; END{avgloc=loc/files; printf "%d Database LoC in %d files at avg lines per file = %0.f\n", loc, files, avgloc}' > $countfile
(sorry about the look of the run-in line).
For more complex problems, like ETL cases, etc., I just used Perl which was a natural progression from using Shell + AWK.
My ex-colleagues used to hate me writing awk scripts! Brilliant little language. One happy use was to take the output from informix commands to detail table descriptions and create an output shell script to recreate the database for disaster recovery purposes.
Perfect application for AWK. Nice.
I love awk for text formatting and, arguably, informal reporting. Also admin scripts. Honestly, though, you can do all this and more with PERL, which I recommend.
Isn't perl a prolang ?
Prolang = PROgramming LANGuage
Yes! (upvoted you for being spot on)
This is the first I have heard of AWK. I am a number-crunching recreational sports handicapper, so AWK might be useful to me.
AWK, Sed, Bash, TCL, GREP, Perl and Nvim are my command line friends. 😍
Actually, grep was taken out of the line editor ed. The command in ed is
g/RE/p ( globally search for a regular expression and print the line). Hence "grep RE filename"
nawk has more capabilities.
BTW "awk" stands for the initials of Aho, Weinberger, and Kernighan, the developers who created awk.
@@josephdoyle5304 you're a prince amongst men. 😊
Philistine emacs!
@@thaddeusolczyk5909 Emacs is nice, but I don't see a reason to use it, so I stick with nvim.
@@thaddeusolczyk5909 the command line version of Emacs is terrible.
I've been just doing this from any programming language I was learning when I get to the read and write files section of the documentation. Nice to see it can be done directly on the command line.
He has the smoothest advertising transitions I've come across. Great job! Great content, too!
Just want you to know you saved my ass with this video. Procrastinated on an assignment for my CS class and this really helped me understand some stuff I'd missed and get the assignment done in time. Thanks a ton!
I won't go into the specifics but AWK holds a special place in my heart. I know that might sound a bit weird but it's true. Even though I've only ever had to use it in anger twice it was well worth learning just for them.
Thanks, Im glad I clicked in. I never would have searched out this otherwise. I do use SED, GREP, and GVIM. The next time I have the opportunity, I'll have to try to apply these lessons.
awk one liners are great for ad hoc queries and I use it for that, but as soon as you go to scripting surely perl is the way to go?
Or if you don't already know perl, then maybe Python which is more friendly for beginners?
Awk is great for awk-shaped problems (basically, report generation on files of simply-formatted ASCII data). If you have a different-shaped problem, don't use awk.
Best introduction I've ever seen! I've always been kind of reluctant to learn, but knowing inside that I should do it...
Thanks for the video!
Love me some AWK and have made plenty of use of it over the years.
A long time ago when someone was telling me how wonderful Excel was, I simply said "ed, perl, tbl, troff" as in edit your data using ed (actually, I never use ed), process it with Perl (I don't know awk), and finally format it with troff using the tbl preprocessor.
Great video! I have always used grep to search strings in linux and never bothered to figure out what awk did.. This was a great introduction - Just what I need so that next time I have a use case I will remember this and figure out how to do it in awk!
I learned a lot of AWK about 20 years ago - very useful
I use awk/sed on a daily basis at work. I use AWK primarily to analyze Excel (exported to csv) or other data files for audits. That is on Windows! In both MINGW64 and WSL2 Linux.
I use awk at my job and I am always in awe of it. This video is a great little intro and the rounding logic was pretty neat too! Thanks Gary!
gawk
surely adding 0.5 to say 2.1 doesn't give the correct rounding integer as it doesn't round up ?
This has to be the best clarification i've ever seen . Thanks a lot !
Please can you make some more AWK videos Gary?
I'm learning AWK at the moment and have spent a few days on it. It's hard to learn, but knowing how to use it is worth the effort. This is a great video to get people into using it and seeing the power of it.
1) Learn a middling amount of 'C', K&R please, none of that C++/# crud.
2) Have a good understanding of regular expressions.
3) Realize that each line is processed in the order received by the program statements after BEGIN and before END. Process order can be important.
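Point 3 can be seen in a tiny sketch (the patterns /a/ and /b/ and the name order_demo are arbitrary choices for the demo). Every input line is offered to each rule in the order the rules are written, with BEGIN before the first line and END after the last:

```shell
order_demo() {
    awk 'BEGIN { print "start" }          # runs once, before any input is read
         /a/   { print "rule1:", $0 }     # every rule is tried, in order, on each line
         /b/   { print "rule2:", $0 }
         END   { print "lines:", NR }'    # runs once, after the last line
}

printf 'ab\nb\n' | order_demo
```

The line "ab" matches both rules, so it is printed by rule1 and then by rule2 before awk moves on to the next line.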
@@johnmcginnis5201
Why do we need to put the {} around print?
What does it mean?
Thanks, one of the better awk videos on youtube. I use awk scripts on files containing quite chaotic data that lacks the neat structure of csv and similar files. I feel the many comments here suggesting superiority of python, or even perl (which to some extent I agree with) for parsing file data might change once enlightened. To each his own, but my view is don't knock it til you try it. Being efficient with one scripting language does not preclude the possibility that you could be more proficient with another once mastered, especially one purpose-built to extract, manipulate, and reformat data.
I've used awk to extract useful information from pdf documents. The problem was that the information was awkwardly (pun intended) split into several tables throughout the document so I had to first process each table to collect up all the pieces for each element I wanted to output. The solution I came up with was pdftotext + awk to do the processing. The few hours I spent on that awk script have paid off nicely since I've had to reprocess new versions of the same document several times over the years.
The alternatives would have been:
a) Manually copy paste all the information. One thing I've learned over the years is you *never* trust anything copy pasted by a human (least of all myself)! Also would have been extremely tedious (which adds to the chance to making a mistake), and I would've had to repeat it whenever a new version of the document came out.
b) Find some pdf library for my favorite programming language to extract more structured data from the document. Couldn't quickly find anything that worked and I didn't want to start debugging pdf libraries.
Hi I am interested in pdf extraction. Can you kind of give some clue codes to me to explore further.
Another solution is to use a command line pdf editing tool called, pdfTK. You can read out pdf files from command line and even fill in pdf forms with it.
IIRC, Perl has some modules (think libraries) for handling PDF files, and Excel files too.
Cool.
However much you are being compensated for producing these videos, it is not enough. Thank you !!!
awk -F "/" '{print ...}' specifies a field separator other than the default whitespace.
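A short sketch of -F next to the default splitting. The sample path and the function names path_leaf and second_word are made up for the demo:

```shell
# With -F "/" each path component becomes a field; $NF is the last one.
path_leaf() {
    awk -F '/' '{ print $NF }'
}

# With no -F, any run of spaces or tabs counts as a single separator.
second_word() {
    awk '{ print $2 }'
}

printf '/usr/bin/awk\n' | path_leaf
printf 'a    b\t\tc\n' | second_word
```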
OMG @Gary, im from another country, i speak another language, but i understand and i learning a lot things with your videos.
Thank so much for the explain, congratulations for ur skills.
Great to hear!
Your way of explaining topics is very easy... Please make more videos on linux
Thanks. Brought back some great memories of data manipulation of huge point cloud datasets on SGIs. We had to do very similar things before piping data into the OpenGL 3D engine for visualisation purposes. Awk is very flexible and fast and still has many use cases in today's system administration tasks.
I remember using Awk for extracting a column from a command result (using something like {print $1}), but I didn’t know that it could do much more than that.
GNU awk was the first scripting language I learned really well, and I wrote most of my early Bourne shell scripts as basically wrappers around huge chunks of awk code. Then I graduated to Perl, which is absolutely unmatched if you love regexes (I do!), and nowadays I write everything in Python if it's too much for a simple bash script. 😊
I still use awk and Perl daily for oneliners when I do data wrangling. The awk syntax is super comfortable for the things that it is good at. 👍🏻
Just thought I'd explain, that 'up arrow' is a caret or circumflex.
That's why ppl call it up arrow
@@peppigue you wouldn't call v a down arrow even though it's used that way sometimes
< > can be less/greater than symbols or angle brackets or left/right arrows, but in a programming context you'd probably use the former
...except when it's a "shift left" operation in which case it'd make sense to call them arrows
hm
maybe left v and right v??
Sometimes people call it "hat" referring to the hat operator in mathematics.
Weird 'flex but okay.
@@gorgolyt very circumspect
Very glad this video popped up on my feed! I've been working with data using sed, but after watching this I think awk is much better suited for me, especially knowing I can write my own functions that run faster than Bash can! Great video, thanks for the explanation!
Good old integer arithmetic, takes me back to when I was a lad..
Awk would have been very useful in a former life. Thank you very interesting.
Need to install and use gawk instead of awk, though, as can then use a match function to match with regular expressions and then reference capture groups in the awk print command - this match is way better than just printing things that got separated into fields
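A sketch of the gawk feature being described. It assumes gawk is installed, since the third (array) argument to match() is a GNU extension and will not work in other awks. The "version N.M" text and the name extract_version are invented for the demo:

```shell
# Requires gawk: match() with an array argument is a GNU extension.
extract_version() {
    gawk 'match($0, /version ([0-9]+)\.([0-9]+)/, m) { print m[1], m[2] }'   # m[1], m[2] hold the capture groups
}

# Usage (prints the major and minor numbers):
#   printf 'tool version 5.3 ready\n' | extract_version
```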
Databases! I was using SQL and databases / Dbase for a long time. Can't stand using spreadsheets as databases.
Spreadsheet is just for office work, can't do anything for big data (It blows up at the moment you open the file). Database is the real deal.
the right tool for the right job!
Nobody ever: I hate having to commute from LA to NY on a skateboard!
AMEN
Hi, I’m not a db admin but my feeling is spreadsheets are easier to use and they’re right in front of you. Databases need some kind of ui or they use the cli (inserts, selects). Please confirm/correct.
@@kencheng2929 hi Ken, you have a valid point. If you have a small set of data points to keep track of: then spreadsheets make sense. When you start to get into the 1000's+ then it's time to start looking into a database solution.
Spreadsheets should be more for temporary data that has no long-term value. Like forecasting or basic customer metrics. =)
Currently assessing how to extract useful data from multiple differently formatted fuel receipts here. Found your lovely little primer video very helpful - thanks!
Glad it was helpful!
Easy to learn, too. I love awk! Thank you Gary!
Finally a concise awk tutorial! Thanks!
perl >> (sed, awk ). You can do all of sed and awk in perl (and there are even conversion scripts for it, called a2p and s2p) but not the other way.
Sure. And an 18-wheeler can carry more than a pickup. But a lot of people find a pickup works just fine for day-to-day tasks.
@@jrd33 - Certainly. There are tools more suited for certain jobs than others. I think it is good to have variety of choice.
Excellent! You explained awk very well!
*GARY!!!*
*Good Morning Professor!*
*Good Morning Fellow Classmates!*
MARK!!!
Mark, sit back down and turn to page 33 in the 2020 edition of GE
'Gentle introduction to awk' .... gentle if you're like Gary... Thanks for this
I have been using AWK for 20 years, it rocks!
Just a question from a newbie:
what can I do with those informations?
I come to your channel, via ColdFusion and the graphene battery!
Thanks in advance
anything you want really ;)
You can manipulate information from files and extract what you want in the way you need. It is just pure formated text being manipulated. No spreadsheets needed. Cheers!
you can redirect the formatted output of the awk script to another file, for example: I wanted to create a test file like the one used in the video but didn't know how to do it using only ls, so I used "ls -la /usr/bin > ls-output.txt", then used awk to select only the fields in the order I wanted with "awk '{print $9,$5}' ls-output.txt > ls-awk-output.txt". It's very handy to manipulate formatted text files like csv, config files, logs, program outputs, whatever you can imagine...
See 14:47 for example
One example: Extract data from a not very user-friendly system, in a tab-delimited format. Convert it into SQL commands (using loads of "printf"). Run the generated SQL code to load the data in a database. AWK can be the glue between otherwise incompatible systems.
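A minimal sketch of that glue role, assuming a tab-delimited export with an id and a name, and a hypothetical target table users(id, name). The table, columns, and the name tsv_to_sql are assumptions for illustration only:

```shell
# Hypothetical table "users(id, name)"; input is tab-delimited "id<TAB>name".
tsv_to_sql() {
    awk -F '\t' '{
        gsub(/\047/, "\047\047", $2)     # double any single quote (octal 047) so the SQL string stays valid
        printf "INSERT INTO users (id, name) VALUES (%d, \047%s\047);\n", $1, $2
    }'
}

printf '1\tAnn\n2\tO\047Neil\n' | tsv_to_sql
```

The octal escape \047 is just a way to write a single quote inside a shell-quoted awk program; real data would need more careful escaping than this.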
Hi Gary. Something I never considered before, thanks for the video.
However, what does one do about non-computer text (i.e. words and documents)? And of course filenames in non-English characters like Thai or Chinese or Hebrew? It's one of the issues I still have with Excel. You'd expect it to be standard and old-hat (at least since the early 2000's because of things like Unicode, especially since multilingual computing and "plug-and-play" was more or less established by 1990) - but I still find I can't always save or create Excel files from within a script if the text is in some way not English (or "Western" / ANSI).
I nearly always have to use third-party routines or non-standard commands to deal with simple things like "write to file" or "save file" - or process the output manually using Notepad++ to convert from one type of language or Unicode encoding to the same encoding that's then recognizable by "standard" software like Word or Notepad (or even back in Excel).
Is there a version of AWK that has been updated to the post 2000 world of multiple languages and devices, that works on Windows or Mac? I noticed there's GAWK for Windows, but it looks like it's a kind of DOS version of Linux in a Window. It seems like the people who use these tools are still living in a pre-graphical, pre-internet, pre-phone world like those people lost in the jungle and still believe that we're still at war with Japan.
Or do you know of an alternative to AWK that processes modern text files, no matter how many bits or bytes or language encoding they're in?
My first time hearing and knowing this language!
Oh, oh. The secret is coming out!
At 8:02 Gary should have mentioned (as it appears to me) that it goes through each {} once instead of completing each one individually, since the result was that all the “a” conditions were met and completed first. I wonder if it would have completed the “path” {} if the single quotes hadn’t been removed. I’ll have to try this.
Kay, maybe if - for god knows what reason - I'm writing a super complex bash script.
Even then, probably not. I generally just sub in a proper scripting language for that.
I gave up with AWK when NAWK came out :-) . AWK was named after the people who developed it, Aho, Weinberger and Kernighan, taking the first letter of their surnames. When NAWK came out some people thought that Aho had died or left the team (as in No Aho, Weinberger and Kernighan), the reality was that the N stands for New.
As an old school programmer I’m always pretty amazed that modern programmers, particularly Python aficionados, eschew simple tools like AWK for something larger, less efficient and more difficult to understand when attempting the same job.
Thanks for this comment. So you mean that instead of learning Python we should go for Awk? If yes, I will proceed with it. Also, please let me know whether Awk is good as a query language for a data file that is continuously updated every second.
In some systems, nawk is aliased (symlinked) to awk so they are the same thing there. That way AWK has been incorporating features subsequently added in nawk and gawk. Basically the same thing.
@@satishkmys2 - No. Different animals with different "sweet spots."
If you are curious as to why AWK even exists, you might be interested in what Brian Kernighan has to say about it. He co-wrote the classic book on C (K&R) and is one of the authors of AWK. He even uses the word "Python" in the same sentence: th-cam.com/video/W5kr7X7EG4o/w-d-xo.html
In the early 90s: hey everyone, learn guis!!
Today: hey let's go back to the command line!!!
GUIs are good for repeated, error-prone tasks. If you find yourself doing a task over and over again in which the task never changes, then build a GUI for it. But that is probably not likely, since the task can always improve and change. If you can isolate something so well that it can get its own GUI, then go for it. Nowadays, that is not easy to do.
@@Hassan8Ola this seems like the criteria for scripted automation....
GUI is nice if you need to see some visualized information or for entertainment. Terminal is nice for fast programs that have a specific task and work together with other programs.
People who refuse one of them (GUI or Terminal) limit themselves.
I think the general idea in the 90s was that for GUIs you didn't have to learn anything.
There's nothing stopping a program that is GUI that has all the functionality of a command line program, or even having a command line entry area inside of it. The problem is more with the fact that most or practically all GUI programs don't do this for some stupid reason!
The rounding function at 11:40 does not work for me. I get "awk: path15k.awk: line 5: return outside function body".
You probably have misplaced braces or parentheses. Works fine as shown.
Reminds me of my days on the Sparc 2!! :-D Those were the days. *sigh*
I`m using AWK now for more than 20 years. Great. Did amazing jobs with that tool as a freelancer in big companies. AWK saved them a lot of money and brought out answers very quickly.
And yes, PERL users don`t need AWK.
You clearly have a lot of experience. Maybe I'll learn Perl some day; for now I'm focusing only on AWK.
@@lacs83 - Perl is the next step if you want to solve problems more complex than AWK can handle. For example, with Perl you can read data from (or write data to) a database. Or build a whole web application with graphics. Or filter a file that arrives with Japanese data and convert it into something readable. Perl has a fairly gentle learning curve, but a long one (if you want to get into very sophisticated things). For simple things, you can work at the same level as AWK.
For example, a *complete* Perl program that prints "HELLO WORLD" would look like this:
perl -E 'say "Hello World!"'
Not very hard, is it? Even more compact than the following!
awk 'BEGIN{print "Hello World"}'
@@CARPB147 yes, I already have my eye on it.
@@lacs83 - it's a semi-religious experience, how amazing it can be.
@@CARPB147 I spent a little time with Python and Ruby... But I've seen good comments about Perl and that it's the Swiss Army knife. The truth is there are plenty of options... Haskell, Clojure and Crystal are tempting me, but maybe I'll go with Perl since it comes built into Linux out of the box.
I use AWK a lot. And while I'm a C++ dev, I'd still recommend Python as a replacement for excel sheets for quick calculations.
AWK has severe limitations, which make it a bit harder to use for anything more complex than basic arithmetic (or string manipulations, but even that is a bit difficult sometimes.)
AWK has been used to create a full parser/tokenizer, and for other purposes that are arguably far more complex than its intended use. Using AWK you can also take advantage of pattern matching with regular expressions, and AWK has many other tools for text manipulation. But I think one of the most powerful aspects of AWK is using it as a complementary Unix tool. Use it together with other Unix command-line utilities in pipelines; not everything has to be done in AWK. For example, you can use AWK to parse out formatted words from a complex text file, and then pipe this data to be processed by another utility.
Thanks Garry! Really interesting video.
*Meanwhile*: *Dying in remorse for all the time I've wasted on learning how to use batch files syntax for Windows*
what makes it even sadder is that I've always wanted to make use of what I've learned from Java especially when it comes to file management, bash scripts look a lot similar to Java, didn't expect Linux os to be this awesome, I've got bored from all the propaganda for Linux os but, now I understand. I'm woken at last 😂
btw, you did a brilliant job on the rounding function, so satisfying.🤩
Sullying bash by comparisons to Java is heresy. Wash out your mouth with a bar of soap.
In my work I do a fair amount of text processing, often from disparate files. Most programmers would rely on Perl for this but Perl is a gargantuan language and has a steep learning curve. Relying on trusty unix command line utilities like awk, egrep, sed, sort, find and a particularly useful xargs, pipelines and a few other commands I have always been able to accomplish whatever task I was presented with in a matter of minutes. I have never learned Perl and don't see any need to for what I do.
subscribed. dammit, I'm not a coder (much of) but you made that understandable even while talking blisteringly fast! kudos.
You can slow playback of YouTube videos by clicking on the cogwheel icon and selecting your playback speed.
high quality video. perfect audio. the pace was perfect. the explanation and examples were perfect.
I used to use GREP, AWK and SED in the 80's while porting a CAD program from one operating system to another. But nowadays I tend to use PERL, and many times together with excel. You can do many things with excel, but complex data manipulation tasks are much easier with perl. One of the best concepts in data manipulation with PERL and AWK is associative arrays.
Thanks I finally found out how to run an awk script from a file. Also if you start your file with
#! /usr/bin/awk -f
and set the file to executable you can run the script with just
./script.awk
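Put together, the steps look roughly like this. The file name sizes.awk and its one-line program are made up for the demo, and the shebang path assumes awk lives at /usr/bin/awk, which may differ on your system:

```shell
# Write a hypothetical script file; the quoted 'EOF' keeps the shell from touching $1 and $2.
cat > sizes.awk <<'EOF'
#!/usr/bin/awk -f
# Print each name with its size in Kb, to one decimal place.
{ printf "%s %.1fKb\n", $1, $2 / 1024 }
EOF
chmod +x sizes.awk

# Equivalent to ./sizes.awk when the shebang path is right for your system.
printf 'gcc 2048\n' | awk -f sizes.awk
```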
AWK isn’t a number manipulation tool. It’s a text processing tool that can do math. And so much more. Explore GAWK, the Gnu version.
And use the tools you know, as best as you can to get the job done. And don’t stop learning.
The apostrophes surrounding the awk program in the command line aren't explicitly to specify it's the program. They are there because the shell processing the command doesn't change anything in a string inside apostrophes before passing the string to awk. In {print $1}, $1 is a variable according to sh, so the shell (I assume it's compatible with sh) substitutes it for something else. Also the program contains a space, therefore, if it wasn't surrounded by quotes or apostrophes, the shell would split it into two strings.
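A quick way to see the difference for yourself, as a sketch. The function name quote_demo and the argument shell-arg are arbitrary; inside the function, the shell's own $1 is whatever argument the function was called with:

```shell
quote_demo() {
    echo 'a b' | awk '{ print $1 }'        # single quotes: awk receives $1 untouched, prints the first field
    echo 'a b' | awk "{ print \"$1\" }"    # double quotes: the shell substitutes its own $1 before awk runs
}

quote_demo shell-arg
```

The first command prints the field from the input line, while the second prints the function's argument, because awk never saw a $1 at all.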
OK that's interesting. Thanks for the info, now I understand why my awk commands always half break with double quotes. I think I won't forget that now.
Idk, the idea of learning awk has been rattling around in the back of my head for a while, I just don't feel like it's worth the overhead when I could do all this just as easily in Python.
Of course you can, but when it comes to tricky interval pattern matching that "just as easily" can fade away quickly.
@@lxathu import re; ?
@@lxathu I don't think you're aware that `pandas` is a thing, and that it has regular expression functionality.
That's cool and pandas is great, but it doesn't beat efficient command line scripting. That's one of the areas I think perl is actually preferable to python
@@MathieuDuponchelle I don't mean interval of characters but interval of consecutive records.
RE is nice, but RE without any ifs and mandatory indents in expressions that match ranges of records because the first one matches the first RE (or any expression) and the second one matches the second is nicer.
Python can do anything. (G)awk can't but what it can, it can with beautifully short but still understandable codes.
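The "interval of consecutive records" idea is awk's range pattern: two patterns separated by a comma select everything from a line matching the first through the next line matching the second. A minimal sketch, with made-up marker lines BEGIN_BLOCK and END_BLOCK and a hypothetical name between_markers:

```shell
between_markers() {
    awk '/^BEGIN_BLOCK$/, /^END_BLOCK$/'    # no action given, so the selected records are printed
}

printf 'x\nBEGIN_BLOCK\ny\nEND_BLOCK\nz\n' | between_markers
```

No flag variable, no if, no indentation: the range itself carries the "am I inside the block" state.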
Gary! I'm excited to find your channel. Pleasing. Subscribed!
So does this transfer seamlessly to "console" in MacOSX (a UNIX command line)?
And thanks for pointing all of us "awk-ward."
Fred
Yup
Of course man what did you think macos is ?
A non unix?
I did not know awk is so powerful , thank you so much for sharing this Gary