Text Processing with Unix

So, I was reading this article. I thought there would be nothing new to learn, but I was wrong. Here is what I picked up:

  • The article mentions 2| as a counterpart of 2> for piping stderr. That is wrong: it does not work in the latest version of bash. To send stderr down a pipe along with stdout, use 2>&1 | (or the bash shorthand |&).
  • The tr command takes two sets and replaces every character in the first set with the corresponding character in the second set. Literally.
  • The two sets need not be the same length. Call the first set A and the second set B. If A is no longer than B, every character in A is replaced by the character at the same position in B. If A is longer than B by x characters, those x extra characters in A all map to the last character of B (this is what GNU tr does; other implementations may differ).
  • One can also give character classes as character sets.
  • The -d flag deletes every occurrence of the characters in the given set from the input.
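
The tr points above can be tried out with a short sketch like the one below. It assumes GNU tr, and warn is a made-up helper that only exists to produce something on stderr; the sample strings are mine, not from the article.

```shell
# Hypothetical helper that writes only to stderr, for illustration.
warn() { echo "disk almost full" >&2; }

# 2>&1 | sends stderr down the pipe, where tr can process it.
shouted=$(warn 2>&1 | tr 'a-z' 'A-Z')

# Equal-length sets: each character maps to its counterpart (l->0, o->1).
swapped=$(echo "hello world" | tr 'lo' '01')

# Longer first set: GNU tr reuses the last character of the second set,
# so both b and c become x (assumption: other tr implementations may differ).
padded=$(echo "abcabc" | tr 'abc' 'zx')

# Character classes can be used as sets.
upper=$(echo "hello" | tr '[:lower:]' '[:upper:]')

# -d deletes every character that belongs to the set.
digits_gone=$(echo "a1b2c3" | tr -d '[:digit:]')

printf '%s\n' "$shouted" "$swapped" "$padded" "$upper" "$digits_gone"
```

Quoting the sets keeps the shell from treating the brackets as glob patterns before tr ever sees them.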

So much for tr.

  • The echo command (in bash) expands escape sequences such as \n and \t when given the -e option.
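
A quick way to see the difference, assuming bash's builtin echo (the echo in other shells may handle -e differently, which is why printf is shown as the portable alternative):

```shell
#!/usr/bin/env bash
# Assumes bash's builtin echo with default settings.

# Without -e the backslash sequence stays literal.
plain=$(echo 'one\ttwo')

# With -e the \t is expanded into a real tab character.
expanded=$(echo -e 'one\ttwo')

# printf expands escapes in its format string in every POSIX shell.
portable=$(printf 'one\ttwo\n')

printf '%s\n' "$plain" "$expanded" "$portable"
```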

Now for sed:

  • sed also understands character classes, but a class name such as [:digit:] has to sit inside a bracket expression, which gives double brackets: [[:digit:]].
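  • The regular expression grouping operators ( and ) need not be escaped when sed is run with -E (extended regular expressions). In the default basic syntax they have to be written as \( and \).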
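
Here is a small sketch of both sed points. The input strings are just examples I made up. Note that whether ( and ) need a backslash depends on the regex flavour: sed's default basic syntax wants \( and \), while -E takes them bare.

```shell
# Character class inside a bracket expression: [[:digit:]], not [:digit:].
masked=$(echo "order 66 shipped" | sed 's/[[:digit:]]/#/g')

# Basic regex (sed's default): groups are written \( and \).
swapped_bre=$(echo "hello world" | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')

# Extended regex with -E: plain ( and ) do the grouping.
swapped_ere=$(echo "hello world" | sed -E 's/([a-z]+) ([a-z]+)/\2 \1/')

printf '%s\n' "$masked" "$swapped_bre" "$swapped_ere"
```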

Now for perl:

  • With -p, perl reads every line of stdin, applies the expression given with -e, and prints the result to stdout.
  • perl supports tr as well, though as an operator with its own syntax (tr/a-z/A-Z/) rather than as a separate command.
  • perl's regex syntax is largely the same as sed's extended (-E) syntax, so ( and ) group without escaping.