Text Utilities

My notes on Linux commands that can be used to manipulate or view text

cat – concatenate

In simple form cat displays the contents of a file

cat noobfile
#displays the contents of noobfile

cat noobfile1 noobfile2
#displays the contents of noobfile1 followed by the contents of noobfile2 (the files themselves are not changed)
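
A small sketch of the above. The scratch directory and file contents are made up for illustration; redirecting the output is one way to actually write the combined result to a new file:

```shell
# Hypothetical scratch files, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'first\n'  > "$tmpdir/noobfile1"
printf 'second\n' > "$tmpdir/noobfile2"

# Display both files, one after the other.
cat_out=$(cat "$tmpdir/noobfile1" "$tmpdir/noobfile2")
printf '%s\n' "$cat_out"

# Redirect the same output to create a combined file.
cat "$tmpdir/noobfile1" "$tmpdir/noobfile2" > "$tmpdir/combined"
combined=$(cat "$tmpdir/combined")

rm -r "$tmpdir"
```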

cut

cut is often preceded by a pipe - in other words cut is fed the output of another command

cut -d: -f3,4,5 noobfile
#display fields 3, 4 and 5 ONLY from noobfile where the field delimiter is :

#the same could be achieved by using range
cut -d: -f3-5 noobfile

cut -c1-10 noobfile
#displays the first 10 characters of each line from noobfile
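
To see what cut actually keeps, here is a sketch on two invented records laid out like /etc/passwd lines (the names and values are assumptions, not real accounts):

```shell
# Made-up records with ':' as the field delimiter.
records='alice:x:1001:1001:Alice A:/home/alice
bob:x:1002:1002:Bob B:/home/bob'

# Fields 3 to 5 of each line, fed to cut through a pipe.
cut_fields=$(printf '%s\n' "$records" | cut -d: -f3-5)
printf '%s\n' "$cut_fields"

# First five characters of each line.
cut_chars=$(printf '%s\n' "$records" | cut -c1-5)
printf '%s\n' "$cut_chars"
```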

head/tail

head displays the first few lines of a file. tail conversely displays the LAST few lines of a file.

head/tail noobfile
#displays the first/last ten lines of noobfile

head/tail -5 noobfile
#displays the first/last five lines of noobfile

head/tail -n 5 noobfile
#displays the first/last five lines of noobfile

head -n -5 noobfile
#displays all lines of noobfile EXCEPT the last five

tail -n +5 noobfile
#displays all lines of noobfile FROM the fifth line ONWARD
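
The four forms side by side, as a sketch using seq to generate ten numbered lines in place of noobfile (the negative count for head assumes GNU head):

```shell
# seq prints the numbers 1 to 10, one per line, as stand-in input.
first_three=$(seq 1 10 | head -n 3)        # lines 1 to 3
last_three=$(seq 1 10 | tail -n 3)         # lines 8 to 10
all_but_last_two=$(seq 1 10 | head -n -2)  # lines 1 to 8 (GNU head)
from_line_eight=$(seq 1 10 | tail -n +8)   # lines 8 to 10

printf '%s\n' "$all_but_last_two"
```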

FOLLOWING WITH tail

tail -f nooblog
#displays the last ten lines of nooblog and then keeps following the file so that new entries appear as they are written - useful for administrators monitoring changes while troubleshooting. Press CTRL+C to stop following.

join

similar to paste, join combines the lines of two text files, separated by a delimiter; however, lines are only combined where they match on a join field (ie a column which is common to BOTH files). Both files need to be sorted on that field.

join noobfile1 noobfile2
#joins lines of noobfile1 with lines of noobfile2, forming columns, using the FIRST FIELD of BOTH files to join on

join -1 3 -2 7 -t ',' noob1.csv noob2.csv
#joins lines of noob1.csv with those of noob2.csv joining on the THIRD field of noob1.csv with the SEVENTH field of noob2.csv where the field delimiter is specified to be a ','
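
A sketch of both forms on invented data. The file names and contents are assumptions; note that the inputs are already sorted on the join field and that only keys present in both files appear in the output:

```shell
tmpdir=$(mktemp -d)

# Both files are sorted on their first field, which join expects.
printf '1 alice\n2 bob\n3 carol\n' > "$tmpdir/names"
printf '1 admin\n3 guest\n'        > "$tmpdir/roles"

# Join on the first field of both files (the default).
join_out=$(join "$tmpdir/names" "$tmpdir/roles")
printf '%s\n' "$join_out"

# CSV variant: field 2 of the first file matches field 1 of the second.
printf 'alice,1\ncarol,3\n' > "$tmpdir/a.csv"
printf '1,admin\n3,guest\n' > "$tmpdir/b.csv"
join_csv=$(join -t ',' -1 2 -2 1 "$tmpdir/a.csv" "$tmpdir/b.csv")
printf '%s\n' "$join_csv"

rm -r "$tmpdir"
```

The join field is printed first, followed by the remaining fields of the first file and then those of the second.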

less – pager

less is a viewer for examining text files such as man pages in a way that simplifies stepping through the document.

less noobfile
#displays the first page of noobfile and allows stepping through remainder of document

space #move to next page
h #display pager help
q #quit

nl – number lines

nl numbers the lines in a file

nl noobfile
#displays contents of the file with number at beginning of each line. EMPTY lines DO NOT get numbered.
nl -ba noobfile
#numbers all lines including empty ones (NOTE: -ba is an option and an argument NOT two options)
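
The difference is easiest to see on input with an empty line in the middle. A sketch with invented three-line input:

```shell
# Three lines of sample input; the second one is empty.
sample=$(printf 'one\n\nthree')

# Default: the empty line is printed but not numbered (numbers 1 and 2).
nl_default=$(printf '%s\n' "$sample" | nl)
printf '%s\n' "$nl_default"

# -ba: every line is numbered, including the empty one (numbers 1 to 3).
nl_all=$(printf '%s\n' "$sample" | nl -ba)
printf '%s\n' "$nl_all"
```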

od – octal dump

od displays content of a file in a specific format

DISPLAY OPTIONS:

-a #Named characters
-b #Octal bytes
-c #ASCII characters or backslash escapes
-d #Unsigned decimal 2-byte units
-f #floats
-i #decimal integers
-l #decimal longs
-o #octal 2-byte units
-s #decimal 2-byte units
-x #hexadecimal 2-byte units

The use of most of these is lost on me at the moment, except for -c and -x which I can see as being very useful particularly for forensics.

od -c noobfile
#displays the content of noobfile as ASCII characters against an OCTAL (default) offset address with default width 16 bytes

od -x noobfile
#displays the HEX content of noobfile against an OCTAL (default) offset address with default width 16 bytes

od -j 5 -N 20 -x noobfile
#displays a TOTAL of TWENTY bytes of hex content from noobfile, STARTING AFTER the first 5 bytes have been skipped (the -j count is decimal unless prefixed with 0x), against an OCTAL (default) offset with default width 16 bytes

od -w8 -x noobfile
#display HEX content of noobfile against an OCTAL offset (default) with WIDTH 8 bytes (with GNU od the width must be attached to -w, otherwise 8 is read as a file name)

OFFSET ADDRESS OPTIONS

-An #no offset address display
-Ad #DECIMAL offset address
-Ax #HEX offset address
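
A sketch of how the offset and range options behave, using printf to generate a few known bytes instead of a real file (the input strings are made up):

```shell
# -An hides the offset column; -c shows characters and backslash escapes.
od_chars=$(printf 'hi\n' | od -An -c)
printf '%s\n' "$od_chars"

# Skip 2 bytes (-j 2), then dump only 3 bytes (-N 3): c, d and e.
od_slice=$(printf 'abcdefgh' | od -An -j 2 -N 3 -c)
printf '%s\n' "$od_slice"

# Twenty '0' characters with DECIMAL offsets: the second row starts at
# 0000016, where the default octal offset would read 0000020.
od_offsets=$(printf '%020d' 0 | od -Ad -c)
printf '%s\n' "$od_offsets"
```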

paste

paste combines text files as columns separated by a delimiter (default TAB)

paste noobfile1 noobfile2
#displays lines from noobfile1 alongside lines from noobfile2 separated by TAB

paste -d , noobfile1 noobfile2
#as above but separated by , instead of TAB
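
A sketch with two invented two-line files, showing the default TAB and a custom delimiter:

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/left"
printf '1\n2\n' > "$tmpdir/right"

# Line N of the first file is placed next to line N of the second.
paste_tab=$(paste "$tmpdir/left" "$tmpdir/right")
paste_comma=$(paste -d , "$tmpdir/left" "$tmpdir/right")
printf '%s\n' "$paste_comma"

rm -r "$tmpdir"
```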

sed – stream editor

sed replaces text matching a pattern with a defined string, by default only the first occurrence on each line; it can also insert, append, change and delete lines

sed 's/Windows/Linux/' noobfile
#will display noobfile with the first occurrence of Windows ON EACH LINE replaced with Linux

#sed can edit a file in place and create a backup of the original
sed -i.bk 's/Windows/Linux/' noobfile
#performs same operation as above but changes it in noobfile itself, creating a copy of the original called noobfile.bk (GNU sed needs the suffix attached to -i; the separated form -i '.bk' is BSD/macOS syntax)

sed 's/Windows/Linux/g' noobfile
#replaces ALL occurrences of Windows with Linux

sed '/Windows/i\Linux' noobfile
#INSERTS the text Linux as a new line before each line containing Windows (one-line form as accepted by GNU sed)

sed '/Windows/a\Linux' noobfile
#APPENDS the text Linux as a new line after each line containing Windows

sed '/1/d' noobfile
#DELETES all lines containing the character 1 from display of noobfile

sed '/Windows/c\Linux' noobfile
#REPLACES ANY LINE containing Windows with the text Linux

USING MULTIPLE EXPRESSIONS

sed -e 's/Windows/Linux/g' -e '/1/d' noobfile
#REPLACES all occurrences of Windows with Linux AND THEN deletes all lines containing the character 1
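
A sketch comparing first-match, global and multi-expression runs on three invented lines, piped in rather than read from noobfile:

```shell
text='Windows 1 and Windows 2
Windows only
line 1'

# Without g, only the first match on each line is replaced.
sed_first=$(printf '%s\n' "$text" | sed 's/Windows/Linux/')
printf '%s\n' "$sed_first"

# With g, every match on each line is replaced.
sed_all=$(printf '%s\n' "$text" | sed 's/Windows/Linux/g')
printf '%s\n' "$sed_all"

# Expressions run in order: substitute first, then drop lines containing 1.
sed_multi=$(printf '%s\n' "$text" | sed -e 's/Windows/Linux/g' -e '/1/d')
printf '%s\n' "$sed_multi"
```

Only the line "Linux only" survives the last command, because the other two lines contain the character 1.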

sort

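sort by default sorts lines alphabetically (in the order of the current locale), comparing whole lines. When sorting on a field with -k, fields are separated by blanks (TAB or SPACE) unless -t sets another delimiter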

sort noobfile
#sorts noobfile using default options

sort -t ',' -k 2 noobfile.csv
#sorts lines in noobfile.csv by the SECOND field using field delimiter , (the key runs from field 2 to the end of the line; use -k 2,2 to compare the second field alone)

sort -t ',' -k 3n noobfile.csv
#sorts lines in noobfile.csv NUMERICALLY by the THIRD field using field delimiter , (default ASCENDING). The n modifier goes AFTER the field number

sort -t ',' -k 3nr noobfile.csv
#sorts lines in noobfile.csv NUMERICALLY IN REVERSE ORDER by the THIRD field using field delimiter ,

sort -u noobfile
#perform a default sort on noobfile AND REMOVE DUPLICATES
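
A sketch showing why the n modifier matters, on three invented CSV rows. Setting LC_ALL=C is my own assumption here, added so the ordering does not depend on the locale:

```shell
csv='pear,10,3
apple,9,20
fig,100,3'

# Whole-line sort: apple, fig, pear.
by_line=$(printf '%s\n' "$csv" | LC_ALL=C sort)

# Second field compared as TEXT: 10, 100, 9.
by_second_text=$(printf '%s\n' "$csv" | LC_ALL=C sort -t ',' -k 2,2)

# Second field compared as NUMBERS: 9, 10, 100.
by_second_num=$(printf '%s\n' "$csv" | LC_ALL=C sort -t ',' -k 2,2n)

# Numeric and reversed: 100, 10, 9.
by_second_rev=$(printf '%s\n' "$csv" | LC_ALL=C sort -t ',' -k 2,2nr)

# -u keeps one copy of each distinct line.
unique_sorted=$(printf 'b\na\nb\n' | LC_ALL=C sort -u)

printf '%s\n' "$by_second_num"
```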

split

split renders a file into smaller pieces

split noobfile
#splits noobfile into a number of files beginning with 'x' and a suffix starting at aa, then ab and so on.
#by default the split occurs every 1000 lines

split noobfile noobsplit.
#specifies the resulting file prefix. Instead of 'x' it is now noobsplit.

split -d noobfile noobsplit.
#changes the suffix to consecutive digits. The split files in this case will start at noobsplit.00, then noobsplit.01 and so on.

-l #specify number of lines to split on
-b #specify number of bytes to split on
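
A sketch combining -l and -d on a five-line scratch file. The prefix "part." and the scratch directory are made up, and -d assumes GNU split:

```shell
tmpdir=$(mktemp -d)
seq 1 5 > "$tmpdir/noobfile"

# Two lines per piece, numeric suffixes, output prefix "part.".
split -l 2 -d "$tmpdir/noobfile" "$tmpdir/part."

# List the pieces that were created.
pieces=$(cd "$tmpdir" && echo part.*)
echo "$pieces"

# The last piece holds the single leftover line.
last_piece=$(cat "$tmpdir/part.02")

# Concatenating the pieces in order reproduces the original file.
if cat "$tmpdir"/part.* | cmp -s - "$tmpdir/noobfile"; then
    rejoined=yes
else
    rejoined=no
fi

rm -r "$tmpdir"
```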

tr – translate

tr often receives a piped input. It replaces a defined set of characters with a second defined set of characters; alternatively it can delete a set of characters altogether or remove duplicates where they appear consecutively.

cat noobfile | tr 'a-z' 'A-Z'
#displays noobfile with all lower case characters converted to upper case

cat leetspeek | tr 'aels' '@31$'
#replaces a with @, e with 3 and so on for leetspeek

echo aappleeaaa | tr -s 'ae'
#squeezes a and e, removing consecutive duplicates
#output here would be applea

cat noobfile | tr -d '0-9'
#displays noobfile with all digits removed
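
The same four operations on short literal strings, as a sketch (the input strings are invented so the results are easy to check by eye):

```shell
# Translate lower case to upper case.
upper=$(echo 'hello noob' | tr 'a-z' 'A-Z')

# Character-for-character mapping: a->@, e->3, l->1, s->$.
leet=$(echo 'leet sales' | tr 'aels' '@31$')

# Squeeze runs of a and e down to a single character.
squeezed=$(echo 'aappleeaaa' | tr -s 'ae')

# Delete every digit.
no_digits=$(echo 'r2d2 c3po' | tr -d '0-9')

printf '%s\n' "$upper" "$leet" "$squeezed" "$no_digits"
```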

uniq – unique

uniq can produce a duplicate count, BUT will only remove duplicates on consecutive lines

uniq noobfile
#displays noobfile with all duplicate consecutive lines removed

uniq -c noobfile
#displays each line of noobfile once per consecutive run, prefixed with the number of times it was repeated in that run
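
Because only consecutive duplicates are detected, uniq is usually fed sorted input. A sketch on four invented lines (LC_ALL=C is an assumption to keep the sort order predictable):

```shell
lines='b
a
a
b'

# Unsorted: the two b lines are not adjacent, so both survive.
uniq_only=$(printf '%s\n' "$lines" | uniq)
printf '%s\n' "$uniq_only"

# Sorted first: duplicates become adjacent and can be counted.
sorted_counts=$(printf '%s\n' "$lines" | LC_ALL=C sort | uniq -c)
printf '%s\n' "$sorted_counts"
```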

vi/vim – text editor

vi is the basic editor, with vim providing the improved features. Many distributions alias vi to vim.

vi has three modes: Command, Insert and Ex

COMMAND MODE (ENTER BY DEFAULT AND ALSO BY ESC KEY)

MOTIONS

h #left one character (also LEFT ARROW on vim)
j #down one line (also DOWN ARROW on vim)
k #up one line (also UP ARROW on vim)
l #right one character (also RIGHT ARROW on vim)
w #one word forward
b #one word back
^ #beginning of the line
$ #end of the line
G #go to a specific line number (prefix with number); with no number, go to the last line
CTRL+G #identify cursor position

These commands can be prefixed by numbers; so to move 4 characters to the right, one would type '4l'

ACTIONS

d #CUT (delete) removes and places in buffer
y #COPY (yank) does not remove but places copy in the buffer
P|p #PASTE (put)
u #UNDO last action

Action is performed from current cursor position to location defined in a motion in the following format:

ACTION [COUNT] MOTION | [COUNT] ACTION MOTION

EXTENDED ACTIONS
dd #delete current line
3dd #delete three lines, starting with the current line
dw #delete current word
d3w #delete next three words
d4h #delete four characters to the left

CHANGE 
#deletes selection into buffer and enters INSERT mode
cc #change current line 
cw #change current word
c3w #change next 3 words
c5h #change 5 characters to the left

YANK
#copies selection and places in buffer
yy #yank the current line
3yy #yank three lines, starting with the current line
yw #yank the current word
y$ #yank to end of line

PUT
#inserts text from buffer
p #lower case p puts after the cursor
P #upper case P puts before the cursor

SEARCH
/string #search forwards for string
?string #search backwards for string

INSERT MODE
#entered by performing one of following commands
a #enter insert mode right after cursor
A #enter insert mode at end of line
i #enter insert mode immediately before cursor
I #enter insert mode at beginning of line
o #enter insert mode on blank line after cursor
O #enter insert mode on a blank line before the cursor

EX MODE
#entered by prefixing command with :
:w #SAVE (write) the current file
:w filename #SAVE AS (write to filename)
:w! #FORCE SAVE (will override read only permissions if possible)
:1 #GO TO line number 1
:e filename #OPEN (edit) filename
:q #quit if there are no changes
:q! #FORCE QUIT without saving
:wq #SAVE CHANGES AND QUIT

wc – word count

wc by default returns number of lines, words and bytes in a file

wc noobfile

-l #count lines only
-w #count words only
-c #count bytes only
-m #count characters only
-L #maximum line length in file (character)

Obtaining wordcount of multiple files:

wc *
#will display line, word and byte counts for each file in the directory, followed by a total
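
A sketch of the individual counters on two invented lines piped in from printf:

```shell
sample='one two
three'

# printf adds the final newline, so the input is 2 lines, 3 words, 14 bytes.
line_count=$(printf '%s\n' "$sample" | wc -l)
word_count=$(printf '%s\n' "$sample" | wc -w)
byte_count=$(printf '%s\n' "$sample" | wc -c)

echo "lines=$line_count words=$word_count bytes=$byte_count"
```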

mrn00b0t

Interfacing between technophile and technophobe
