file displays the type of a file. To get the MIME type, use the -i option.
$ file Unix.txt
Unix.txt: ASCII text
$ file -i Unix.txt
Unix.txt: text/plain; charset=us-ascii
wc reports the number of lines, words and bytes in a file. The -l, -w and -c options select a single count; use -m to count characters rather than bytes.
$ wc hello.txt
2 6 29 hello.txt
$ wc -l hello.txt
2 hello.txt
$ wc -w hello.txt
6 hello.txt
$ wc -c hello.txt
29 hello.txt
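To make the three counts concrete, here is a minimal Python sketch of what wc tallies by default. The function name wc_counts is hypothetical. The sketch assumes that a line is marked by a newline byte and that a word is a run of non-whitespace bytes; the real tool also deals with locales and multiple files.

```python
def wc_counts(data: bytes) -> tuple[int, int, int]:
    """Return (lines, words, bytes) for a byte string, roughly as wc does."""
    lines = data.count(b"\n")   # like wc -l: number of newline bytes
    words = len(data.split())   # like wc -w: whitespace-separated runs
    size = len(data)            # like wc -c: number of bytes
    return lines, words, size


print(wc_counts(b"hello world\nbye\n"))  # (2, 3, 16)
```

A file whose last line lacks a trailing newline would therefore report one line fewer than a reader might expect.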
cksum outputs a particular variant of the 32-bit cyclic redundancy check (CRC) checksum of a file, several files or standard input, together with their sizes; in recent versions of GNU Coreutils and some other implementations, it can output other checksums via the -a option. This variant of 32-bit CRC differs from the CRC-32 used by zip, PNG and zlib; for one thing, cksum calculates the CRC not from the octet stream of the file or input alone but from that stream with its length appended.
The CRC output by cksum can be used to protect against accidental modifications to files: if the checksum has not changed, the file is very likely undamaged. The default CRC checksum is not cryptographic: it protects only against accidental modifications, not against malicious (intentional) ones.
Recent versions of GNU Coreutils cksum allow a choice among several kinds of checksums, including cryptographic ones, via the -a option. These include sysv, bsd, crc, md5, sha1, sha224, sha256, sha384, sha512, blake2b, and sm3. None of the algorithms in this list is the CRC-32 of zip, PNG and zlib. OpenBSD cksum provides the -a option as well, although its list of algorithms differs slightly. FreeBSD cksum allows a choice of one of three checksum algorithms in addition to the default one via the -o1, -o2 and -o3 options; -o3 is the CRC-32 of zip, PNG and zlib. The same applies to macOS.
$ cksum /etc/passwd
3052342160 2119 /etc/passwd
Some "cksum" implementations provide other algorithms, such as "md5" and "sha1":
$ cksum -a sha1 /etc/passwd
SHA1 (/etc/passwd) = 816d937ca4cdb4dee92d5002610fae63b639d224
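For comparison, a digest of this kind can be reproduced with Python's standard hashlib module. This is only a sketch: the helper name sha1_line is hypothetical, and its output format merely imitates the line printed in the example.

```python
import hashlib


def sha1_line(name: str, data: bytes) -> str:
    """Format a SHA-1 digest in the style 'SHA1 (name) = hexdigest'."""
    digest = hashlib.sha1(data).hexdigest()
    return f"SHA1 ({name}) = {digest}"


print(sha1_line("empty", b""))
# SHA1 (empty) = da39a3ee5e6b4b0d3255bfef95601890afd80709
```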
You can test "cksum" by feeding it a string via standard input:
$ printf 'Guide to UNIX' | cksum
2195826759 13
sum is a legacy tool that outputs a certain kind of checksum of a file, several files or standard input, together with their sizes. It is not covered by POSIX; POSIX instead codified cksum as a replacement tool, using a kind of checksum different from those used by legacy sum. Different variants of legacy sum used different algorithms. The legacy algorithms used by variants of sum are provided by the FreeBSD cksum via the -o1 and -o2 options, and by recent GNU Coreutils cksum via the -a option.
GNU Coreutils sum allows a choice of legacy algorithm via the -r and -s options.
The two commonly used legacy algorithms are as follows.
The BSD sum, -r in GNU sum:
- Initialize checksum to 0
- For each byte of the input stream:
  - Perform a 16-bit bitwise right rotation by 1 bit on the checksum
  - Add the byte to the checksum, and apply modulo 2 ^ 16 to the result, thereby keeping it within 16 bits
- The result is a 16-bit checksum
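The BSD steps above can be sketched in Python as follows. The function name bsd_sum is hypothetical, and the sketch computes only the checksum, not the block count that the tool prints beside it.

```python
def bsd_sum(data: bytes) -> int:
    """16-bit BSD checksum: rotate right by one bit, then add each byte."""
    checksum = 0
    for byte in data:
        # 16-bit right rotation: the lowest bit wraps around to bit 15
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        # add the byte and keep the result within 16 bits
        checksum = (checksum + byte) & 0xFFFF
    return checksum


print(bsd_sum(b"ab"))  # 32914
```

The rotation makes the result depend on byte order, so swapping two bytes usually changes the checksum.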
The System V sum, -s in GNU sum:
- checksum0 = sum of all bytes of the input stream modulo 2 ^ 32
- checksum1 = checksum0 modulo 2 ^ 16 + checksum0 / 2 ^ 16 (integer division)
- checksum = checksum1 modulo 2 ^ 16 + checksum1 / 2 ^ 16 (integer division)
- The result is a 16-bit checksum calculated from the initial 32-bit plain byte sum
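The System V folding steps can be sketched in Python in the same spirit. The function name sysv_sum is hypothetical; the divisions are integer divisions, written here as bit shifts.

```python
def sysv_sum(data: bytes) -> int:
    """System V checksum: 32-bit byte sum folded twice into 16 bits."""
    checksum0 = sum(data) & 0xFFFFFFFF                  # plain byte sum modulo 2^32
    checksum1 = (checksum0 & 0xFFFF) + (checksum0 >> 16)
    return (checksum1 & 0xFFFF) + (checksum1 >> 16)


print(sysv_sum(b"abc"))          # 294
print(sysv_sum(b"\xff" * 258))   # 255
```

Unlike the BSD variant, this checksum is a plain sum, so reordering the bytes of the input leaves it unchanged.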
stat outputs file or file system status, including size, access rights, access, modification and status-change times (and creation time where the file system records it) and more. The command appears to be absent from POSIX, which specifies only the system call stat().
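The same information is exposed by the stat() system call, which Python wraps as os.stat. The sketch below is illustrative only: the helper name describe and the selection of fields are assumptions, and a temporary file is created so that the example is self-contained.

```python
import os
import stat
import tempfile
import time


def describe(path: str) -> dict:
    """Collect a few of the fields that the stat command reports."""
    st = os.stat(path)
    return {
        "size": st.st_size,                    # size in bytes
        "mode": stat.filemode(st.st_mode),     # access rights in ls -l style
        "modified": time.ctime(st.st_mtime),   # last modification time
    }


# Create a small temporary file so the example does not depend on any path.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
info = describe(tmp.name)
os.unlink(tmp.name)
print(info["size"], info["mode"])
```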
grep outputs the lines that match a regular expression, or with the -v option the lines that do not match it; other options and the regular expression used vary the behaviour further. See Grep Wikibook.
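The core idea, selecting matching lines or inverting the selection, can be sketched in Python. The function name grep_lines and its invert parameter are hypothetical, and Python's re syntax is not identical to the POSIX regular expressions that grep uses by default.

```python
import re


def grep_lines(pattern: str, lines, invert: bool = False) -> list:
    """Return the lines that match pattern, or those that do not when invert is set."""
    regex = re.compile(pattern)
    return [line for line in lines if bool(regex.search(line)) != invert]


fruit = ["apple", "banana", "avocado"]
print(grep_lines("^a", fruit))               # ['apple', 'avocado']
print(grep_lines("^a", fruit, invert=True))  # ['banana']
```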
diff compares the content of two files line by line and outputs the differences. See also diff3.
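A line-by-line comparison of this kind can be tried out with Python's standard difflib module. This is a sketch, not a reimplementation of diff: difflib uses its own matching algorithm, the helper name line_diff is hypothetical, and the labels "old" and "new" stand in for file names.

```python
import difflib


def line_diff(old: str, new: str) -> list:
    """Return unified-diff lines describing how old differs from new."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="old", tofile="new", lineterm="",
    ))


for line in line_diff("a\nb\nc\n", "a\nB\nc\n"):
    print(line)
```

Lines prefixed with - occur only in the first text and lines prefixed with + only in the second; identical inputs produce no output.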
diff3 compares the content of three files line by line and outputs the differences. See also diff.
cmp compares two files byte by byte, outputting the byte number and the line number where the first difference is found, if any. It outputs nothing if the files are byte-for-byte identical. Differences beyond the first one are not reported unless the -l option is used.
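The search for the first difference can be sketched in Python. The function name first_difference is hypothetical; byte and line numbers start at 1, as in the tool's output. One simplification is assumed: when one input is a prefix of the other, the sketch reports the position just past the shorter input instead of printing an end-of-file message.

```python
def first_difference(a: bytes, b: bytes):
    """Return (byte_number, line_number) of the first difference, or None."""
    line = 1
    for offset, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return offset, line
        if x == 0x0A:          # a newline seen in both inputs starts a new line
            line += 1
    if len(a) != len(b):       # one input is a prefix of the other
        return min(len(a), len(b)) + 1, line
    return None                # identical inputs: nothing to report


print(first_difference(b"abc\ndef\n", b"abc\ndxf\n"))  # (6, 2)
```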
strings outputs the printable strings found in files, which is useful when the files are binary.
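The idea can be sketched in Python with a regular expression over raw bytes. The function name printable_strings is hypothetical. The sketch assumes that "printable" means ASCII space through tilde plus tab, and it uses a minimum run length of 4, the default in GNU strings; other implementations and encodings may differ.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least min_len printable ASCII bytes found in data."""
    pattern = rb"[\x20-\x7e\t]{%d,}" % min_len
    return re.findall(pattern, data)


blob = b"\x00\x01hello\x00ab\x02world!\x00"
print(printable_strings(blob))  # [b'hello', b'world!']
```

The run "ab" is dropped because it is shorter than the minimum length.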