git, rmlint, fd, rsync, ClamAV, antivirus, restic
Listing
- Count the number of files in a folder:
$ find ./ -type f | wc -l
- List all file extensions found in a folder:
$ find ./ -type f | rev | cut -d "." -f 1 | sort | uniq | rev
- List all files sharing the same name within the sub folders:
$ find . -type f -printf "%f\n" | sort | uniq --repeated --all-repeated=separate
- List number of files across all subfolders sharing the same name, whatever their extension is:
$ find . -type f -exec basename {} \; | sed 's/\(.*\)\..*/\1/' | sort | uniq -c | grep -v "^[ \t]*1 "
Size & Space
- List size in MiB of subfolders and files in the current folder and display them sorted by size:
- Show the 10 biggest files in MiB found amoung the current directory and its subfolders:
$ find . -type f -exec du -m "{}" \; | sort -nr | head -n 10
- Display the total size used by all PNG files in sub-directories:
$ find ./ -iname "*.png" -exec du -k "{}" \; | awk '{c+=$1} END {printf "%s KB\n", c}'
Search
- Case insensitive search from the current folder of all files that have the string
dummy
in their filename:
$ find ./ -iname "*dummy*"
- Recursive and case insensitive content search on non-binary files from the current folder:
$ grep -RiI "string to search" ./*
- Same as above but only search string in XML files:
$ find ./* -iname "*.xml" -exec grep -Hi "string to search" "{}" \;
- Find all Jpeg images in the system but exclude
/home
and /var/lib
directory:
$ find / -path "/home" -prune -or -path "/var/lib" -prune -or -iname "*.jpg" -print
- Get the list of the latest 10 modified files in the current folder tree:
$ find ./ -printf "%TY-%Tm-%Td %TT %p\n" | sort | tail -n10
- Same as above but sorted by latest access time:
$ find ./ -printf "%AY-%Am-%Ad %AT %p\n" | sort | tail -n10
- Search for
string
contained in all files named MANIFEST.in
, and print their folder path:
$ find . -name "MANIFEST.in" -exec bash -c 'grep --silent "string" "{}" && echo $(dirname "{}")' \;
- Search for 4+ characters long upper-cased strings with underscore, in all files but the
README.md
, LICENSE
and Git metadata:
$ grep --only-matching --no-filename --exclude=./{README.md,LICENSE,.git\*} -RIe '[A-Z_]\{4,\}' . | sort | uniq
- Search all files starting with a dot, and ending with an extension composed of 6 alphanumeric characters. These are temporary files created by rsync:
$ fd --type file --hidden --ignore-case "^\..+\.[0-9a-z]{6}$"
Creation
- Create several folder with a similar pattern:
$ mkdir -p ./folder/subfolder{001,002,003}
- Create a symbolic link (source):
Copy
- Dump a disk to an image while monitoring the copy progression:
$ pv /dev/da0 > '/mnt/tank/my-data/HDD-part1+part2.img'
- Same as above but over SSH:
Renaming
- Convert all files in the current folder to lower case:
- Prefix all files in the current folder:
$ rename 's/(.*)$/prefix-$1/' *
- Rename all mp3 files in the current folder by adding a “sub-extension”:
$ rename 's/\.mp3/\.my-sub-extension\.mp3/' *.mp3
- Renaming based on regular expression, for files matching another regular expression. The particular example below was used to fix some Dropbox conflicting files:
$ find ./Dropbox -type f -name "* (kev-laptop's conflicted copy 2013-02-01)*" -execdir rename -f -v "s/(.*) \(kev-laptop's conflicted copy 2013-02-01\)(.*)/\1\2/" {} \;
- Strip filenames of their leading dot and extension composed of 6 alphanumeric characters. These are temporary files created by rsync:
$ rename --force --dry-run 's/^\.(.+)\.[0-9a-zA-Z]{6}$/$1/' *
Cleaning-up
- Delete all empty files and folders (run this command several times to remove nested empty directories):
$ find ./ -empty -print -delete
- Remove empty directories found in all subfolders starting with
prefix
:
$ find ./ -type d -empty -ipath "./prefix*" -print -delete
- Delete files ending with
.thumbnail.jpg
or .thumbnail.png
files (case insensitive):
$ find ./ -iregex ".*\.thumbnail\.\(jpg\|png\)$" -print -delete
- Same as above but instead for files ending with their dimensions, like
image-640x480.jpg
or photo-2400x3200.png
:
$ find ./ -iregex ".*-[0-9]+x[0-9]+\.\(jpg\|png\)$" -print -delete
- Here is how I clean-up copies of external drives from accumulated cruft over the past decades:
# Remove metadata at volume's root.
$ find . -name "System Volume Information" -type d -depth 1 -mount -print -delete
$ find . -name ".DocumentRevisions-V*" -type d -depth 1 -mount -print -delete
$ find . -name ".TemporaryItems" -type d -depth 1 -mount -print -delete
$ find . -name "\$AVG8.VAULT\$" -type d -depth 1 -mount -print -delete
$ find . -name ".Spotlight-V*" -type d -depth 1 -mount -print -delete
$ find . -name "\$RECYCLE.BIN" -type d -depth 1 -mount -print -delete
$ find . -name ".VolumeIcon.*" -type f -depth 1 -mount -print -delete
$ find . -name "autorun.inf" -type f -depth 1 -mount -print -delete
$ find . -name ".fseventsd" -type d -depth 1 -mount -print -delete
$ find . -name ".Trash-*" -type d -depth 1 -mount -print -delete
$ find . -name "RECYCLER" -type d -depth 1 -mount -print -delete
$ find . -name "Recycled" -type d -depth 1 -mount -print -delete
$ find . -name "found.*" -type d -depth 1 -mount -print -delete
$ find . -name "\$AVG" -type d -depth 1 -mount -print -delete
# Remove metadata file and folders artifacts.
$ find . -name "desktop.ini" -type f -mount -print -delete
$ find . -name "__MACOSX" -type d -mount -print -delete
$ find . -name "Thumbs.db" -type f -mount -print -delete
$ find . -name ".DS_Store" -type f -mount -print -delete
$ find . -name "._*" -type f -mount -print -delete
# Remove empty directories (repeat until none left).
$ find -type d -empty -mount -print -delete
$ find -type d -empty -mount -print -delete
$ find -type d -empty -mount -print -delete
- Delete all files and folders in the current directory except the
README.txt
file:
$ ls ./ -I "README.txt" | xargs rm -rf
- Remove all duplicates within the whole pool of files (including
--hidden
ones) build up from folder-1
, folder-2
and folder-3
directories. In a set of duplicates, the first file in alphabeticcaly sorted named path is kept (-S p
option).
$ rmlint --progress --hidden -S p ./folder-1 ./folder-2 ./folder-3
$ ./rmlint.sh
- Remove all duplicates in
backup-set1
and backup-set2
if and only if they’re already present in backup-set3
(i.e. the reference folder tagged after the //
separator), but do not alter the latter in anyway (thanks to the --keep-all-tagged
option). To make things extra-safe we use --no-crossdev
to not jump to other physical file systems:
$ rmlint --progress --hidden --no-crossdev --keep-all-tagged ./backup-set1/ ./backup-set2/ // ./backup-set3/
$ ./rmlint.sh
Antivirus
- Download and refresh local ClamAV virus definition database:
- Check all files, only display infected files and ring a bell when found (really slow scan):
$ clamscan -r --bell -i /
Backups
- Initialize and start a backup with restic:
$ restic init
$ restic backup --one-file-system ~/
$ restic forget --keep-hourly 24 --keep-daily 15 --keep-weekly 13 --keep-monthly 12 --keep-yearly 3 --prune