Log File Maintenance and Cleanup
Log files sometimes take up a lot of disk space. Some applications have internal processes that run periodically to manage them, and some do not. Often this becomes a problem after a while, as the logs consume all of your partition space.
You can manage these files yourself with a simple script running in a cronjob (or systemd timers if you’re so inclined) if they have a common naming convention and you have the proper access.
Let’s say you have an application called myapp that keeps its logs in a directory called /opt/myapp/logs, and those files all end with a .log file extension.
cat >logmanage.sh <<"EOF"
#!/bin/sh
LOGDIR="/opt/myapp/logs"
# Compress all of the files older than a day
find "${LOGDIR}" -name '*.log' -mtime +0 -exec compress {} \;
# Purge all of the logs older than a week
find "${LOGDIR}" -name '*.Z' -mtime +7 -exec rm -f {} \;
EOF
These two find commands compress any log files more than a day old and remove the compressed files once they are more than a week old.
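If compress isn’t available on your system (it often isn’t installed by default these days), gzip does the same job; the equivalent lines just match on .gz instead of .Z:
find "${LOGDIR}" -name '*.log' -mtime +0 -exec gzip {} \;
find "${LOGDIR}" -name '*.gz' -mtime +7 -exec rm -f {} \;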
Add a crontab entry to run this every day and you’re all set.
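For example, assuming you saved the script as /usr/local/bin/logmanage.sh and made it executable, this crontab entry would run it every night at midnight:
0 0 * * * /usr/local/bin/logmanage.sh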
Tags: cli, sysadmin, logs, find, motd
Finding Duplicate Files in a Directory Tree
Sometimes I need to find all of the duplicate files in a directory tree. I have this issue all of the time when I move my music collection. Here is a nifty script to sort these things out:
#!/bin/bash
find -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 -D --all-repeated=separate
If you redirect the output of the above to a text file, for example duplicates.txt, you can then create a cleanup script from it:
awk 'NF {sub(/^[^ ]+  /,""); printf("rm \"%s\"\n",$0)}' ~/duplicates.txt >~/duplicates.sh
Then edit the file and remove the lines for the files you want to keep, make the script executable and run it. Done.
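If you do this regularly, the whole sequence can live in one small script. Here is a minimal sketch; writing duplicates.txt and duplicates.sh into your home directory is just an assumption:
#!/bin/bash
# List candidate duplicates by size, confirm with md5sum, and build a cleanup script.
find . -not -empty -type f -printf "%s\n" | sort -rn | uniq -d |
  xargs -I{} -n1 find . -type f -size {}c -print0 |
  xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate >~/duplicates.txt
# Turn each line into an rm command; edit duplicates.sh and delete the lines
# for the copies you want to KEEP before running it.
awk 'NF {sub(/^[^ ]+  /,""); printf("rm \"%s\"\n",$0)}' ~/duplicates.txt >~/duplicates.sh
chmod +x ~/duplicates.sh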
Tags: cli, duplicates, find, awk, motd
Processing The Results of The Find Command
As mentioned in the previous post, the find command searches for files, but it has another very useful feature: it can also perform an action on each file it finds.
I am a vim user, and let’s assume that I want to find my editor’s backup files in the current directory tree. These files all end with a tilde (~) character. We would use this command to search for them:
$ find . -name '*~'
./.buffer.un~
./.find_buffer.txt.un~
./.Tips.txt.un~
which results in a list of 3 files. All it takes to remove these files is to add a little to the end of that last command:
$ find . -name '*~' -exec rm -i '{}' \;
Now, not only will this command find all the matching files in the current directory tree, it will also remove them, prompting for confirmation before each deletion because of rm’s -i flag.
The -exec parameter tells find to execute the command that follows it on each result. The string {} is replaced by the current file name being processed everywhere it occurs in the arguments to the command. The {} string is enclosed in single quotes to protect it from interpretation as shell script punctuation. The \; sequence indicates the end of the -exec argument.
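If you are not sure what will match, it can be reassuring to run a harmless command against the same files first, for example:
$ find . -name '*~' -exec ls -l '{}' \;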
See the manpage for more information.
Using the Find Command to Search for Files
One of the most useful and flexible GNU utilities is the find command. Understanding how to use it will make your Linux life more efficient.
The general syntax of the find command is:
find [-H] [-L] [-P] [-D debugopts] [-Olevel] [starting-point...] [expression]
That looks like a lot, but most of the time you only need two things:
find [path] [expression]
where path is the starting point, the top of the directory tree to be searched, and expression describes what you’re trying to find. This may be a file name, last access time, last modification time, size, and/or ownership.
For example, if you’re looking for the file stdlib.h, use the following command:
find / -name stdlib.h
If you run this as a normal user, starting find from the root directory will often produce a lot of error messages on the terminal, because a normal user doesn’t have permission to read some of the directories in the search. You may therefore want to redirect stderr to /dev/null to avoid seeing those messages. You can do that like this:
find / -name stdlib.h 2>/dev/null
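For illustration, here are a few more expressions covering the other properties mentioned above (the paths and the user name are hypothetical):
find /var/log -name '*.gz'    # by name
find . -mtime -1              # modified within the last day
find . -size +100M            # larger than 100M
find ~ -type f -user alice    # owned by the user alice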
Move Files Older Than So Many Days with Find
You may want to clean up a directory that has files older than a certain number of days, for example, 30 days. You can do this with the find command.
To move the files older than 30 days in the current directory into an old directory:
$ find . -maxdepth 1 -type f -mtime +30 -exec mv {} old/ \;
The -maxdepth 1 and -type f tests keep find from descending into old/ and from trying to move directories.
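If your mv comes from GNU coreutils, its -t option lets find batch the moves instead of running mv once per file; the same cleanup as a sketch, assuming the old directory may not exist yet:
$ mkdir -p old
$ find . -maxdepth 1 -type f -mtime +30 -exec mv -t old/ {} +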
Tags: cli, find, mtime, mv, motd
Rename Files That Start With a Special Character
Suppose you find that you have a file whose name starts with a special character, such as a dash, and you want to delete it:
$ ls
-badfile.txt PrintHood reg57.txt
Favorites Recent scripts
$ rm -badfile.txt
rm: invalid option -- 'b'
Try 'rm ./-badfile.txt' to remove the file '-badfile.txt'.
Try 'rm --help' for more information.
$ ls *.txt
ls: invalid option -- 'e'
Try 'ls --help' for more information.
First, find the inode of the file by using ls -i on the command line:
$ ls -i
54804119 -badfile.txt 56634824 PrintHood
56634825 Recent 56634807 Favorites
54804251 reg57.txt 56634833 scripts
The -i flag displays each file’s inode number. The inode for the “bad” file is 54804119. Once the inode is identified, use the find command to rename the file:
$ find . -inum 54804119 -exec mv {} NewName \;
$ ls NewName
NewName
Now you can delete it.
$ rm NewName
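For the record, you can usually skip the inode detour by keeping the leading dash from being parsed as an option, as the rm error message above hints:
$ rm ./-badfile.txt
$ rm -- -badfile.txt
The ./ prefix stops the name from looking like an option, and -- tells rm that everything after it is a file name.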
Tags: cli, rename, find, inode, motd
How to Recursively Find the Latest Modified File in a Directory
The %T@ format prints each file’s modification time as a Unix timestamp; sort -n orders the list, tail -1 keeps the newest entry, and cut strips the timestamp off, leaving only the path:
find . -type f -printf '%T@ %p\n' | sort -n | tail -1 | cut -f2- -d" "
Tags: cli, find, sort, tail, cut, motd
How to Count All File Extensions Recursively in Linux
To count all files by extension recursively on the command line:
$ find . -type f | sed -n 's/..*\.//p' | sort | uniq -c
40 3g2
5 AVI
13 DS_Store
28 JPG
30 MOV
133 MP4
64 THM
1 docx
18 jpg
1 json
4 m3u
89 m4a
2 m4r
156 m4v
41 mkv
112 mov
38 mp3
587 mp4
1 nfo
2 osp
30 png
1 sh
4 srt
6 svg
10 torrent
6 txt
5 webm
10 zip
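Note that upper- and lower-case extensions (JPG and jpg, MOV and mov) are counted separately above. If you’d rather fold them together, lower-case the extensions before counting:
$ find . -type f | sed -n 's/..*\.//p' | tr 'A-Z' 'a-z' | sort | uniq -c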