
Find, the command   (Searching inside)

How to find using find

...and how to locate, list and grep whatever you need whenever you need it
(by fravia+, first published at searchlores in April 2007 - work in fieri)




Version 0.17: April 2008


Introduction

How do we find, locate, search, grep, obtain, conquer, smash, violate, assaporate, individuate the stuff INSIDE OUR OWN HARDDRIVES?

Well, we could use dedicated (and fairly good) programs like Agent Ransack for Windows or Beagle for GNU/Linux. However Beagle-type applications (Tracker comes to mind as a quicker alternative) are all known to hog system resources: it is MUCH better and quicker to use ad hoc commands. In fact Beagle is more or less just a nice graphical interface for
find . -exec grep "pattern" {} \;
Graphical interfaces are nice and easy and dandy, but they tend to get awfully in the way whenever you have to perform any task really quickly and effectively, as anyone that uses GNU/Linux knows.
In fact, in many cases command line searches are MUCH more powerful and time-sparing than their graphical mouse-enabled equivalents.
This page tries to explain to all searchers how to quickly search inside their own hard disk(s) (and hopefully find) whatever they are looking for. A snippet of cosmic power for anyone interested. Of course a thorough read of the relevant "man" files would give you even more knowledge and power (if correctly understood and applied and experimented). But -alas- few are those that nowadays still have the time and the inclination to learn (and experiment) using man files.

The basic commands of an "inside searcher": locate, grep and find

The most important search related commands inside a GNU/Linux system are the following ones:

locate   grep   find

If you insist on using Windows, there are still - fortunately - (almost) equivalent commands. Such commands are anyway much better than that lame and sniffing graphical "find" command you can access from the start button.
That evil "find" is just awful spyware which will send directly to Microsoft tons of data about whenever and whatever you are searching onto your own harddisk.
As a general graphical interface Windows users have the free and good Agent Ransack, which will cut most mustard, yet for more specific commands I would suggest windows locate and windows grep. As for a really useful "find", there is indeed one: Microsoft has stolen a unix-clone "find" (it searches for text inside files, so it behaves more like grep than like the unix find) that you can use only from the command line (start => run => cmd) using the following syntax:
find [/v] [/c] [/n] [/i] "string" [[Drive:][Path]FileName[...]]
where
/v : Displays all lines that do not contain the specified string
/c : Counts the lines that contain the specified string and displays the total
/n : Precedes each line with the file's line number
/i : Specifies that the search is not case-sensitive.

We'll explain first locate and grep, and then delve into find. Note that in the following all GNU/linux commands with a preceding # must be executed as root (or sudo in Ubuntu), while those with a $ do not require root's privileges. Windows conditioned readers can still follow along with the equivalent commands listed above.

OK: the real problem is that some of these useful commands have a command line syntax that comes straight from the Jurassic period. This notwithstanding, it is still worth learning such syntax by heart... as you'll quickly realize in the following :-P

Locating stuff

Locate is the easiest command for finding filenames. It maintains its own database of files on your system, so it's blazing fast. Some distros install with a cron job that rebuilds the locate database regularly. However you do it, it must be updated periodically:

# updatedb
or
# locate -u
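(-u = Create slocate database starting at path /. This switch belongs to slocate only: if your distro ships mlocate or plocate instead, stick to updatedb)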

Once you create/update the central database, finding files with locate is then really easy:
$ locate filename
Here a case-insensitive search:
$ locate -i filename
Typically locate will return a hundred zillion hits. Locate and grep together are the solution:
$ locate -i readme | grep -i image
Or try this:
locate '*.mp3' | grep /path_to/Music_collection > ~/Music.txt
(the quotes keep the shell from expanding *.mp3 against the current directory before locate ever sees it)
The command updatedb should be used routinely before a search if you want locate to give you truer results.
How do you use locate to find something containing two words?
Here 'readme' and 'use', note that you don't need quotes
locate -i readme | grep -i use
but you CAN add them
locate -i 'readme' | grep -i 'use'
As you can see, all those "mouse" are still there.

We are still searching for filenames and/or directory & subdirectory names.
Now suppose you need to find a file with a particular bit of text in it. Here grep is really cool for searching the huge dmesg, so let's switch to the next section and use grep.

Grepping the grok out of it

$ dmesg | grep -i usb
You'll get quite a lot of blah blah. Grep prints the whole line each time it finds your search string. dmesg is a special case, as it is a command, and not just a text file. For example, to search an ordinary text file, such as xorg.conf, for references to 'nvidia', this is the normal grep syntax:
# grep -iw nvidia /etc/X11/xorg.conf
-iw is "ignore case + whole word search"
so the usual syntax is grep => options => search string => file to search.
The -w flag means "whole word search". This is really handy, because otherwise grep returns every single line that contains your search string even if it happens to be INSIDE A DIFFERENT WORD.
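To see what -w buys you, here is a small sketch on a throwaway file (the file name and its three lines are made up for the occasion). It just counts matching lines with and without the flag:

```shell
# Scratch file with invented content; notes.txt is a hypothetical name.
tmp=$(mktemp -d)
printf '%s\n' 'swap space is low' 'Swapping pages' 'no match here' > "$tmp/notes.txt"

# Without -w: "swap" also matches inside "Swapping".
plain=$(grep -ic swap "$tmp/notes.txt")
# With -w: only the line where "swap" stands alone as a word.
whole=$(grep -icw swap "$tmp/notes.txt")

echo "without -w: $plain line(s), with -w: $whole line(s)"
rm -rf "$tmp"
```

The -c flag used here only counts the matching lines instead of printing them.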
Here is the best grep trick of all: searching bales of files for a word or string or regexp. Relocate yourself (cd) to the directory you want to search, then:
$ grep -lir "swap" *
-l means "print only the filenames containing your search term". In fact, incomprehensibly enough, -l means "do not print the lines". (Note how the man gives an even more surreal definition: -l means "suppress normal output - instead print the name of each input file from which output would normally have been printed".) -i, as we have seen, means "ignore case", and -r means "act recursively" (inside the subdirs/subfolders).
* means of course "all files".
If you omit the -l flag, it prints the filenames and the lines containing your search term. So retry it right now:
$ grep -ir swap *
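If you'd rather not experiment on your real directories, the following sketch builds a tiny scratch tree (all directory and file names are invented) and runs both variants on it, so you can compare the two outputs side by side:

```shell
# Build a small throwaway tree; every name below is hypothetical.
tmp=$(mktemp -d)
mkdir -p "$tmp/etc/sub"
printf 'SWAP on /dev/sda2\n' > "$tmp/etc/fstab.txt"
printf 'swap priority 5\nnothing else\n' > "$tmp/etc/sub/tuning.txt"
printf 'unrelated\n' > "$tmp/etc/other.txt"

# With -l: only the names of the files that contain the string.
names=$(cd "$tmp/etc" && grep -lir "swap" * | LC_ALL=C sort)
# Without -l: file name plus every matching line.
lines=$(cd "$tmp/etc" && grep -ir "swap" * | LC_ALL=C sort)

echo "--- with -l ---";    echo "$names"
echo "--- without -l ---"; echo "$lines"
rm -rf "$tmp"
```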

More examples:
grep -R "search" .
Note the point .
this greps recursively down the given directory (in this case . being the directory you are currently in) for "search"
The . can easily be replaced with any directory including /
Note that this is all case sensitive.
Another example, if you have warhammer codex somewhere on your box:
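$ grep -rn 'arhamme' /
(no asterisks needed: grep looks for the string anywhere inside a line, and in a grep pattern * is not a shell-style wildcard)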
also grep alone, without -rn or such:
# grep "should" /etc/*
the above one as root, or with sudo, of course.

At times you need to search all your *.c and *.h files for a given expression
grep return *.[ch]
This will search all the .c and .h files for the name return.
What if we want to search all the files, and not only *.c and *.h files?
The grep command can do this, but searching binary files produces a lot of junk. Binary characters can do mean things to terminals, so we need a way to convert them into something printable.

So, to search a complete set of files (including binary ones), use the command:
grep return * | cat -v | cut -c 1-80
cat -v: Turns unprintable characters into something readable.
cut -c 1-80: Binary files have long "lines". This command trims them to 80 characters long for viewing and printing.

The find command is useful for going through a directory tree and locating files.
The grep command searches files for a given text string.
You can combine the two to create a system for searching a directory tree for a variable.
However always remember that grep -r (recursive) is FASTER than find/grep.


On the web you can find everywhere the following - quite useful and well written - interesting snippet about the command grep. I strongly suspect it must have been pilfered from "Ubuntu Unleashed" by Andrew and Paul Hudson.

Searches for a String in Input with grep

The grep command is an incredibly powerful search tool in the right hands. Grep processes any text, whether in files, or just in standard input.
The basic usage of grep is this:
grep "some text" *
That searches all files in the current directory (but not subdirectories) for the string some text and prints matching lines along with the name of the file. To enable recursive searching in subdirectories, use the -r parameter, like this:
grep -r "some text" *
Each time a string is matched within a file, the filename and the match are printed. If a file contains multiple matches, each of the matches is printed. You can alter this behavior with the -l parameter (lowercase L), which forces grep to print the name of each file that contains at least one match, without printing the matching text. If a file contains more than one match, it is still printed only once. Alternatively, the -c parameter prints each filename that was searched and includes the number of matches at the end, even if there were no matches. You have a lot of control when specifying the pattern to search for. You can, as we did previously, specify a simple string like some text, or you can invert that search by specifying the -v parameter. For example, this returns all the lines of the file myfile.txt that do not contain the word hello:
grep -v "hello" myfile.txt
You can also use regular expressions for your search term. For example, you can search myfile.txt for all references to cat, sat, or mat with this command:
grep "[cms]at" myfile.txt
Adding the -i parameter to that removes case sensitivity, matching Cat, CAT, MaT, and so on:
grep -i "[cms]at" myfile.txt
The output can also be controlled to some extent with the -n and --color parameters. The first tells grep to print the line number for each match, which is where it appears in the source file. The --color parameter tells grep to color the search terms in the output, which helps them stand out when among all the other text on the line. You choose which color you want using the GREP_COLOR environment variable: export GREP_COLOR=36 gives you cyan, and export GREP_COLOR=32 gives you lime green. This next example uses these two parameters to number and color all matches to the previous command:
grep -in --color "[cms]at" myfile.txt
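A quick way to check these switches for yourself is to run them against a four-line scratch file. This is only a sketch: myfile.txt is created on the spot with invented content, and -c is used to count lines instead of printing them.

```shell
# Throwaway copy of "myfile.txt" with made-up lines.
tmp=$(mktemp -d)
printf '%s\n' 'The cat sat' 'on the MAT' 'hello world' 'a bat flew' > "$tmp/myfile.txt"

no_hello=$(grep -vc "hello" "$tmp/myfile.txt")     # lines NOT containing hello
exact=$(grep -c "[cms]at" "$tmp/myfile.txt")       # cat, sat or mat, case-sensitive
nocase=$(grep -ic "[cms]at" "$tmp/myfile.txt")     # same, ignoring case
numbered=$(grep -in "[cms]at" "$tmp/myfile.txt")   # matching lines with line numbers

echo "without hello: $no_hello, [cms]at: $exact, ignoring case: $nocase"
echo "$numbered"
rm -rf "$tmp"
```

Note that the pattern is always quoted: an unquoted [cms]at could be expanded by the shell if a file called cat, mat or sat happened to sit in the current directory.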

Finding files

Find is a recursive command. This means that it will automatically descend into each directory that it runs into while searching.
A typical find command à la beagle is
find . -exec grep "printer" {} \;

If you need to find a filename, you might as well use locate, but find can cut that mustard as well...

Let's find all ogg files on our box. This command:
$ find . -name \*.ogg -printf %f\\n
lists the filenames of all .ogg files inside (and under) the current directory.
The 'double backslash n' is important; '\n' indicates the start of a new line.
The single backslash needs to be escaped by another one so the shell doesn't run amok on it.

Ditto for pdf files:
find . -name \*.pdf -printf %f\\n


finding filetypes & the important use of quotes
Let's try to find all c files on our drives.
find / -iname *.c
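no good if the directory you are in happens to contain more than one .c file: the shell expands *.c before find ever sees it, and we'll get the following: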
find: paths must precede expression Usage: find [path...] [expression]
ergo we better "redo" our command, but this time we'll put our target in quotes:
find / -iname '*.c'
or
find / -iname '*arhamme*'
everywhere on your box or
find . -iname '*arhamme*'
only downward your given directory
In fact every time we use wildcards, in order to prevent the shell from interpreting those very wildcards in ways we don't intend, we MUST surround what we're searching for with quotation marks.
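The effect is easy to reproduce in a scratch directory. In this sketch (the three .c files are empty and their names invented) the unquoted pattern is expanded by the shell and find chokes on it, while the quoted pattern reaches find intact:

```shell
# Scratch directory with two .c files on top and one deeper down.
tmp=$(mktemp -d)
mkdir "$tmp/src"
touch "$tmp/a.c" "$tmp/b.c" "$tmp/src/deep.c"

# Unquoted: the shell turns *.c into "a.c b.c" before find starts,
# so find sees a stray extra argument and gives up with an error.
if (cd "$tmp" && find . -name *.c >/dev/null 2>&1); then
    unquoted=ok
else
    unquoted=failed
fi

# Quoted: find receives the pattern itself and matches at every depth.
quoted=$(cd "$tmp" && find . -name '*.c' | LC_ALL=C sort)

echo "unquoted run: $unquoted"
echo "$quoted"
rm -rf "$tmp"
```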

If you hand the results of find to another command on its command line (as in the ls -ld $(find ...) examples below) and there are far too many files (technically, if the number of characters of output produced by "find" exceeds a built-in limit), this will fail and you'll need to use the "xargs" command to overcome that limitation.
find . -name '*.c' | xargs grep -n 'nde'
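One caveat with piping into xargs: file names containing spaces get chopped into pieces. A common way around it, sketched here under the assumption that your find and xargs support -print0 and -0 (the GNU and BSD ones do), is to separate the names with NUL bytes. The two sample files are invented:

```shell
# Two scratch source files, one with a space in its name.
tmp=$(mktemp -d)
printf 'int index = 0;\n' > "$tmp/plain.c"
printf 'int index2 = 1;\n' > "$tmp/with space.c"

# -print0 / -0 pass the names NUL-separated, so the space is harmless.
hits=$(find "$tmp" -name '*.c' -print0 | xargs -0 grep -l 'index' | wc -l)

echo "files containing 'index': $hits"
rm -rf "$tmp"
```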

but you can use ls for finding files of a given kind
ls -ld $(find -iname "*.doc" -o -iname "*.pdf")
ls -ld $(find -iname "*.pdf")
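check the differences (in a grep pattern * is not a wildcard: it means "zero or more of the preceding character", and at the very start of a pattern it is just a literal asterisk)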
find /home -iname '*.c' -exec grep -H 'nde' '{}' \;
find /home -iname '*.c' -exec grep -H 'index' '{}' \;
find /home -iname '*.c' -exec grep -H 'inde*' '{}' \;
find /home -iname '*.c' -exec grep -H '*nde*' '{}' \;
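The four patterns really do behave differently. Here is a sketch that counts how many lines of a small invented word list each one matches (words.txt and the count helper are hypothetical, introduced only for this illustration):

```shell
# Four sample "lines", one of them containing a literal asterisk.
tmp=$(mktemp -d)
printf '%s\n' 'index' 'ind' 'render' 'a*nde' > "$tmp/words.txt"

# Helper: number of lines matching the pattern given as first argument.
count() { grep -c -- "$1" "$tmp/words.txt"; }

n_nde=$(count 'nde')       # plain substring
n_index=$(count 'index')   # plain substring
n_inde=$(count 'inde*')    # "ind" followed by zero or more "e"
n_star=$(count '*nde*')    # literal "*", then "nd", then zero or more "e"

echo "nde: $n_nde  index: $n_index  inde*: $n_inde  *nde*: $n_star"
rm -rf "$tmp"
```

So 'inde*' also catches the bare "ind", and '*nde*' only catches the line that has a real asterisk in it, which is rarely what a shell-minded searcher expects.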

---- pdf ----------
find /home -iname '*.pdf' -exec grep -H 'index' '{}' \;
as you can see, this find 'matches' but then gets crazy characters
Another interesting approach:
find . -name 'the*' -type f -ls
you can of course redirect errors to /dev/null (when not running as root/sudo)
find / -iname 'Visual*' -type f 2>/dev/null
find -name '*.c' | xargs grep -n should 2>/dev/null


The wrong quotes problem
Some "mysteries"
why does this work:
find /etc  *.conf
while this does not?
find /etc  -iname `*.conf`
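because of the space? No! Because of the "weird" quotes/accents/apostrophes you didn't see/notice/realize when copying text from a book or a pdf file. Those are backticks, and the shell takes whatever sits between backticks as a command to run, handing that command's output to find instead of your pattern.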
Quotes and apostrophes are NOT the same. Try the same search again with correct quotes:
find /etc -iname '*.conf'
See? YOU OFTEN (unwittingly) FETCH WRONG QUOTES/APOSTROPHES/ACCENTS FROM PASTED TEXT.
Again, in the following example, the wrong quotes:
find . -name `*.conf`
won't work, while
find . -name "*.conf"
will find all the conf files you have got.

small files finding
(very useful at times)
find /home/babbaz/downloads/medical -size -5k -ls
(Adapt to your directory structure)

huge files finding
(very useful at times)
find . -size +300000k -print
Also handy, in order to just show top disk users:
du -s * | sort -k1,1rn | head
Also convenient, for debian packages sizes:
dpkg-query -W -f='${Installed-Size;10}\t${Package}\n' | sort -k1,1n
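The two -size searches above can be tried out safely on a pair of scratch files. This is a sketch with invented names and much smaller thresholds. It assumes a find that understands the k suffix (GNU and BSD find do), and keep in mind that find rounds sizes up to whole units before comparing:

```shell
# One file of about 1k and one of about 20k, filled with zero bytes.
tmp=$(mktemp -d)
head -c 1000  /dev/zero > "$tmp/small.bin"
head -c 20000 /dev/zero > "$tmp/big.bin"

small=$(find "$tmp" -type f -size -5k  -exec basename {} \;)   # under 5k
big=$(find "$tmp" -type f -size +10k -exec basename {} \;)     # over 10k

echo "small: $small"
echo "big:   $big"
rm -rf "$tmp"
```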
pdf files finding
sudo find /home/ -iname '*.pdf'
Woah: we all have much too many pdf files...


On the web you can find everywhere the following - quite useful and well written - interesting snippet about the command find. I strongly suspect it must have been pilfered from "Ubuntu Unleashed" by Andrew and Paul Hudson.


Finding Files by Searching with find

The find command is one of the darkest and least understood areas of Linux, but it is also one of the most powerful. Admittedly, the find command does not help itself by using X-style parameters. The UNIX standard is -c, -s, and so on, whereas the GNU standard is --dosomething, --mooby, and so forth. X-style parameters merge the two by having words preceded by only one dash. However, the biggest problem with find is that it has more options than most people can remember. It truly is capable of doing most things you could want.
The most basic usage is this:
find -name "*.txt"
That searches the current directory and all subdirectories for files that end in .txt. The previous search finds files ending in .txt but not .TXT, .Txt, or other case variations. To search without case sensitivity, use -iname instead of -name. You can optionally specify where the search should start before the -name parameter, like this:
find /home -name "*.txt"
Another useful test is -size, which lets you specify how big the files should be to match. You can specify your size in kilobytes and optionally also use + or - to specify greater than or less than. For example:
find /home -name "*.txt" -size 100k 
find /home -name "*.txt" -size +100k 
find /home -name "*.txt" -size -100k 
The first brings up files of exactly 100KB, the second only files larger than 100KB, and the last only files under 100KB. Moving on, the -user option enables you to specify the user who owns the files you are looking for. So, to search for all files in /home that end with .txt, are under 100KB, and are owned by user paul, you would use this:
find /home -name "*.txt" -size -100k -user paul
You can flip any of the conditions by specifying -not before them. For example, you can add a -not before -user paul to find matching files owned by everyone but paul:
find /home -name "*.txt" -size -100k -not -user paul
You can add as many -not parameters as you need, even using -not -not to cancel each other out! (Yes, that is pointless.) Keep in mind, though, that -not -size -100k is essentially equivalent to -size +100k, with the exception that the former will match files of exactly 100KB whereas the latter will not.
You can use -perm to specify which permissions a file should have for it to be matched. This is tricky, so read carefully. The permissions are specified in the same way as with the chmod command: u for user, g for group, o for others, r for read, w for write, and x for execute. However, before you give the permissions, you need to specify either a slash, a minus, or nothing at all. If you specify neither a slash nor a minus, the files must exactly match the mode you give. If you specify -, the files must match all the modes you specify. If you specify /, the files must match any of the modes you specify (older find versions used + for this; GNU find replaced it with /). Confused yet? The confusion can be cleared up with some examples. This next command finds all files that have permission o=r (readable for other users). Notice that if you remove the -name parameter, it is equivalent to * because all filenames are matched.
find /home -perm -o=r
Any files that have o=r set are returned from that query. Those files also might have u=rw and other permissions, but as long as they have o=r, they will match. This next query matches all files that have o=rw set:
find /home -perm -o=rw
However, that query does not match files that are o=r or o=w. To be matched, a file must be readable and writeable by other users. If you want to match readable or writeable (or both), you need to use /, like this:
find /home -perm /o=rw
Similarly, this next query matches files only that are readable by user, group, and others:
find /home -perm -ugo=r
Whereas this query matches files as long as they are readable by the user, or by the group, or by others, or by any combination of the three:
find /home -perm /ugo=r
If you use neither / nor -, you are specifying the exact permissions to search for. For example, the next query searches for files that are readable by user, group, and others but not writeable or executable by anyone:
find /home -perm ugo=r
You can be as specific as you need to be with the permissions. For example, this query finds all files that are readable for the user, group, and others and writeable by the user (note u+w: writing u=w here would wipe out the user's read bit again):
find /home -perm ugo=r,u+w
To find files that are not readable by others, use the -not condition, like this:
find /home -not -perm /o=r
Now, on to the most advanced aspect of the find command: the -exec parameter. This enables you to execute an external program each time a match is made, passing in the name of the matched file wherever you want it. This has very specific syntax: Your command and its parameters should follow immediately after -exec, terminated by \;. You can insert the filename match at any point using {} (an opening and a closing brace side by side). So, you can match all text files on the entire system (that is, searching recursively from / rather than from /home as in our previous examples) over 10KB, owned by paul, that are not readable by other users, and then use chmod to enable reading, like this:
find / -name "*.txt" -size +10k -user paul -not -perm /o=r -exec chmod o+r {} \;
When you type your own -exec parameters, be sure to include a space before \;. Otherwise, you might see an error such as missing argument to `-exec'. Do you see now why some people think the find command is scary? Many people learn just enough about find to be able to use it in a very basic way, but hopefully you will see how much it can do if you give it a chance.
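Since -perm is the part that trips most people up, here is a sketch you can run on three scratch files instead of on /home. It assumes a current GNU find, where the "any of these bits" form is spelled -perm /mode (older versions accepted +mode). The file names and the names helper are invented for the example:

```shell
# Three scratch files with known permissions.
tmp=$(mktemp -d)
touch "$tmp/private.txt" "$tmp/shared.txt" "$tmp/open.txt"
chmod 600 "$tmp/private.txt"   # u=rw, nothing for group and others
chmod 644 "$tmp/shared.txt"    # u=rw, go=r
chmod 666 "$tmp/open.txt"      # read and write for everybody

# Helper: run find with the given tests, print the base names on one line.
names() {
    find "$tmp" -type f "$@" -exec basename {} \; | LC_ALL=C sort | paste -sd ' ' -
}

all_bits=$(names -perm -o=rw)      # others have r AND w
any_bit=$(names -perm /o=rw)       # others have r OR w (or both)
exact=$(names -perm u=rw,go=r)     # mode is exactly 644
hidden=$(names -not -perm -o=r)    # not readable by others

# The -exec step: open up whatever is not yet readable by others.
find "$tmp" -type f -not -perm -o=r -exec chmod o+r {} \;
hidden_after=$(names -not -perm -o=r)

echo "all: $all_bits | any: $any_bit | exact: $exact | hidden: $hidden | hidden after chmod: $hidden_after"
rm -rf "$tmp"
```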

grepping inside found files
sudo find /home/ -iname '*.php' -exec grep -H 'search' '{}' \;
"Well, it basically looks for every file in the directory you gave it AND ALL SUBDIRECTORIES THEREOF, and then executes "grep -H 'search' FILENAME" on each one. -H = Print the filename for each match. So this will print out the name of each file that contains your string" ...however, don't try it with pdf files
sudo find /home/ -iname '*.pdf' -exec grep -H 'search' '{}' \;
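A side note on speed: terminated by \; the grep is started once for every single file. find also accepts + as terminator, which collects many file names into one grep call. The sketch below (scratch directory and php files are invented) just checks that both forms report the same thing:

```shell
# Two tiny scratch php files, only one of which contains the string.
tmp=$(mktemp -d)
mkdir "$tmp/site"
printf '<?php // search form ?>\n' > "$tmp/site/form.php"
printf '<?php // nothing ?>\n'     > "$tmp/site/index.php"

# One grep process per file ...
one_by_one=$(find "$tmp" -iname '*.php' -exec grep -H 'search' '{}' \;)
# ... versus one grep process for a whole batch of files.
batched=$(find "$tmp" -iname '*.php' -exec grep -H 'search' '{}' +)

echo "$one_by_one"
echo "$batched"
rm -rf "$tmp"
```

On a big tree the + form is usually noticeably quicker, simply because far fewer processes get started.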


Finding Files using timestamps

You might want to find recent files, or maybe grep through the last 3 days of activity...

Find comes into its own here: it can limit the scope of the files found according to timestamps.

Now, suppose you want to see what hidden files in your home directory did actually change in the last 3 days:
# find ~ -mtime -3 -name \.\*
TRY IT.
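If you know or suspect that something has changed much more recently than that, say in the last 13 minutes, and want to know what -exactly- happened, use the mmin argument (mind the minus sign: -13 means "less than 13 minutes ago", while a bare 13 means "exactly 13 minutes ago"):
# find ~ -mmin -13 -name \.\*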
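Be aware that access time-stamps (the ones that -amin and -atime look at) can be changed by merely reading: doing a 'ls' on a directory touches the access time of that directory, and reading a file touches the access time of the file. So if you poke around first and only afterwards ask find what was accessed in the last 13 minutes, **everything you have just looked at** will now be listed by find. The -mtime and -mmin searches above look at modification times and are not affected.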

To locate files that have been modified since some arbitrary date use a well known trick: create a date mark
# touch -d "13 may 2007 17:54:19" date_mark
# find . -newer date_mark 
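To find files whose status last changed before that date (unix does not traditionally record a creation time, so this is the closest you get), use the cnewer and negation conditions:
# find . \! -cnewer date_mark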
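To find a file which was modified yesterday (yesterday by the calendar, not just "24 to 48 hours ago"):
# find . -maxdepth 1 -daystart -mtime 1
(-maxdepth needs a number and must come before the tests; the 1 used here is only an example and keeps find from descending into subdirectories)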
The -daystart argument means the day starts at the actual beginning of the day. Not 24 hours ago.
This argument has meaning for the -amin, -atime, -cmin, -ctime, -mmin and -mtime options.
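The time tests are easier to trust once you have seen them work on files with known ages. The following sketch fakes those ages with touch -d and relative dates, which assumes GNU touch (and GNU find for -mmin). All file names are invented:

```shell
# Scratch files with faked modification times.
tmp=$(mktemp -d)
touch -d '10 days ago' "$tmp/old.txt"
touch -d '2 days ago'  "$tmp/.recent_hidden"
touch "$tmp/.just_now"

# Hidden files modified in the last 3 days.
last3days=$(find "$tmp" -type f -mtime -3 -name '.*' -exec basename {} \; \
            | LC_ALL=C sort | paste -sd ' ' -)
# Anything modified less than 13 minutes ago (note the minus sign).
last13min=$(find "$tmp" -type f -mmin -13 -exec basename {} \; | paste -sd ' ' -)

# The date mark trick: everything modified after the mark.
touch -d '5 days ago' "$tmp/date_mark"
newer=$(find "$tmp" -type f -newer "$tmp/date_mark" -exec basename {} \; \
        | LC_ALL=C sort | paste -sd ' ' -)

echo "last 3 days: $last3days"
echo "last 13 min: $last13min"
echo "newer than mark: $newer"
rm -rf "$tmp"
```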

FIND-Related searching tricks

list directories

for i in $(find . -type d); do echo "$i"; done
or
for i in $( ls -d */); do echo $i; done
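Both loops split their input on whitespace, so a directory called "two words" comes out as two separate items. A sketch of a more robust variant, which lets find hand the names over itself, follows. The scratch directories are invented and -mindepth assumes GNU or BSD find:

```shell
# Scratch tree with a space in one directory name.
tmp=$(mktemp -d)
mkdir -p "$tmp/plain" "$tmp/two words/inner"

# Robust: find passes each directory as one intact argument to the small sh loop.
count=$(find "$tmp" -mindepth 1 -type d \
        -exec sh -c 'for d; do printf "%s\n" "$d"; done' sh {} + | wc -l)

# Naive loop: word splitting breaks "two words" apart.
naive=$(for i in $(find "$tmp" -mindepth 1 -type d); do echo "$i"; done | wc -l)

echo "real directories: $count, items seen by the naive loop: $naive"
rm -rf "$tmp"
```

If all you want is the list itself, a plain find . -type d already prints one directory per line without any loop.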



© 3rd Millennium: [fravia+], all rights reserved, reversed, revealed and reviled