Title: Understanding FindFirst and FindNext
Understanding How FindFirst and FindNext Work
Reference Threads:
thread102-1418012: FindFirst/FindNext
thread102-1512569: findfirst - findnext
This document will be a discussion of how FindFirst/FindNext functions. It will include a description of the behavior of the Sysutils variant and how it functions.
As well, it will describe some ideas to utilize what is going on under the hood to
provide some simplicity and some performance in the code.
The basics of FindFirst/FindNext
Most of us should know the basics of how FindFirst and FindNext work. They are located
in the Sysutils unit. What follows is an example that will form a basis for argument
in the reference thread that I started. Since only minor changes will be required, I
will just post the program that I used.
CODE
{$APPTYPE CONSOLE}
program test1; uses sysutils;
{ test program #1 for FindFirst/FindNext FAQ by Glenn9999,
demonstrates testing for attributes }
procedure writeattr(inattr: longint);
{ tests file attributes and writes out a corresponding letter for it }
begin
{ list of file attributes }
if (inattr and faReadOnly) = faReadOnly then
write('R');
if (inattr and faHidden) = faHidden then
write('H');
if (inattr and faSysFile) = faSysFile then
write('S');
if (inattr and faVolumeID) = faVolumeID then
write('V');
if (inattr and faDirectory) = faDirectory then
write('D');
if (inattr and faArchive) = faArchive then
write('A');
end;
var
SR: TSearchRec;
retcode: integer;
begin
retcode := FindFirst('C:\*.*', faAnyFile, SR);
while retcode = 0 do
begin
write(SR.Name, '(':30-Length(SR.Name));
writeattr(Sr.Attr);
writeln(')');
retcode := FindNext(SR);
end;
FindClose(SR);
readln;
end.
As we can see, this program lists all files in "C:\*.*" along with the attributes attached to them. To distinguish between the attributes you test using bitmasks.
And
is the operator which is used. How this works in a technical level is
a subject for another FAQ, if necessary, but know that this is how it is done.
I selected C:\*.* because most of us can relate to what we will see (the contents of a relatively basic XP install). The output of the above program on my system is as follows:
CODE
AUTOEXEC.BAT (A)
BJPrinter (HD)
boot.ini (RHSA)
CONFIG.SYS (A)
Documents and Settings (D)
error.log (A)
FSC (D)
IO.SYS (RHSA)
MSDOS.SYS (RHSA)
NTDETECT.COM (RHSA)
ntldr (RHSA)
pagefile.sys (HSA)
Program Files (RD)
RECYCLER (HSD)
sqmdata00.sqm (HA)
sqmnoopt00.sqm (HA)
System Volume Information (HSD)
WINDOWS (D)
The issue of the first thread
To continue on to the idea that was mentioned in the first thread I started, I'll change the first program by putting faDirectory in the Attr spot of FindFirst. We can then
observe the issue which prompted my post:
CODE
AUTOEXEC.BAT (A)
CONFIG.SYS (A)
Documents and Settings (D)
error.log (A)
FSC (D)
Program Files (RD)
WINDOWS (D)
We are told in the manual for FindFirst:
Quote:
FindFirst searches the directory specified by Path for the first file that matches the file name implied by Path and the attributes specified by the Attr parameter.
..
Attributes can be combined by adding their constants or values. For example, to search for read-only and hidden files
in addition to normal files
, pass (faReadOnly + faHidden) the Attr parameter.
I didn't look at the manual too closely until just now, but the bolded part is what puts some confusion into things - and why the question was asked about in the thread. The specified attribute is included
IN ADDITION
to files with regular attributes.
Hence, there are 4 directories and 3 files specified.
Admittedly it seems to work according to documentation, but it is still definitely broken in functionality, since it doesn't behave in what I would call a predictable way. We have 3 "normal files" listed, which would be true, but only 4 of the 7 directories that we expected back returned. This is no doubt because the other directories had hidden attributes attached to them.
SFFFN
After I posted that, I experimented with writing something with a more predictable filter. In having a look into it, I found that the filtering function related to
Attr is written in the Sysutils implementation of FindFirst as opposed to anything to
do with a Windows API call. So I ended up writing a version which will do strict filtering, more out of interest. I shared it because I thought it would be something
interesting to share. You will see it at the end of that thread, but I will reproduce it here.
CODE
unit sfffn;
{ unit to encapsulate new FindFirst/FindNext/FindClose. Certain
optimizations were made compared to the sysutil versions - most
important being the fact that the functions in this unit will
*STRICTLY* filter attributes presented to it. So don't expect this
version to act like the Sysutils version.
Example: faDirectory in Attr only returns directories.
faHidden+faSysFile only returns files /directories
with BOTH attributes
This means there shouldn't be a need for post-filtering unless you
want to "OR" the attributes - like if you want to return either
faHidden or faSysFile. To that end, this should present an
improvement in that realm.
The relevant definitions from Sysutils were copied into this source file
so the added weight of Sysutils is not necessary if all you do is
search files. Certain helps were also gotten from the Sysutils source.
Functions in this unit:
SFindFirst; Like the FindFirst
SFindNext; Like the FindNext
SFindClose; Like the FindClose }
interface
uses windows;
const
faReadOnly = $00000001;
faHidden = $00000002;
faSysFile = $00000004;
{ faVolumeID = $00000008; faVolumeID seems not used in Windows anyway }
faDirectory = $00000010;
faArchive = $00000020;
faAnyFile = $0000003F;
type
TFileName = string;
LongRec = packed record
Lo, Hi: Word;
end;
TSearchRec = record
Time: Integer;
Size: Integer;
Attr: Integer;
Name: TFileName;
ExcludeAttr: Integer;
FindHandle: THandle;
FindData: TWin32FindData;
end;
function SFindFirst(const Path: string; Attr: Integer;
var F: TSearchRec): Integer;
function SFindNext(var F: TSearchRec): Integer;
procedure SFindClose(F: TSearchRec);
implementation
procedure move_rec(var F: TSearchRec);
var
LocalFileTime: TFileTime;
begin
with F do
begin
Size := FindData.nFileSizeLow;
Attr := FindData.dwFileAttributes;
Name := FindData.cFileName;
FileTimeToLocalFileTime(FindData.ftLastWriteTime, LocalFileTime);
FileTimeToDosDateTime(LocalFileTime, LongRec(F.Time).Hi,
LongRec(F.Time).Lo);
end;
end;
function SFindFirst(const Path: string; Attr: Integer;
var F: TSearchRec): Integer;
const
faSpecial = faHidden or faSysFile or faDirectory;
var
ret: boolean;
begin
if Attr = faAnyFile then
F.ExcludeAttr := not Attr and faSpecial
else
F.ExcludeAttr := Attr;
F.FindHandle := FindFirstFile(PChar(Path), F.FindData);
if F.FindHandle INVALID_HANDLE_VALUE then
begin
while (F.FindData.dwFileAttributes and F.ExcludeAttr) F.ExcludeAttr do
begin
ret := FindNextFile(F.FindHandle, F.FindData);
if (not ret) then
begin
Result := GetLastError;
exit;
end;
end;
move_rec(F);
Result := 0;
end
else
Result := GetLastError;
end;
function SFindNext(var F: TSearchRec): Integer;
var
ret: boolean;
begin
repeat
ret := FindNextFile(F.FindHandle, F.FindData);
if (not ret) then break;
until (F.FindData.dwFileAttributes and F.ExcludeAttr) = F.ExcludeAttr;
move_rec(F);
if ret then
Result := 0
else
Result := GetLastError;
end;
procedure SFindClose(F: TSearchRec);
{ copied from SysUtils source - nothing changes for this }
begin
if F.FindHandle INVALID_HANDLE_VALUE then
Windows.FindClose(F.FindHandle);
end;
end.
We can see that the FindFirst and FindNext call a Windows API function which does not
do any filtering. You will also see that this is true of the Sysutils FindFirst and
FindNext if you load up the copy of SYSUTILS.PAS that is likely with your Delphi install
(not reproducing it here for the obvious copyright concerns).
The main difference between this version and the Sysutils version is that it does not
return files with the faArchive attribute, and will return anything with an attribute (the two main logic changes). We will see that in recoding the program that was given
to use SFFFN.PAS
CODE
BJPrinter (HD)
Documents and Settings (D)
FSC (D)
Program Files (RD)
RECYCLER (HSD)
System Volume Information (HSD)
WINDOWS (D)
We now only have directories and we have all of them.
Post-Filtering
Now as it was pointed out, it's pretty hard to come up with a filter that would suit all circumstances. As was pointed out in the second thread, the only way to return things that are NOT faDirectory is to test for that (again an issue of bitmasks and how they work). So we're back to doing the post-filtering, as most people seem to generally do, and what seems best.
Now the question out of this comes whether we can improve upon FindFirst/FindNext in this respect and come up with something that is at the very least a little more functionally useful, and most certainly efficient. An attempt at that is posted
below:
CODE
unit bfffn;
{ unit to encapsulate new FindFirst/FindNext/FindClose.
This is a bare version which does not take attribute in any form and will
return all files - ideal for post-filtering algorithms. Performance
improvements can be made which departs from the TSearchRec structure to
TWin32FindData, but those will not be made within this unit.
The relevant definitions from Sysutils were copied into this source file
so the added weight of Sysutils is not necessary if all you do is
search files. Certain helps were also gotten from the Sysutils source.
Functions in this unit:
BFindFirst; Like the FindFirst
BFindNext; Like the FindNext
BFindClose; Like the FindClose }
interface
uses windows;
const
faReadOnly = $00000001;
faHidden = $00000002;
faSysFile = $00000004;
{ faVolumeID = $00000008; faVolumeID seems not used in Windows anyway }
faDirectory = $00000010;
faArchive = $00000020;
faAnyFile = $0000003F;
type
TFileName = string;
LongRec = packed record
Lo, Hi: Word;
end;
TSearchRec = record
Time: Integer;
Size: Integer;
Attr: Integer;
Name: TFileName;
ExcludeAttr: Integer;
FindHandle: THandle;
FindData: TWin32FindData;
end;
{ attribute not specified in BFindFirst! }
function BFindFirst(const Path: string; var F: TSearchRec): Integer;
function BFindNext(var F: TSearchRec): Integer;
procedure BFindClose(F: TSearchRec);
implementation
procedure move_rec(var F: TSearchRec);
var
LocalFileTime: TFileTime;
begin
with F do
begin
Size := FindData.nFileSizeLow;
Attr := FindData.dwFileAttributes;
Name := FindData.cFileName;
FileTimeToLocalFileTime(FindData.ftLastWriteTime, LocalFileTime);
FileTimeToDosDateTime(LocalFileTime, LongRec(F.Time).Hi,
LongRec(F.Time).Lo);
end;
end;
function BFindFirst(const Path: string; var F: TSearchRec): Integer;
begin
F.FindHandle := Windows.FindFirstFile(PChar(Path), F.FindData);
if F.FindHandle INVALID_HANDLE_VALUE then
begin
move_rec(F);
Result := 0;
end
else
Result := GetLastError;
end;
function BFindNext(var F: TSearchRec): Integer;
var
ret: boolean;
begin
ret := Windows.FindNextFile(F.FindHandle, F.FindData);
if ret then
begin
move_rec(F);
Result := 0;
end
else
Result := GetLastError;
end;
procedure BFindClose(F: TSearchRec);
{ copied from SysUtils source - nothing changes for this }
begin
if F.FindHandle INVALID_HANDLE_VALUE then
Windows.FindClose(F.FindHandle);
end;
end.
As you can see a lot of code was removed, which relates to finding files by a specific
bitmask specified within Attr. This also affords some control over the code, and
actually allows us to be specific about what is moved or not moved.
Some Testing Programs
I ran some trials against a testing program using Sysutils and this BFFFN unit, using this program.
CODE
{$APPTYPE CONSOLE}
{$DEFINE NODEBUG}
program test4; uses sysutils;
{ test program #4 for FindFirst/FindNext FAQ by Glenn9999,
allows for benchmarking of file seeking performance.
parses through all directories, optionally writing all the results}
type
DWord = Longint;
{ a timing routine }
function TimeMS: DWord;
stdcall; external 'winmm.dll' name 'timeGetTime';
procedure listdirectory(indir: string);
var
SR: TSearchRec;
retcode: integer;
begin
{$IFDEF DEBUG }
writeln(indir);
{$ENDIF}
retcode := FindFirst(indir + '\*.*', faAnyFile, SR);
while retcode = 0 do
begin
{$IFDEF DEBUG }
writeln(SR.Name);
{$ENDIF}
if (SR.Attr and FaDirectory) = faDirectory then
begin
if (SR.Name '.') and (SR.Name '..') then
ListDirectory(indir + '\' + SR.Name);
end;
retcode := FindNext(SR);
end;
Findclose(SR);
end;
var
STime: Longint;
numtimes, i: longint;
begin
write('Number of times to run: ');
readln(numtimes);
STime := TimeMS;
for i := 1 to numtimes do
listdirectory('C:');
STime := TimeMS - STime;
write('Program completed in ', STime, ' ms. Press Enter to Exit.');
readln;
end.
All were run against the drive five times(51,040 files; 6,555 folders).
Using Sysutils: 3485ms
Using BFFN as posted: 3469ms
Using BFFN moving only file name and attr: 3250ms
Using the Windows variants in the main program: 3157ms
Another benchmark - five times (15300 files, 1670 folders on WinME)
Using Sysutils: 1737ms
Using BFFN as posted: 1798ms
Using BFFN moving only file name and attr: 1560ms
Using the Windows variants in the main program: 1509ms
In profiling these programs, the Windows calls, specifically NTDLL.DLL and KERNEL32.DLL take up 98% of the execution time, there is not incredibly much to tune (of this time, FindClose calls took 11%, FindNext took 3%, and FindFirst 0.2%). Though, as you will notice, there is some room for improvement, especially if you scan a large number of files. This difference is from the existence of the code layer on top of the Windows calls in Sysutils.
For those interested in what the program looked like with the Windows variants in it,
here that is:
CODE
{$APPTYPE CONSOLE}
{$DEFINE NODEBUG}
program test7; uses windows;
{ test program #7 for FindFirst/FindNext FAQ by Glenn9999,
allows for benchmarking of file seeking performance.
parses through all directories, optionally writing all the results
change define to DEBUG in order to write the directory paths and names
TWin32FindDataA = record
dwFileAttributes: DWORD;
ftCreationTime: TFileTime;
ftLastAccessTime: TFileTime;
ftLastWriteTime: TFileTime;
nFileSizeHigh: DWORD;
nFileSizeLow: DWORD;
dwReserved0: DWORD;
dwReserved1: DWORD;
cFileName: array[0..MAX_PATH - 1] of AnsiChar;
cAlternateFileName: array[0..13] of AnsiChar;
end;
}
const
faDirectory = $00000010;
type
DWord = Longint;
{ a timing routine }
function TimeMS: DWord;
stdcall; external 'winmm.dll' name 'timeGetTime';
procedure listdirectory(indir: string);
var
SR: TWin32FindData;
retcode, FindHandle: integer;
begin
{$IFDEF DEBUG }
writeln(indir);
{$ENDIF}
FindHandle := Windows.FindFirstFile(PChar(Indir + '\*.*'), SR);
if FindHandle INVALID_HANDLE_VALUE then
Retcode := 0
else
Retcode := GetLastError;
while retcode = 0 do
begin
{$IFDEF DEBUG }
writeln(SR.cFileName);
{$ENDIF}
if (SR.dwFileAttributes and FaDirectory) = faDirectory then
begin
if (String(SR.cFileName) '.') and (String(SR.cFileName) '..') then
ListDirectory(indir + '\' + SR.cFileName);
end;
if Windows.FindNextFile(FindHandle, SR) then
Retcode := 0
else
Retcode := GetLastError;
end;
if FindHandle INVALID_HANDLE_VALUE then
Windows.FindClose(FindHandle);
end;
var
STime: Longint;
numtimes, i: longint;
begin
write('How many times to run: ');
readln(numtimes);
STime := TimeMS;
for i := 1 to numtimes do
listdirectory('C:');
STime := TimeMS - STime;
write('Program completed in ', STime, ' ms. Press Enter to Exit.');
readln;
end.