Title: Fast search for a string in a file
Question: How can I search for a string in a file, similar to what grep does?
Answer:
This tip is an example of a quick search in a file, reading it through a buffer to speed things up.
In short, the example looks for the first occurrence of a string in a file (if it occurs at all, of course) and reports its position
from the beginning of the file.
It is like searching with Pos(Substring, String), except that instead of searching in a string we can search in a
file of several gigabytes, with the advantage of not having to load it into memory all at once.
To achieve this, we load the file into a memory buffer (8 KB in the example), one chunk at a time.
The process is as follows:
-We load an 8 KB chunk and search inside it.
-If we find the string in that chunk, we stop and report where it was found.
-If the string was not found in that chunk, we repeat the process, that is, we load a new chunk and search again.
All this is very well, but we have overlooked a small detail: what happens if the string lies right across the boundary between two of those 8 KB
chunks? In that case neither chunk contains it whole, and our search would fail miserably :)
To avoid this, we rewind the stream a little just before reading the next chunk.
Specifically, we rewind as many bytes as the length of the string we are looking for; that way we are sure to find it even if it
straddles two chunks.
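The boundary problem and the rewind can be tried out on a small scale. What follows is only a minimal sketch, not the function from this tip: FindInStream, its ChunkSize parameter and the sample text are hypothetical names introduced here, it works on any TStream so that no file is needed, and it returns a 0-based offset. It assumes Delphi, or Free Pascal in Delphi mode.

```pascal
program ChunkOverlapSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

{ Hypothetical helper, not part of the original tip: returns the 0-based
  offset of the first occurrence of Needle in Source, or -1 if absent.
  ChunkSize is a parameter only so that the boundary case is easy to provoke. }
function FindInStream(Source: TStream; const Needle: AnsiString;
  ChunkSize: Integer): Int64;
var
  Buffer: AnsiString;
  ChunkStart: Int64;
  BytesRead, Hit: Integer;
begin
  Result := -1;
  // The rewind only makes progress if a chunk is longer than the needle.
  if (Needle = '') or (ChunkSize <= Length(Needle)) then
    Exit;
  Source.Position := 0;
  repeat
    ChunkStart := Source.Position;
    SetLength(Buffer, ChunkSize);
    BytesRead := Source.Read(Buffer[1], ChunkSize);
    SetLength(Buffer, BytesRead);  // keep only the bytes actually read
    Hit := Pos(Needle, Buffer);
    if Hit > 0 then
    begin
      Result := ChunkStart + Hit - 1;  // Pos is 1-based, the offset is 0-based
      Exit;
    end;
    // Step back by the needle length so that a match lying across the
    // chunk boundary is seen whole in the next chunk.
    if BytesRead = ChunkSize then
      Source.Position := Source.Position - Length(Needle);
  until BytesRead < ChunkSize;
end;

var
  Data: AnsiString;
  Stream: TMemoryStream;
begin
  Data := 'abcdefNEEDLEuvwxyz';
  Stream := TMemoryStream.Create;
  try
    Stream.WriteBuffer(Data[1], Length(Data));
    // With 8-byte chunks, 'NEEDLE' (offset 6) lies across the first boundary.
    WriteLn(FindInStream(Stream, 'NEEDLE', 8));
    WriteLn(FindInStream(Stream, 'absent', 8));
  finally
    Stream.Free;
  end;
end.
```

With 8-byte chunks the first chunk ends in the middle of NEEDLE, so the match is only found because of the rewind; the program should print 6 and then -1. Removing the rewind line makes the first search miss the string.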
It could easily be adapted to, for example, count how many times the string occurs in the file, replace one string with another, build
your own grep command, and so on.
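As an illustration of the first of those adaptations, here is a rough sketch of counting occurrences. It is an assumption about how one might do it, not code from the original tip: CountInStream and ChunkSize are hypothetical names, and the comparison uses Copy for clarity rather than speed. Note one deliberate difference: it steps back one byte less than the length of the string, so that no complete match fits inside the overlap and nothing is counted twice. Overlapping occurrences are counted separately.

```pascal
program CountOccurrencesSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

{ Hypothetical helper: counts every occurrence of Needle in Source,
  reading ChunkSize bytes at a time. }
function CountInStream(Source: TStream; const Needle: AnsiString;
  ChunkSize: Integer): Int64;
var
  Buffer: AnsiString;
  BytesRead, I, Overlap: Integer;
begin
  Result := 0;
  // An overlap shorter than the needle cannot hold a whole match,
  // so a match is never seen in two consecutive chunks.
  Overlap := Length(Needle) - 1;
  if (Needle = '') or (ChunkSize <= Overlap) then
    Exit;
  Source.Position := 0;
  repeat
    SetLength(Buffer, ChunkSize);
    BytesRead := Source.Read(Buffer[1], ChunkSize);
    SetLength(Buffer, BytesRead);  // keep only the bytes actually read
    // Try every start position at which a whole needle still fits.
    for I := 1 to BytesRead - Length(Needle) + 1 do
      if Copy(Buffer, I, Length(Needle)) = Needle then
        Inc(Result);
    // Step back so that a match lying across the boundary is seen whole.
    if BytesRead = ChunkSize then
      Source.Position := Source.Position - Overlap;
  until BytesRead < ChunkSize;
end;

var
  Data: AnsiString;
  Stream: TMemoryStream;
begin
  Data := 'abcabcabcabc';
  Stream := TMemoryStream.Create;
  try
    Stream.WriteBuffer(Data[1], Length(Data));
    // 5-byte chunks force several matches to lie across chunk boundaries.
    WriteLn(CountInStream(Stream, 'abc', 5));
  finally
    Stream.Free;
  end;
end.
```

For the sample text the program should print 4, one for each abc, even though 5-byte chunks cut through some of them.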
Here is the function together with an example call, all of it inside the OnClick handler of any TButton:
procedure TForm1.Button1Click(Sender: TObject);
var
  EncontradaEn : Int64;

  function BuscaStringEnFichero(const Fichero: string; const Cadena: AnsiString): Int64;
  { Looks for the first occurrence of the string 'Cadena' in the file 'Fichero'.
    Returns its position counted from the beginning of the file (the first
    byte is position 1), or -1 if the string was not found.
    Cadena must be shorter than CUANTOBUFFER.
    Radikal Q3 for Trucomania }
  const
    { We read the file 8 KB at a time }
    CUANTOBUFFER = 8192;
  var
    Corriente : TFileStream;
    { AnsiString, so that one character is one byte on Unicode versions of Delphi too }
    Almacen   : AnsiString;
    Donde     : integer;
    Leidos    : integer;
    Parar     : boolean;
    { Int64, so that positions beyond 2 GB do not overflow }
    Posicion  : Int64;
  begin
    SetLength(Almacen, CUANTOBUFFER);
    Corriente:=TFileStream.Create(Fichero, fmOpenRead or fmShareDenyWrite);
    Result:=-1;
    try
      Corriente.Seek(0, soFromBeginning);
      Parar:=FALSE;
      repeat
        { Remember where this chunk starts, before reading it }
        Posicion:=Corriente.Position;
        { Parar (stop) becomes TRUE when there is nothing more to read;
          it is also set below once the string has been found }
        Leidos:=Corriente.Read(Almacen[1], CUANTOBUFFER);
        Parar:=( Leidos < CUANTOBUFFER );
        { Search only the bytes just read: after a short final read the
          rest of Almacen still holds data from the previous chunk }
        Donde:=Pos(Cadena, Copy(Almacen, 1, Leidos));
        if Donde > 0 then begin
          Result:=Donde+Posicion;
          { We have found it, so we stop as well }
          Parar:=TRUE;
        end else begin
          { Rewind a little in case the string lies across two
            chunks of CUANTOBUFFER bytes }
          Corriente.Seek(-Length(Cadena), soFromCurrent);
        end;
      until Parar;
    finally
      Corriente.Free;
    end;
  end;

begin
  { Usage example: run the search }
  EncontradaEn:=BuscaStringEnFichero('c:\Ejemplo.txt', 'BuscaMe');
  { If the string was found we show where; otherwise we say that it was not }
  if EncontradaEn <> -1 then begin
    ShowMessage( 'String found at: ' + IntToStr( EncontradaEn ) );
  end else begin
    ShowMessage( 'Sorry, string not found in the file' );
  end;
end;
Radikal