Title: Removing HTML elements from text.
Question: I needed a set of procedures to remove HTML elements, such as unwanted links, from a text file and at the same time convert carriage returns to HTML paragraph markers, tabs to spaces, etc., to form a new web document.
Answer:
The following two procedures were implemented:
____________________________________________________
Code:
procedure TMainForm.LoadFileIntoList(TextFileName: String; AWebPage: TStringList; WithFilter: Boolean);
var
  CurrentFile : TStringList;
begin
  CurrentFile := TStringList.Create;
  try
    CurrentFile.LoadFromFile(TextFileName);
    if WithFilter then
      FilterHTML(CurrentFile, AWebPage)   // strip tags, convert cr and tab
    else
      AWebPage.AddStrings(CurrentFile);   // copy the file unchanged
  finally
    CurrentFile.Free;                     // released even if loading fails
  end;
end;

procedure TMainForm.FilterHTML(FilterInput, AWebPage: TStringList);
var
  i, j : LongInt;
  Txt, S : String;
begin
  FilterMemo.Lines.Clear;
  FilterMemo.Lines := FilterInput;
  Txt := FilterMemo.Lines.Text;   // fetch the text once; GetText allocates a new copy on every call
  j := Length(Txt);
  if j <> 0 then
  begin
    S := '';
    i := 1;                                  // Delphi strings are 1-based
    while i <= j do
    begin
      if Txt[i] = Char(VK_RETURN) then       // detect cr
        S := S + '<P>'
      else if Txt[i] = '<' then              // detect start of an HTML tag
      begin
        repeat
          inc(i);
        until (i > j) or (Txt[i] = '>');     // skip to the closing '>' (or end of text)
      end
      else if Txt[i] = Char(VK_TAB) then     // detect tab
        S := S + ' '
      else
        S := S + Txt[i];                     // just add text
      inc(i);
    end;
    AWebPage.Add(S);                         // add string to WebPage
  end
  else
    AWebPage.Add('No data entered into field.'); // no data in text file
end;
___________________________________________________
Implementation:
All you have to do is call:
LoadFileIntoList('filename.txt', WebPage, True);
Where 'filename.txt' is the name of the file you want to process.
"WebPage" is a TStringList that receives the result.
The boolean value filters the file when True and loads it unchanged when False.
NB: In this example a TMemo object called "FilterMemo" was placed on a form (not visible).
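If a hidden TMemo is not wanted, the same filtering can probably be done on a plain string. The following is only a minimal sketch under that assumption: FilterHTMLText is a hypothetical helper, not part of the procedures above, and unlike them it drops the line feed that follows each carriage return. A tag that is never closed causes the rest of the input to be discarded.

```pascal
program FilterSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper (an assumption, not from the original tip):
// removes anything between '<' and '>', turns each carriage return
// into '<P>' and each tab into a single space.
function FilterHTMLText(const Input: string): string;
var
  i: Integer;
  InTag: Boolean;
begin
  Result := '';
  InTag := False;
  for i := 1 to Length(Input) do
  begin
    if InTag then
    begin
      // inside a tag: skip everything up to and including '>'
      if Input[i] = '>' then
        InTag := False;
    end
    else
      case Input[i] of
        '<': InTag := True;              // start of a tag: begin skipping
        #13: Result := Result + '<P>';   // carriage return -> paragraph marker
        #10: ;                           // drop the line feed of a CR/LF pair
        #9:  Result := Result + ' ';     // tab -> single space
      else
        Result := Result + Input[i];     // ordinary text is copied as-is
      end;
  end;
end;

begin
  // prints: See this page<P>Next line
  WriteLn(FilterHTMLText('See <A HREF="x.htm">this</A> page'#13#10'Next'#9'line'));
end.
```

Inside FilterHTML, such a helper could be applied to FilterInput.Text directly, which would make the FilterMemo component unnecessary.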
___________________________________________________
Example:
WebPage := TStringList.Create;
try
  Screen.Cursor := crHourGlass;
  AddHeader(WebPage);
  WebPage.Add('Personal Details');
  LoadFileIntoList('filename.txt', WebPage, True);
  AddFooter(WebPage);
  WebPage.SaveToFile(HTMLFileName);   // save only if the page was built without an exception
finally
  WebPage.Free;
  Screen.Cursor := crDefault;
end;
___________________________________________________
Improvements:
If anybody has any suggested improvements or modifications then please contact me.
Thanks, Pete Davies,
14th August 2000