Title: Ultimate CSV parsing procedure
Question: An elegant CSV parsing procedure that handles all the variations
found in an exported CSV file.
Answer:
Delphi's "CommaText" property handles strings in system data format (SDF):
spaces and commas that are not enclosed in double quotation marks are treated
as delimiters. If the exact format of a CSV file is unknown, data imported
this way can corrupt the database.
Results of Delphi's "CommaText" property for the following strings:
"My Name Your Name"
Result: My Name Your Name
"My Name, Your Name"
Result: My Name, Your Name
My Name, Your Name
Result: My Name
Your Name
My Name Your Name
Result: My
Name
Your
Name
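The splitting behaviour described above can be checked with a few lines of
code. This is only a minimal sketch: SplitWithCommaText is a hypothetical
helper name, not part of the VCL, and it joins the resulting items with '|'
so that the split points are visible.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only needed when compiling with Free Pascal
uses
  SysUtils, Classes;

// Hypothetical helper for illustration: assigns S to CommaText and
// returns the resulting list items joined with '|'.
function SplitWithCommaText(const S: string): string;
var
  List: TStringList;
  i: Integer;
begin
  Result := '';
  List := TStringList.Create;
  try
    List.CommaText := S; // unquoted spaces and commas act as delimiters
    for i := 0 to List.Count - 1 do
    begin
      if i > 0 then
        Result := Result + '|';
      Result := Result + List[i];
    end;
  finally
    List.Free;
  end;
end;
```

With this helper, SplitWithCommaText('My Name Your Name') should return
'My|Name|Your|Name', while the quoted form '"My Name, Your Name"' comes back
as the single item 'My Name, Your Name'.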
An exported CSV file can contain any of these forms. Here is an elegant
procedure that handles all of them.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// ParseCSVLine by Perry Way
procedure ParseCSVLine(ALine: string; AFields: TStrings);
var
iState: cardinal;
i: cardinal;
iLength: cardinal;
sField: string;
begin
// determine length of input //
iLength := Length(ALine);
// exit if empty string //
if iLength = 0 then
Exit;
// initialize State Machine //
iState := 0;
sField := '';
// state machine //
for i := 1 to iLength do
begin
case iState of
//--------------------------------------------------------------//
0: // unknown //
begin
sField := '';
case ALine[i] of
'"': // start of embedded quotes or commas //
begin
iState := 2;
end;
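',': // empty field //
begin
AFields.Add(sField);
// a trailing comma leaves one more "null" field //
if (i = iLength) then
AFields.Add('');
end;
else
begin // start of regular field //
sField := ALine[i];
iState := 1;
// a single-character last field must be added here //
if (i = iLength) then
AFields.Add(sField);
end;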
end; // case ALine[i] of //
end; // 0 //
//--------------------------------------------------------------//
1: // continuation of regular field //
begin
case ALine[i] of
',': // end of regular field //
begin
AFields.Add(sField);
// if end of input, then we know there remains a "null" field //
if (i = iLength) then
begin
AFields.Add('');
end // (i = iLength) //
else
begin
iState := 0;
end;
end;
else // concatenate current char //
begin
sField := sField + ALine[i];
if (i = iLength) then // EOL //
AFields.Add(sField);
end;
end; // case ALine[i] //
end; // 1 //
//--------------------------------------------------------------//
2: // continuation of embedded quotes or commas //
begin
case ALine[i] of
'"': // end of embedded comma field or beginning of embedded quote //
begin
if (i < iLength) then
begin
if (ALine[i+1] = ',') then
begin // end of embedded comma field //
iState := 1
end
else
begin
iState := 3;
end;
end
else
begin // end of field since end of line //
AFields.Add(sField);
end;
end
else // concatenate current char //
begin
sField := sField + ALine[i];
end;
end; // case ALine[i] //
end; // 2 //
//--------------------------------------------------------------//
3: // beginning of embedded quote //
begin
case ALine[i] of
'"':
begin
sField := sField + ALine[i];
iState := 2;
end;
end; // case ALine[i] //
end; // 3 //
//--------------------------------------------------------------//
end; // case iState //
end; // for i := 1 to iLength //
end;
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
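For comparison, the quote-handling idea behind the procedure above can be
condensed into a single "inside quotes" flag. The following is a minimal
sketch, not Perry Way's code: SplitCSVLineSketch and SplitCSVLineJoined are
hypothetical names, and the sketch assumes that a doubled quote ("") inside a
quoted field stands for one literal quote and that spaces around commas
belong to the field.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // only needed when compiling with Free Pascal
uses
  SysUtils, Classes;

// Sketch only: splits ALine on commas that are outside double quotes.
procedure SplitCSVLineSketch(const ALine: string; AFields: TStrings);
var
  i, Len: Integer;
  InQuotes: Boolean;
  Field: string;
begin
  Len := Length(ALine);
  if Len = 0 then
    Exit;
  InQuotes := False;
  Field := '';
  i := 1;
  while i <= Len do
  begin
    if InQuotes then
    begin
      if ALine[i] <> '"' then
        Field := Field + ALine[i]
      else if (i < Len) and (ALine[i + 1] = '"') then
      begin
        Field := Field + '"'; // doubled quote becomes one literal quote
        Inc(i);               // skip the second quote of the pair
      end
      else
        InQuotes := False;    // closing quote
    end
    else
      case ALine[i] of
        '"': InQuotes := True; // opening quote
        ',':
          begin                // delimiter outside quotes ends the field
            AFields.Add(Field);
            Field := '';
          end;
      else
        Field := Field + ALine[i];
      end;
    Inc(i);
  end;
  AFields.Add(Field); // last field, empty if the line ends with a comma
end;

// Hypothetical helper: returns the parsed fields joined with '|'.
function SplitCSVLineJoined(const ALine: string): string;
var
  Fields: TStringList;
  i: Integer;
begin
  Result := '';
  Fields := TStringList.Create;
  try
    SplitCSVLineSketch(ALine, Fields);
    for i := 0 to Fields.Count - 1 do
    begin
      if i > 0 then
        Result := Result + '|';
      Result := Result + Fields[i];
    end;
  finally
    Fields.Free;
  end;
end;
```

Under these assumptions, SplitCSVLineJoined('"My Name, Your Name"') returns
the single field 'My Name, Your Name', and the unquoted
'My Name, Your Name' is split into 'My Name' and ' Your Name' (the leading
space is kept). Trimming fields, if wanted, is left to the caller.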