──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════ Info ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════

│ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║ ║

Info

Date/Time2026-03-10 00:00
Size1245 words
Reading6 min

Backlinks

VFAT Shortname Rules

▲

▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ █ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒ ▒

▼

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════ Deterministic DIR Output ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════

C:\RETRO\DOS>type determ~1.htm

Deterministic DIR Output

Treating DIR formatting as an API contract for automation

The story starts at 23:14 in a room with two beige towers, one half-dead fluorescent tube, and a whiteboard covered in hand-written file counts. We had one mission: rebuild a damaged release set from mixed backup disks and compare it against a known-good manifest.

On paper, that sounds easy. In practice, it meant parsing DIR output across different machines, each configured slightly differently, each with enough personality to make automation fail at the worst moment.

By 23:42 we had already hit the first trap. One machine produced DIR output that looked “normal” to a human and ambiguous to a parser. Another printed dates in a different shape. A third had enough local customization that every assumption broke after line three. We were not failing because DOS was bad. We were failing because we had not written down what “correct output” meant.

That night we stopped treating DIR as a casual command and started treating it as an API contract.

This article is that deep dive: why a deterministic profile matters, how to structure it, and how to parse it without superstitions.

The turning point: formatting is behavior

In modern systems, people accept that JSON schemas and protocol contracts are architecture. In DOS-era workflows, plain text command output played that same role. If your automation consumed command output, formatting was behavior.

Our internal profile locked one specific command shape:

DIR [drive:][path][filespec]
default long listing
no /W, no /B, no formatting switches
fixed US date/time rendering (MM-DD-YY, h:mma / h:mmp)

That scoping decision solved half the problem. We stopped pretending one parser should support every possible switch/locale and instead declared a strict operating envelope.

A canonical listing is worth hours of debugging

The profile included a canonical example and we used it as a fixture:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13


 Volume in drive C has no label
 Volume Serial Number is 3F2A-19C0

 Directory of C:\RETROLAB

AUTOEXEC BAT      1024 03-09-96  9:40a
BIN              <DIR> 03-08-96  4:15p
DOCS             <DIR> 03-07-96 11:02a
README   TXT       512 03-09-96 10:20a
SRC              <DIR> 03-07-96 11:04a
TOOLS    EXE     49152 03-09-96 10:21a
       3 File(s)      50,688 bytes
       3 Dir(s)  14,327,808 bytes free

Why include this in a spec? Because examples settle debates that prose cannot. When two engineers disagree, the fixture wins.

The 38-column row discipline

The core entry template was fixed-width:

1

%-8s %-3s  %8s %8s %6s

That yields exactly 38 columns:

columns 1..8: basename (left-aligned)
column 9: space
columns 10..12: extension (left-aligned)
columns 13..14: spaces
columns 15..22: size-or-dir (right-aligned)
column 23: space
columns 24..31: date
column 32: space
columns 33..38: time (right-aligned)

Once you adopt positional parsing instead of regex guesswork, DIR lines become boring in the best way.

Why this works even on noisy nights

Fixed-width parsing has practical advantages under pressure:

no locale-sensitive token splitting for date/time columns
no ambiguity between <DIR> and size values
deterministic handling of one-digit vs two-digit hour
easy visual validation during manual triage

At 01:12, when you are diffing listings by eye and caffeine alone, “column 15 starts the size field” is operational mercy.

Header and footer are part of the protocol

Many parsers only parse entry rows and ignore header/footer. That is a missed opportunity.

Our profile explicitly fixed header sequence:

volume label line (is <LABEL> or has no label)
serial line (XXXX-XXXX, uppercase hex)
blank line
Directory of <PATH>
blank line

And footer sequence:

file totals: %8u File(s) %11s bytes
dir/free totals: %8u Dir(s) %11s bytes free

Those two footer lines are not decoration. They are integrity checks. If parsed file count says 127 and footer says 126, stop and investigate before touching production disks.

Parsing algorithm we actually trusted

This is the skeleton we converged on in Turbo Pascal style:

type
  TDirEntry = record
    BaseName: string[8];
    Ext: string[3];
    IsDir: Boolean;
    SizeBytes: LongInt;
    DateText: string[8]; { MM-DD-YY }
    TimeText: string[6]; { right-aligned h:mma/h:mmp }
  end;

function TrimRight(const S: string): string;
var
  I: Integer;
begin
  I := Length(S);
  while (I > 0) and (S[I] = ' ') do Dec(I);
  TrimRight := Copy(S, 1, I);
end;

function ParseEntryLine(const L: string; var E: TDirEntry): Boolean;
var
  NameField, ExtField, SizeField, DateField, TimeField: string;
  Code: Integer;
begin
  ParseEntryLine := False;
  if Length(L) < 38 then Exit;

  NameField := Copy(L, 1, 8);
  ExtField  := Copy(L, 10, 3);
  SizeField := Copy(L, 15, 8);
  DateField := Copy(L, 24, 8);
  TimeField := Copy(L, 33, 6);

  E.BaseName := TrimRight(NameField);
  E.Ext      := TrimRight(ExtField);
  E.DateText := DateField;
  E.TimeText := TimeField;

  if TrimRight(SizeField) = '<DIR>' then
  begin
    E.IsDir := True;
    E.SizeBytes := 0;
  end
  else
  begin
    E.IsDir := False;
    Val(TrimRight(SizeField), E.SizeBytes, Code);
    if Code <> 0 then Exit;
  end;

  ParseEntryLine := True;
end;

This parser is intentionally plain. No hidden assumptions, no dynamic heuristics, no “best effort.” It either matches the profile or fails loudly.

Edge cases that must be explicit

The spec was strict about awkward but common cases:

extensionless files: extension field is blank (three spaces in raw row)
short names/exts: right-padding in fixed fields
directories always use <DIR> in size field
if value exceeds width, allow rightward overflow; never truncate data

The overflow rule is subtle and important. Truncation creates false data, and false data is worse than ugly formatting.

Counting bytes: grouped vs ungrouped is not random

A detail teams often forget:

entry SIZE_OR_DIR file size is decimal without grouping
footer byte totals are grouped with US commas in this profile

That split looks cosmetic until a parser accidentally strips commas in one place but not the other. If totals are part of your acceptance gate, normalize once and test it with fixtures.

The fictional incident that made it real

At 02:07 in our story, we finally had a clean parse on machine A. We ran the same process on machine B, then compared manifests. Everything looked perfect except one tiny mismatch: file count agreed, byte count differed by 1,024.

Old us would have guessed corruption and started copying disks again.

Spec-driven us inspected footer math first, then entry parse, then source listing capture. The issue was not corruption. One listing had accidentally included a generated staging file from a side directory because the operator typed a wildcard path incorrectly.

The deterministic header (Directory of ...) and footer checks caught it in minutes.

No drama. Just protocol discipline.

What this teaches beyond DOS

The strongest lesson is not “DOS output is neat.” The lesson is operational:

any text output consumed by tools should be treated as a contract
contracts need explicit scope and out-of-scope declarations
examples + field widths + sequence rules beat vague descriptions
integrity lines (counts/totals) should be first-class validation points

That mindset scales from floppy-era rebuild scripts to modern CI logs and telemetry processors.

Implementation checklist for your own parser

If you want a stable implementation from this profile:

enforce command profile (no unsupported switches)
parse header in strict order
parse entry rows by fixed columns, not token split
parse footer totals and cross-check with computed values
fail explicitly on profile deviation
keep canonical fixture listings in version control

This gives you deterministic behavior and debuggable failures.

Closing scene

At 03:18 we printed two manifests, one from recovered media and one from archive baseline, and compared them line by line. For the first time that night, we trusted the result.

Not because the room got quieter.
Not because the disks got newer.
Because the contract got clearer.

The old DOS prompt did what old prompts always do: it reflected our discipline back at us.