<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Filesystem on TurboVision</title>
    <link>https://turbovision.in6-addr.net/tags/filesystem/</link>
    <description>Recent content in Filesystem on TurboVision</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Tue, 21 Apr 2026 14:06:12 +0000</lastBuildDate>
    <atom:link href="https://turbovision.in6-addr.net/tags/filesystem/index.xml" rel="self" type="application/rss&#43;xml" />
    
    
    
    <item>
      <title>Deterministic DIR Output as an Operational Contract</title>
      <link>https://turbovision.in6-addr.net/retro/dos/deterministic-dir-output-as-an-operational-contract/</link>
      <pubDate>Tue, 10 Mar 2026 00:00:00 +0000</pubDate>
      <lastBuildDate>Tue, 10 Mar 2026 00:00:00 +0000</lastBuildDate>
      <guid>https://turbovision.in6-addr.net/retro/dos/deterministic-dir-output-as-an-operational-contract/</guid>
      <description>&lt;p&gt;The story starts at 23:14 in a room with two beige towers, one half-dead fluorescent tube, and a whiteboard covered in hand-written file counts. We had one mission: rebuild a damaged release set from mixed backup disks and compare it against a known-good manifest.&lt;/p&gt;
&lt;p&gt;On paper, that sounds easy. In practice, it meant parsing &lt;code&gt;DIR&lt;/code&gt; output across different machines, each configured slightly differently, each with enough personality to make automation fail at the worst moment.&lt;/p&gt;
&lt;p&gt;By 23:42 we had already hit the first trap. One machine produced &lt;code&gt;DIR&lt;/code&gt; output that looked &amp;ldquo;normal&amp;rdquo; to a human and ambiguous to a parser. Another printed dates in a different shape. A third had enough local customization that every assumption broke after line three. We were not failing because DOS was bad. We were failing because we had not written down what &amp;ldquo;correct output&amp;rdquo; meant.&lt;/p&gt;
&lt;p&gt;That night we stopped treating &lt;code&gt;DIR&lt;/code&gt; as a casual command and started treating it as an API contract.&lt;/p&gt;
&lt;p&gt;This article is that deep dive: why a deterministic profile matters, how to structure it, and how to parse it without superstitions.&lt;/p&gt;
&lt;h2 id=&#34;the-turning-point-formatting-is-behavior&#34;&gt;The turning point: formatting is behavior&lt;/h2&gt;
&lt;p&gt;In modern systems, people accept that JSON schemas and protocol contracts are architecture. In DOS-era workflows, plain text command output played that same role. If your automation consumed command output, formatting &lt;em&gt;was&lt;/em&gt; behavior.&lt;/p&gt;
&lt;p&gt;Our internal profile locked one specific command shape:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;DIR [drive:][path][filespec]&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;default long listing&lt;/li&gt;
&lt;li&gt;no &lt;code&gt;/W&lt;/code&gt;, no &lt;code&gt;/B&lt;/code&gt;, no formatting switches&lt;/li&gt;
&lt;li&gt;fixed US date/time rendering (&lt;code&gt;MM-DD-YY&lt;/code&gt;, &lt;code&gt;h:mma&lt;/code&gt; / &lt;code&gt;h:mmp&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That scoping decision solved half the problem. We stopped pretending one parser should support every possible switch/locale and instead declared a strict operating envelope.&lt;/p&gt;
&lt;h2 id=&#34;a-canonical-listing-is-worth-hours-of-debugging&#34;&gt;A canonical listing is worth hours of debugging&lt;/h2&gt;
&lt;p&gt;The profile included a canonical example and we used it as a fixture:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt; 1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 7
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 8
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt; 9
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;10
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;11
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;12
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;13
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt; Volume in drive C has no label
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt; Volume Serial Number is 3F2A-19C0
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt; Directory of C:\RETROLAB
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;AUTOEXEC BAT      1024 03-09-96  9:40a
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;BIN              &amp;lt;DIR&amp;gt; 03-08-96  4:15p
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;DOCS             &amp;lt;DIR&amp;gt; 03-07-96 11:02a
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;README   TXT       512 03-09-96 10:20a
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;SRC              &amp;lt;DIR&amp;gt; 03-07-96 11:04a
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;TOOLS    EXE     49152 03-09-96 10:21a
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;       3 File(s)      50,688 bytes
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;       3 Dir(s)  14,327,808 bytes free&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Why include this in a spec? Because examples settle debates that prose cannot. When two engineers disagree, the fixture wins.&lt;/p&gt;
&lt;h2 id=&#34;the-38-column-row-discipline&#34;&gt;The 38-column row discipline&lt;/h2&gt;
&lt;p&gt;The core entry template was fixed-width:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;%-8s %-3s  %8s %8s %6s&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;That yields exactly 38 columns:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;columns &lt;code&gt;1..8&lt;/code&gt;: basename (left-aligned)&lt;/li&gt;
&lt;li&gt;column &lt;code&gt;9&lt;/code&gt;: space&lt;/li&gt;
&lt;li&gt;columns &lt;code&gt;10..12&lt;/code&gt;: extension (left-aligned)&lt;/li&gt;
&lt;li&gt;columns &lt;code&gt;13..14&lt;/code&gt;: spaces&lt;/li&gt;
&lt;li&gt;columns &lt;code&gt;15..22&lt;/code&gt;: size-or-dir (right-aligned)&lt;/li&gt;
&lt;li&gt;column &lt;code&gt;23&lt;/code&gt;: space&lt;/li&gt;
&lt;li&gt;columns &lt;code&gt;24..31&lt;/code&gt;: date&lt;/li&gt;
&lt;li&gt;column &lt;code&gt;32&lt;/code&gt;: space&lt;/li&gt;
&lt;li&gt;columns &lt;code&gt;33..38&lt;/code&gt;: time (right-aligned)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once you adopt positional parsing instead of regex guesswork, &lt;code&gt;DIR&lt;/code&gt; lines become boring in the best way.&lt;/p&gt;
&lt;h2 id=&#34;why-this-works-even-on-noisy-nights&#34;&gt;Why this works even on noisy nights&lt;/h2&gt;
&lt;p&gt;Fixed-width parsing has practical advantages under pressure:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;no locale-sensitive token splitting for date/time columns&lt;/li&gt;
&lt;li&gt;no ambiguity between &lt;code&gt;&amp;lt;DIR&amp;gt;&lt;/code&gt; and size values&lt;/li&gt;
&lt;li&gt;deterministic handling of one-digit vs two-digit hour&lt;/li&gt;
&lt;li&gt;easy visual validation during manual triage&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;At 01:12, when you are diffing listings by eye and caffeine alone, &amp;ldquo;column 15 starts the size field&amp;rdquo; is operational mercy.&lt;/p&gt;
&lt;h2 id=&#34;header-and-footer-are-part-of-the-protocol&#34;&gt;Header and footer are part of the protocol&lt;/h2&gt;
&lt;p&gt;Many parsers only parse entry rows and ignore header/footer. That is a missed opportunity.&lt;/p&gt;
&lt;p&gt;Our profile explicitly fixed header sequence:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;volume label line (&lt;code&gt;is &amp;lt;LABEL&amp;gt;&lt;/code&gt; or &lt;code&gt;has no label&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;serial line (&lt;code&gt;XXXX-XXXX&lt;/code&gt;, uppercase hex)&lt;/li&gt;
&lt;li&gt;blank line&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Directory of &amp;lt;PATH&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;blank line&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;And footer sequence:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;file totals: &lt;code&gt;%8u File(s) %11s bytes&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;dir/free totals: &lt;code&gt;%8u Dir(s) %11s bytes free&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Those two footer lines are not decoration. They are integrity checks. If parsed file count says 127 and footer says 126, stop and investigate before touching production disks.&lt;/p&gt;
&lt;h2 id=&#34;parsing-algorithm-we-actually-trusted&#34;&gt;Parsing algorithm we actually trusted&lt;/h2&gt;
&lt;p&gt;This is the skeleton we converged on in Turbo Pascal style:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-pascal&#34; data-lang=&#34;pascal&#34;&gt;type
  TDirEntry = record
    BaseName: string[8];
    Ext: string[3];
    IsDir: Boolean;
    SizeBytes: LongInt;
    DateText: string[8]; { MM-DD-YY }
    TimeText: string[6]; { right-aligned h:mma/h:mmp }
  end;

function TrimRight(const S: string): string;
var
  I: Integer;
begin
  I := Length(S);
  while (I &amp;gt; 0) and (S[I] = &amp;#39; &amp;#39;) do Dec(I);
  TrimRight := Copy(S, 1, I);
end;

function ParseEntryLine(const L: string; var E: TDirEntry): Boolean;
var
  NameField, ExtField, SizeField, DateField, TimeField: string;
  Code: Integer;
begin
  ParseEntryLine := False;
  if Length(L) &amp;lt; 38 then Exit;

  NameField := Copy(L, 1, 8);
  ExtField  := Copy(L, 10, 3);
  SizeField := Copy(L, 15, 8);
  DateField := Copy(L, 24, 8);
  TimeField := Copy(L, 33, 6);

  E.BaseName := TrimRight(NameField);
  E.Ext      := TrimRight(ExtField);
  E.DateText := DateField;
  E.TimeText := TimeField;

  if TrimRight(SizeField) = &amp;#39;&amp;lt;DIR&amp;gt;&amp;#39; then
  begin
    E.IsDir := True;
    E.SizeBytes := 0;
  end
  else
  begin
    E.IsDir := False;
    Val(TrimRight(SizeField), E.SizeBytes, Code);
    if Code &amp;lt;&amp;gt; 0 then Exit;
  end;

  ParseEntryLine := True;
end;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This parser is intentionally plain. No hidden assumptions, no dynamic heuristics, no &amp;ldquo;best effort.&amp;rdquo; It either matches the profile or fails loudly.&lt;/p&gt;
&lt;h2 id=&#34;edge-cases-that-must-be-explicit&#34;&gt;Edge cases that must be explicit&lt;/h2&gt;
&lt;p&gt;The spec was strict about awkward but common cases:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;extensionless files: extension field is blank (three spaces in raw row)&lt;/li&gt;
&lt;li&gt;short names/exts: right-padding in fixed fields&lt;/li&gt;
&lt;li&gt;directories always use &lt;code&gt;&amp;lt;DIR&amp;gt;&lt;/code&gt; in size field&lt;/li&gt;
&lt;li&gt;if value exceeds width, allow rightward overflow; never truncate data&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The overflow rule is subtle and important. Truncation creates false data, and false data is worse than ugly formatting.&lt;/p&gt;
&lt;h2 id=&#34;counting-bytes-grouped-vs-ungrouped-is-not-random&#34;&gt;Counting bytes: grouped vs ungrouped is not random&lt;/h2&gt;
&lt;p&gt;A detail teams often forget:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;entry &lt;code&gt;SIZE_OR_DIR&lt;/code&gt; file size is decimal without grouping&lt;/li&gt;
&lt;li&gt;footer byte totals are grouped with US commas in this profile&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That split looks cosmetic until a parser accidentally strips commas in one place but not the other. If totals are part of your acceptance gate, normalize once and test it with fixtures.&lt;/p&gt;
&lt;h2 id=&#34;the-fictional-incident-that-made-it-real&#34;&gt;The fictional incident that made it real&lt;/h2&gt;
&lt;p&gt;At 02:07 in our story, we finally had a clean parse on machine A. We ran the same process on machine B, then compared manifests. Everything looked perfect except one tiny mismatch: file count agreed, byte count differed by 1,024.&lt;/p&gt;
&lt;p&gt;Old us would have guessed corruption and started copying disks again.&lt;/p&gt;
&lt;p&gt;Spec-driven us inspected footer math first, then entry parse, then source listing capture. The issue was not corruption. One listing had accidentally included a generated staging file from a side directory because the operator typed a wildcard path incorrectly.&lt;/p&gt;
&lt;p&gt;The deterministic header (&lt;code&gt;Directory of ...&lt;/code&gt;) and footer checks caught it in minutes.&lt;/p&gt;
&lt;p&gt;No drama. Just protocol discipline.&lt;/p&gt;
&lt;h2 id=&#34;what-this-teaches-beyond-dos&#34;&gt;What this teaches beyond DOS&lt;/h2&gt;
&lt;p&gt;The strongest lesson is not &amp;ldquo;DOS output is neat.&amp;rdquo; The lesson is operational:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;any text output consumed by tools should be treated as a contract&lt;/li&gt;
&lt;li&gt;contracts need explicit scope and out-of-scope declarations&lt;/li&gt;
&lt;li&gt;examples + field widths + sequence rules beat vague descriptions&lt;/li&gt;
&lt;li&gt;integrity lines (counts/totals) should be first-class validation points&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That mindset scales from floppy-era rebuild scripts to modern CI logs and telemetry processors.&lt;/p&gt;
&lt;h2 id=&#34;implementation-checklist-for-your-own-parser&#34;&gt;Implementation checklist for your own parser&lt;/h2&gt;
&lt;p&gt;If you want a stable implementation from this profile:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;enforce command profile (no unsupported switches)&lt;/li&gt;
&lt;li&gt;parse header in strict order&lt;/li&gt;
&lt;li&gt;parse entry rows by fixed columns, not token split&lt;/li&gt;
&lt;li&gt;parse footer totals and cross-check with computed values&lt;/li&gt;
&lt;li&gt;fail explicitly on profile deviation&lt;/li&gt;
&lt;li&gt;keep canonical fixture listings in version control&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This gives you deterministic behavior and debuggable failures.&lt;/p&gt;
&lt;h2 id=&#34;closing-scene&#34;&gt;Closing scene&lt;/h2&gt;
&lt;p&gt;At 03:18 we printed two manifests, one from recovered media and one from archive baseline, and compared them line by line. For the first time that night, we trusted the result.&lt;/p&gt;
&lt;p&gt;Not because the room got quieter.&lt;br&gt;
Not because the disks got newer.&lt;br&gt;
Because the contract got clearer.&lt;/p&gt;
&lt;p&gt;The old DOS prompt did what old prompts always do: it reflected our discipline back at us.&lt;/p&gt;
&lt;p&gt;Related reading:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/retro/dos/batch-file-wizardry/&#34;&gt;Batch File Wizardry&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/retro/dos/config-sys-as-architecture/&#34;&gt;CONFIG.SYS as Architecture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/retro/dos/interrupts-as-user-interface/&#34;&gt;Interrupts as User Interface&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>VFAT to 8.3: The Shortname Rules Behind the Curtain</title>
      <link>https://turbovision.in6-addr.net/retro/dos/vfat-to-8dot3-the-shortname-rules-behind-the-curtain/</link>
      <pubDate>Tue, 10 Mar 2026 00:00:00 +0000</pubDate>
      <lastBuildDate>Tue, 10 Mar 2026 00:00:00 +0000</lastBuildDate>
      <guid>https://turbovision.in6-addr.net/retro/dos/vfat-to-8dot3-the-shortname-rules-behind-the-curtain/</guid>
      <description>&lt;p&gt;The second story begins with a floppy label that looked harmless:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;RELEASE_NOTES_FINAL_REALLY_FINAL.TXT&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;By itself, that filename is only mildly annoying. Inside a mixed DOS/Windows pipeline in 1990s tooling, it can become a release blocker.&lt;/p&gt;
&lt;p&gt;Our fictional team learned this in one long weekend. The packager ran on a VFAT-capable machine. The installer verifier ran in a strict DOS context. The build ledger expected 8.3 aliases. Nobody had documented the shortname translation rules completely. Everybody thought they &amp;ldquo;basically knew&amp;rdquo; them.&lt;/p&gt;
&lt;p&gt;&amp;ldquo;Basically&amp;rdquo; lasted until the audit script flagged twelve mismatches that were all technically valid and operationally catastrophic.&lt;/p&gt;
&lt;p&gt;This article is the deep dive we wish we had then: how long names become 8.3 aliases, how collisions are resolved, and how to build deterministic tooling around those rules.&lt;/p&gt;
&lt;h2 id=&#34;first-principle-translate-per-path-component&#34;&gt;First principle: translate per path component&lt;/h2&gt;
&lt;p&gt;The most important rule is easy to miss:&lt;/p&gt;
&lt;p&gt;Translation happens per single path component, not on the full path string.&lt;/p&gt;
&lt;p&gt;That means each directory name and final file name is handled independently. If you normalize the entire path in one pass, you will eventually generate aliases that cannot exist in real directory contexts.&lt;/p&gt;
&lt;p&gt;In practical terms:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;C:\SRC\Very Long Directory\My Program Source.pas&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;is translated component-by-component, each with its own collision scope&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That &amp;ldquo;collision scope&amp;rdquo; phrase matters. Uniqueness is enforced within a directory, not globally across the volume.&lt;/p&gt;
&lt;h2 id=&#34;fast-path-already-legal-83-names-stay-as-is&#34;&gt;Fast path: already legal 8.3 names stay as-is&lt;/h2&gt;
&lt;p&gt;If the input is already a legal short name after OEM uppercase normalization, use that 8.3 form directly (uppercase).&lt;/p&gt;
&lt;p&gt;This avoids unnecessary alias churn and preserves operator expectations. A file named &lt;code&gt;CONFIG.SYS&lt;/code&gt; should not become something novel just because your algorithm always builds &lt;code&gt;FIRST6~1&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Teams that skip this rule create avoidable incompatibilities.&lt;/p&gt;
&lt;h2 id=&#34;when-alias-generation-is-required&#34;&gt;When alias generation is required&lt;/h2&gt;
&lt;p&gt;If the name is not already legal 8.3, generate alias candidates using strict steps.&lt;/p&gt;
&lt;p&gt;The baseline candidate pattern is:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;FIRST6~1.EXT&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;FIRST6&lt;/code&gt; is normalized/truncated basename prefix&lt;/li&gt;
&lt;li&gt;&lt;code&gt;~1&lt;/code&gt; is initial numeric tail&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.EXT&lt;/code&gt; is extension if one exists, truncated to max 3&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;No extension? Then no trailing dot/extension segment.&lt;/p&gt;
&lt;h2 id=&#34;dot-handling-is-where-most-bugs-hide&#34;&gt;Dot handling is where most bugs hide&lt;/h2&gt;
&lt;p&gt;Real filenames can contain multiple dots, trailing dots, and decorative punctuation. The rules must be explicit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;skip leading &lt;code&gt;.&lt;/code&gt; characters&lt;/li&gt;
&lt;li&gt;allow only one basename/extension separator in 8.3&lt;/li&gt;
&lt;li&gt;prefer the last dot that has valid non-space characters after it&lt;/li&gt;
&lt;li&gt;if name ends with a dot, ignore that trailing dot and use a previous valid dot if present&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is the difference between deterministic behavior and parser folklore.&lt;/p&gt;
&lt;p&gt;Example intuition:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;report.final.v3.txt&lt;/code&gt; -&amp;gt; extension source is last meaningful dot before &lt;code&gt;txt&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;archive.&lt;/code&gt; -&amp;gt; trailing dot is ignored; extension may end up empty&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;character-legality-and-normalization&#34;&gt;Character legality and normalization&lt;/h2&gt;
&lt;p&gt;Normalization from the spec includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;remove spaces and extra dots&lt;/li&gt;
&lt;li&gt;uppercase letters using active OEM code page semantics&lt;/li&gt;
&lt;li&gt;drop characters that are not representable/legal for short names&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Disallowed characters include control chars and:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;&amp;quot; * + , / : ; &amp;lt; = &amp;gt; ? [ \ ] |&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;A critical note from the rules:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Microsoft-documented NT behavior: &lt;code&gt;[ ] + = , : ;&lt;/code&gt; are replaced with &lt;code&gt;_&lt;/code&gt; during short-name generation&lt;/li&gt;
&lt;li&gt;other illegal/superfluous characters are removed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If your toolchain mixes &amp;ldquo;replace&amp;rdquo; and &amp;ldquo;remove&amp;rdquo; without policy, you will drift from expected aliases.&lt;/p&gt;
&lt;h2 id=&#34;collision-handling-is-an-algorithm-not-a-guess&#34;&gt;Collision handling is an algorithm, not a guess&lt;/h2&gt;
&lt;p&gt;The collision rule set is precise:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;try &lt;code&gt;~1&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;if occupied, try &lt;code&gt;~2&lt;/code&gt;, &lt;code&gt;~3&lt;/code&gt;, &amp;hellip;&lt;/li&gt;
&lt;li&gt;as tail digits grow, shrink basename prefix so total basename+tail stays within 8 chars&lt;/li&gt;
&lt;li&gt;continue until unique in the directory&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That means &lt;code&gt;~10&lt;/code&gt; and &lt;code&gt;~100&lt;/code&gt; are not formatting quirks. They force basename compaction decisions.&lt;/p&gt;
&lt;p&gt;A common implementation failure is forgetting to shrink prefix when suffix width grows. The result is invalid aliases or silent truncation.&lt;/p&gt;
&lt;h2 id=&#34;a-deterministic-translator-skeleton&#34;&gt;A deterministic translator skeleton&lt;/h2&gt;
&lt;p&gt;The following Pascal-style pseudocode keeps policy explicit:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-pascal&#34; data-lang=&#34;pascal&#34;&gt;function MakeShortAlias(const LongName: string; const Existing: TStringSet): string;
var
  BaseRaw, ExtRaw, BaseNorm, ExtNorm: string;
  Tail, PrefixLen: Integer;
  Candidate: string;
begin
  SplitUsingDotRules(LongName, BaseRaw, ExtRaw);   { skip leading dots, last valid dot logic }
  BaseNorm := NormalizeBase(BaseRaw);              { remove spaces/extra dots, uppercase, legality policy }
  ExtNorm  := NormalizeExt(ExtRaw);                { uppercase, legality policy, truncate to 3 }

  if IsLegal83(BaseNorm, ExtNorm) and (not Existing.Contains(Compose83(BaseNorm, ExtNorm))) then
  begin
    MakeShortAlias := Compose83(BaseNorm, ExtNorm);
    Exit;
  end;

  Tail := 1;
  repeat
    PrefixLen := 8 - (1 + Length(IntToStr(Tail))); { room for &amp;#34;~&amp;#34; + digits }
    if PrefixLen &amp;lt; 1 then PrefixLen := 1;
    Candidate := Copy(BaseNorm, 1, PrefixLen) + &amp;#39;~&amp;#39; + IntToStr(Tail);
    Candidate := Compose83(Candidate, ExtNorm);
    Inc(Tail);
  until not Existing.Contains(Candidate);

  MakeShortAlias := Candidate;
end;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This intentionally leaves &lt;code&gt;NormalizeBase&lt;/code&gt;, &lt;code&gt;NormalizeExt&lt;/code&gt;, and &lt;code&gt;SplitUsingDotRules&lt;/code&gt; as separate units so policy stays testable.&lt;/p&gt;
&lt;h2 id=&#34;table-driven-tests-beat-intuition&#34;&gt;Table-driven tests beat intuition&lt;/h2&gt;
&lt;p&gt;Our fictional team fixed its pipeline by building a test corpus, not by debating memory:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;5
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;6
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;7
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;Input Component                         Expected Shape
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;--------------------------------------  ------------------------
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;README.TXT                              README.TXT
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;very long filename.txt                  VERYLO~1.TXT
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;archive.final.build.log                 ARCHIV~1.LOG
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;...hiddenprofile                        HIDDEN~1
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;name with spaces.and.dots...cfg         NAMEWI~1.CFG&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The exact alias strings can vary with existing collisions and code-page/legality policy details, but the algorithmic behavior should not vary.&lt;/p&gt;
&lt;h2 id=&#34;why-this-matters-in-operational-pipelines&#34;&gt;Why this matters in operational pipelines&lt;/h2&gt;
&lt;p&gt;Shortname translation touches many workflows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;installer scripts that reference legacy names&lt;/li&gt;
&lt;li&gt;backup/restore verification against manifests&lt;/li&gt;
&lt;li&gt;cross-tool compatibility between VFAT-aware and strict 8.3 utilities&lt;/li&gt;
&lt;li&gt;reproducible release artifacts&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If alias generation is non-deterministic, two developers can build &amp;ldquo;same version&amp;rdquo; media with different effective filenames.&lt;/p&gt;
&lt;p&gt;That is a release-management nightmare.&lt;/p&gt;
&lt;h2 id=&#34;the-fictional-incident-response&#34;&gt;The fictional incident response&lt;/h2&gt;
&lt;p&gt;In our story, the break happened during a Friday packaging run. By Saturday morning, three teams had three conflicting explanations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;ldquo;the verifier is wrong&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Windows generated weird aliases&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;someone copied files manually&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;By Saturday afternoon, a tiny deterministic translator plus collision-aware tests cut through all three theories. The verifier was correct, alias generation differed between tools, and manual copies had introduced namespace collisions in one directory.&lt;/p&gt;
&lt;p&gt;Nobody needed blame. We needed rules.&lt;/p&gt;
&lt;h2 id=&#34;subtle-rule-legality-depends-on-oem-code-page&#34;&gt;Subtle rule: legality depends on OEM code page&lt;/h2&gt;
&lt;p&gt;One more important caveat from the spec:&lt;/p&gt;
&lt;p&gt;Uppercasing and character validity are evaluated in active OEM code page context.&lt;/p&gt;
&lt;p&gt;That means &amp;ldquo;works on my machine&amp;rdquo; can still fail if code-page assumptions differ. For strict reproducibility, pin the environment and test corpus together.&lt;/p&gt;
&lt;h2 id=&#34;practical-implementation-checklist&#34;&gt;Practical implementation checklist&lt;/h2&gt;
&lt;p&gt;For a robust translator:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;process one path component at a time&lt;/li&gt;
&lt;li&gt;implement legal-8.3 fast path first&lt;/li&gt;
&lt;li&gt;codify dot-selection/trailing-dot behavior exactly&lt;/li&gt;
&lt;li&gt;separate remove-vs-replace character policy clearly&lt;/li&gt;
&lt;li&gt;enforce extension max length 3&lt;/li&gt;
&lt;li&gt;implement collision tail growth with dynamic prefix shrink&lt;/li&gt;
&lt;li&gt;ship fixture tests with occupied-directory scenarios&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That last point is non-negotiable. Most alias bugs only appear under collision pressure.&lt;/p&gt;
&lt;h2 id=&#34;closing-scene&#34;&gt;Closing scene&lt;/h2&gt;
&lt;p&gt;Our weekend story ends around 01:03 on Sunday. The final verification pass prints green across every directory. The whiteboard still looks chaotic. The room still smells like old plastic and instant coffee. But now the behavior is explainable.&lt;/p&gt;
&lt;p&gt;Long names can still be expressive. Short names can still be strict. The bridge between them does not need magic. It needs documented rules and testable translation.&lt;/p&gt;
&lt;p&gt;In DOS-era engineering, that is usually the whole game: reduce mystery, increase repeatability, and let simple tools carry serious work.&lt;/p&gt;
&lt;p&gt;Related reading:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/retro/dos/deterministic-dir-output-as-an-operational-contract/&#34;&gt;Deterministic DIR Output as an Operational Contract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/retro/dos/batch-file-wizardry/&#34;&gt;Batch File Wizardry&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/retro/dos/tp/turbo-pascal-units-as-architecture/&#34;&gt;Turbo Pascal Units as Architecture&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
  </channel>
</rss>
