<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Naming on TurboVision</title>
    <link>https://turbovision.in6-addr.net/tags/naming/</link>
    <description>Recent content in Naming on TurboVision</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Tue, 21 Apr 2026 14:06:12 +0000</lastBuildDate>
    <atom:link href="https://turbovision.in6-addr.net/tags/naming/index.xml" rel="self" type="application/rss&#43;xml" />
    
    
    
    <item>
      <title>VFAT to 8.3: The Shortname Rules Behind the Curtain</title>
      <link>https://turbovision.in6-addr.net/retro/dos/vfat-to-8dot3-the-shortname-rules-behind-the-curtain/</link>
      <pubDate>Tue, 10 Mar 2026 00:00:00 +0000</pubDate>
      <lastBuildDate>Tue, 10 Mar 2026 00:00:00 +0000</lastBuildDate>
      <guid>https://turbovision.in6-addr.net/retro/dos/vfat-to-8dot3-the-shortname-rules-behind-the-curtain/</guid>
      <description>&lt;p&gt;The second story begins with a floppy label that looked harmless:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;RELEASE_NOTES_FINAL_REALLY_FINAL.TXT&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;By itself, that filename is only mildly annoying. Inside a mixed DOS/Windows pipeline built on 1990s tooling, it can become a release blocker.&lt;/p&gt;
&lt;p&gt;Our fictional team learned this in one long weekend. The packager ran on a VFAT-capable machine. The installer verifier ran in a strict DOS context. The build ledger expected 8.3 aliases. Nobody had documented the shortname translation rules completely. Everybody thought they &amp;ldquo;basically knew&amp;rdquo; them.&lt;/p&gt;
&lt;p&gt;&amp;ldquo;Basically&amp;rdquo; lasted until the audit script flagged twelve mismatches that were all technically valid and operationally catastrophic.&lt;/p&gt;
&lt;p&gt;This article is the deep dive we wish we had then: how long names become 8.3 aliases, how collisions are resolved, and how to build deterministic tooling around those rules.&lt;/p&gt;
&lt;h2 id=&#34;first-principle-translate-per-path-component&#34;&gt;First principle: translate per path component&lt;/h2&gt;
&lt;p&gt;The most important rule is easy to miss:&lt;/p&gt;
&lt;p&gt;Translation happens per single path component, not on the full path string.&lt;/p&gt;
&lt;p&gt;That means each directory name and final file name is handled independently. If you normalize the entire path in one pass, you will eventually generate aliases that cannot exist in real directory contexts.&lt;/p&gt;
&lt;p&gt;In practical terms:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;C:\SRC\Very Long Directory\My Program Source.pas&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;is translated component-by-component, each with its own collision scope&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That &amp;ldquo;collision scope&amp;rdquo; phrase matters. Uniqueness is enforced within a directory, not globally across the volume.&lt;/p&gt;
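&lt;p&gt;As a rough illustration of per-component handling, the sketch below walks a DOS-style path and pairs every component with the directory that forms its collision scope. It is written in Python purely for brevity; &lt;code&gt;components_with_scope&lt;/code&gt; and the &lt;code&gt;taken&lt;/code&gt; map are hypothetical names, and no alias is generated yet.&lt;/p&gt;

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def components_with_scope(path):
    """Yield (parent_directory, component) pairs for a DOS-style path."""
    p = PureWindowsPath(path)
    parent = p.anchor  # 'C:\\' for absolute paths, '' for relative ones
    names = p.parts[1:] if p.anchor else p.parts
    for name in names:
        yield parent, name
        parent = str(PureWindowsPath(parent) / name)


# One set of occupied names per directory: uniqueness is checked per
# directory, never across the whole volume.
taken = defaultdict(set)
example = r"C:\SRC\Very Long Directory\My Program Source.pas"
for parent, name in components_with_scope(example):
    taken[parent].add(name)  # a real translator would store the alias here
```

&lt;p&gt;Each key of &lt;code&gt;taken&lt;/code&gt; is one directory, so a translator built on top of this would consult only the set that belongs to the parent of the component it is currently naming.&lt;/p&gt;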
&lt;h2 id=&#34;fast-path-already-legal-83-names-stay-as-is&#34;&gt;Fast path: already legal 8.3 names stay as-is&lt;/h2&gt;
&lt;p&gt;If the input is already a legal short name after OEM uppercase normalization, use that 8.3 form directly (uppercase).&lt;/p&gt;
&lt;p&gt;This avoids unnecessary alias churn and preserves operator expectations. A file named &lt;code&gt;CONFIG.SYS&lt;/code&gt; should not become something novel just because your algorithm always builds &lt;code&gt;FIRST6~1&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Teams that skip this rule create avoidable incompatibilities.&lt;/p&gt;
&lt;h2 id=&#34;when-alias-generation-is-required&#34;&gt;When alias generation is required&lt;/h2&gt;
&lt;p&gt;If the name is not already legal 8.3, generate alias candidates using strict steps.&lt;/p&gt;
&lt;p&gt;The baseline candidate pattern is:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;FIRST6~1.EXT&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;FIRST6&lt;/code&gt; is the normalized basename, truncated to its first six characters&lt;/li&gt;
&lt;li&gt;&lt;code&gt;~1&lt;/code&gt; is the initial numeric tail&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.EXT&lt;/code&gt; is the extension, if one exists, truncated to at most three characters&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;No extension? Then no trailing dot/extension segment.&lt;/p&gt;
&lt;h2 id=&#34;dot-handling-is-where-most-bugs-hide&#34;&gt;Dot handling is where most bugs hide&lt;/h2&gt;
&lt;p&gt;Real filenames can contain multiple dots, trailing dots, and decorative punctuation. The rules must be explicit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;skip leading &lt;code&gt;.&lt;/code&gt; characters&lt;/li&gt;
&lt;li&gt;allow only one basename/extension separator in 8.3&lt;/li&gt;
&lt;li&gt;prefer the last dot that has valid non-space characters after it&lt;/li&gt;
&lt;li&gt;if the name ends with a dot, ignore that trailing dot and use an earlier valid dot if one is present&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is the difference between deterministic behavior and parser folklore.&lt;/p&gt;
&lt;p&gt;Example intuition:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;report.final.v3.txt&lt;/code&gt; -&amp;gt; the separator is the last dot, so the extension source is &lt;code&gt;txt&lt;/code&gt; and everything before it feeds the basename&lt;/li&gt;
&lt;li&gt;&lt;code&gt;archive.&lt;/code&gt; -&amp;gt; the trailing dot is ignored; with no earlier dot, the extension is empty&lt;/li&gt;
&lt;/ul&gt;
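&lt;p&gt;The dot rules above can be sketched in a few lines. This is a minimal Python illustration, not a reference implementation; &lt;code&gt;split_using_dot_rules&lt;/code&gt; borrows its name from the translator skeleton further down, and treating trailing spaces like trailing dots is an assumption of this sketch.&lt;/p&gt;

```python
def split_using_dot_rules(name):
    """Split one path component into raw (base, ext) parts.

    Leading dots are skipped, trailing dots and spaces are ignored, and the
    separator is the last dot that still has non-space characters after it.
    """
    trimmed = name.lstrip(".").rstrip(". ")
    base, dot, ext = trimmed.rpartition(".")
    if not dot:
        # No usable separator: the whole name is basename, extension is empty.
        return trimmed, ""
    return base, ext


print(split_using_dot_rules("report.final.v3.txt"))  # ('report.final.v3', 'txt')
print(split_using_dot_rules("archive."))             # ('archive', '')
```

&lt;p&gt;Extra dots stay inside the raw basename on purpose. Removing them is a normalization concern, which keeps the split step small enough to test on its own.&lt;/p&gt;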
&lt;h2 id=&#34;character-legality-and-normalization&#34;&gt;Character legality and normalization&lt;/h2&gt;
&lt;p&gt;Normalization from the spec includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;remove spaces and extra dots&lt;/li&gt;
&lt;li&gt;uppercase letters using active OEM code page semantics&lt;/li&gt;
&lt;li&gt;drop characters that are not representable/legal for short names&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Disallowed characters include control chars and:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;&amp;quot; * + , / : ; &amp;lt; = &amp;gt; ? [ \ ] |&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;A critical note from the rules:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Microsoft-documented NT behavior: &lt;code&gt;[ ] + = , : ;&lt;/code&gt; are replaced with &lt;code&gt;_&lt;/code&gt; during short-name generation&lt;/li&gt;
&lt;li&gt;other illegal/superfluous characters are removed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If your toolchain mixes &amp;ldquo;replace&amp;rdquo; and &amp;ldquo;remove&amp;rdquo; without policy, you will drift from expected aliases.&lt;/p&gt;
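&lt;p&gt;One way to keep that policy explicit is to hold the two character sets side by side. The Python sketch below is illustrative only: &lt;code&gt;normalize_part&lt;/code&gt;, &lt;code&gt;REPLACED&lt;/code&gt;, and &lt;code&gt;REMOVED&lt;/code&gt; are hypothetical names, and &lt;code&gt;str.upper()&lt;/code&gt; merely stands in for uppercasing under the active OEM code page.&lt;/p&gt;

```python
# Characters replaced with "_" (the replace policy described above).
REPLACED = set("[]+=,:;")

# Characters dropped outright: the rest of the disallowed list, plus spaces
# and dots. "\x3c" is the less-than sign, spelled as an escape so the listing
# stays safe to embed in markup.
REMOVED = set('"*/\\\x3c>?| .')


def normalize_part(raw):
    """Apply the replace/remove policy to one raw basename or extension."""
    out = []
    for ch in raw.upper():  # assumption: stand-in for OEM uppercasing
        if ch in REPLACED:
            out.append("_")
        elif ch in REMOVED or 32 > ord(ch):  # control characters are dropped
            continue
        else:
            out.append(ch)
    return "".join(out)


print(normalize_part("a+b[1]"))  # A_B_1_
print(normalize_part("what?"))   # WHAT
```

&lt;p&gt;Truncation is deliberately left out here, so the same function can serve both the basename and the extension before their different length limits are applied.&lt;/p&gt;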
&lt;h2 id=&#34;collision-handling-is-an-algorithm-not-a-guess&#34;&gt;Collision handling is an algorithm, not a guess&lt;/h2&gt;
&lt;p&gt;The collision rule set is precise:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;try &lt;code&gt;~1&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;if occupied, try &lt;code&gt;~2&lt;/code&gt;, &lt;code&gt;~3&lt;/code&gt;, &amp;hellip;&lt;/li&gt;
&lt;li&gt;as tail digits grow, shrink basename prefix so total basename+tail stays within 8 chars&lt;/li&gt;
&lt;li&gt;continue until unique in the directory&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That means &lt;code&gt;~10&lt;/code&gt; and &lt;code&gt;~100&lt;/code&gt; are not formatting quirks. They force basename compaction decisions.&lt;/p&gt;
&lt;p&gt;A common implementation failure is forgetting to shrink prefix when suffix width grows. The result is invalid aliases or silent truncation.&lt;/p&gt;
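&lt;p&gt;The prefix arithmetic is easier to see in code. The following Python sketch implements only the collision rule as stated above; &lt;code&gt;first_free_alias&lt;/code&gt; and &lt;code&gt;compose_83&lt;/code&gt; are hypothetical helpers, and the inputs are assumed to be normalized already.&lt;/p&gt;

```python
def compose_83(base, ext):
    """Join basename and extension; no dot when the extension is empty."""
    return base + "." + ext if ext else base


def first_free_alias(base_norm, ext_norm, existing, max_tail=999999):
    """Return the first FIRSTn~k.EXT candidate not present in `existing`."""
    for tail in range(1, max_tail + 1):
        suffix = "~" + str(tail)
        prefix_len = 8 - len(suffix)  # 6 for ~1..~9, 5 for ~10..~99, ...
        candidate = compose_83(base_norm[:prefix_len] + suffix, ext_norm[:3])
        if candidate not in existing:
            return candidate
    raise RuntimeError("no free short name left in this directory")


occupied = {"VERYLO~%d.TXT" % n for n in range(1, 10)}
print(first_free_alias("VERYLONGFILENAME", "TXT", set()))     # VERYLO~1.TXT
print(first_free_alias("VERYLONGFILENAME", "TXT", occupied))  # VERYL~10.TXT
```

&lt;p&gt;The upper bound on the tail is a choice of this sketch: it keeps at least one basename character in front of the tilde and guarantees that the loop terminates.&lt;/p&gt;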
&lt;h2 id=&#34;a-deterministic-translator-skeleton&#34;&gt;A deterministic translator skeleton&lt;/h2&gt;
&lt;p&gt;The following Pascal-style pseudocode keeps policy explicit:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-pascal&#34; data-lang=&#34;pascal&#34;&gt;function MakeShortAlias(const LongName: string; const Existing: TStringSet): string;
var
  BaseRaw, ExtRaw, BaseNorm, ExtNorm: string;
  Tail, PrefixLen: Integer;
  Candidate: string;
begin
  SplitUsingDotRules(LongName, BaseRaw, ExtRaw);   { skip leading dots, last valid dot logic }
  BaseNorm := NormalizeBase(BaseRaw);              { remove spaces/extra dots, uppercase, legality policy }
  ExtNorm  := NormalizeExt(ExtRaw);                { uppercase, legality policy, truncate to 3 }

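  { Fast path only when the input, merely uppercased, is already the legal 8.3 form. }
  { OemUpperCase is a placeholder for uppercasing under the active OEM code page. }
  if IsLegal83(BaseNorm, ExtNorm) and (Compose83(BaseNorm, ExtNorm) = OemUpperCase(LongName))
     and (not Existing.Contains(Compose83(BaseNorm, ExtNorm))) then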
  begin
    MakeShortAlias := Compose83(BaseNorm, ExtNorm);
    Exit;
  end;

  Tail := 1;
  repeat
    PrefixLen := 8 - (1 + Length(IntToStr(Tail))); { room for &amp;#34;~&amp;#34; + digits }
    if PrefixLen &amp;lt; 1 then PrefixLen := 1;
    Candidate := Copy(BaseNorm, 1, PrefixLen) + &amp;#39;~&amp;#39; + IntToStr(Tail);
    Candidate := Compose83(Candidate, ExtNorm);
    Inc(Tail);
  until not Existing.Contains(Candidate);

  MakeShortAlias := Candidate;
end;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This intentionally leaves &lt;code&gt;NormalizeBase&lt;/code&gt;, &lt;code&gt;NormalizeExt&lt;/code&gt;, and &lt;code&gt;SplitUsingDotRules&lt;/code&gt; as separate units so policy stays testable.&lt;/p&gt;
&lt;h2 id=&#34;table-driven-tests-beat-intuition&#34;&gt;Table-driven tests beat intuition&lt;/h2&gt;
&lt;p&gt;Our fictional team fixed its pipeline by building a test corpus, not by debating memory:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;Input Component                         Expected Shape
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;--------------------------------------  ------------------------
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;README.TXT                              README.TXT
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;very long filename.txt                  VERYLO~1.TXT
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;archive.final.build.log                 ARCHIV~1.LOG
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;...hiddenprofile                        HIDDEN~1
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;name with spaces.and.dots...cfg         NAMEWI~1.CFG&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;&lt;p&gt;The exact alias strings can vary with existing collisions and code-page/legality policy details, but the algorithmic behavior should not vary.&lt;/p&gt;
&lt;h2 id=&#34;why-this-matters-in-operational-pipelines&#34;&gt;Why this matters in operational pipelines&lt;/h2&gt;
&lt;p&gt;Shortname translation touches many workflows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;installer scripts that reference legacy names&lt;/li&gt;
&lt;li&gt;backup/restore verification against manifests&lt;/li&gt;
&lt;li&gt;cross-tool compatibility between VFAT-aware and strict 8.3 utilities&lt;/li&gt;
&lt;li&gt;reproducible release artifacts&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If alias generation is non-deterministic, two developers can build media for the &amp;ldquo;same version&amp;rdquo; and end up with different effective filenames.&lt;/p&gt;
&lt;p&gt;That is a release-management nightmare.&lt;/p&gt;
&lt;h2 id=&#34;the-fictional-incident-response&#34;&gt;The fictional incident response&lt;/h2&gt;
&lt;p&gt;In our story, the break happened during a Friday packaging run. By Saturday morning, three teams had three conflicting explanations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;ldquo;the verifier is wrong&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Windows generated weird aliases&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;someone copied files manually&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;By Saturday afternoon, a tiny deterministic translator plus collision-aware tests cut through all three theories. The verifier was correct, alias generation differed between tools, and manual copies had introduced namespace collisions in one directory.&lt;/p&gt;
&lt;p&gt;Nobody needed blame. We needed rules.&lt;/p&gt;
&lt;h2 id=&#34;subtle-rule-legality-depends-on-oem-code-page&#34;&gt;Subtle rule: legality depends on OEM code page&lt;/h2&gt;
&lt;p&gt;One more important caveat from the spec:&lt;/p&gt;
&lt;p&gt;Uppercasing and character validity are evaluated in active OEM code page context.&lt;/p&gt;
&lt;p&gt;That means &amp;ldquo;works on my machine&amp;rdquo; can still fail if code-page assumptions differ. For strict reproducibility, pin the environment and test corpus together.&lt;/p&gt;
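&lt;p&gt;A quick way to see the code-page dependence is to ask whether a character can be encoded at all. The Python sketch below uses Python&#39;s built-in &lt;code&gt;cp437&lt;/code&gt; and &lt;code&gt;cp850&lt;/code&gt; codecs as stand-ins for two OEM code pages; &lt;code&gt;representable&lt;/code&gt; is a hypothetical helper, and a real translator would also need the uppercase mapping of the active code page.&lt;/p&gt;

```python
def representable(text, code_page):
    """Return True if every character of `text` can be encoded in `code_page`."""
    try:
        text.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True


# The same character is a valid short-name candidate under one code page
# and unrepresentable under another.
print(representable("ã", "cp850"))  # True
print(representable("ã", "cp437"))  # False
```

&lt;p&gt;A fixture corpus that records the code page next to every expected alias makes this kind of divergence show up as a test failure instead of a field report.&lt;/p&gt;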
&lt;h2 id=&#34;practical-implementation-checklist&#34;&gt;Practical implementation checklist&lt;/h2&gt;
&lt;p&gt;For a robust translator:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;process one path component at a time&lt;/li&gt;
&lt;li&gt;implement legal-8.3 fast path first&lt;/li&gt;
&lt;li&gt;codify dot-selection/trailing-dot behavior exactly&lt;/li&gt;
&lt;li&gt;separate remove-vs-replace character policy clearly&lt;/li&gt;
&lt;li&gt;enforce extension max length 3&lt;/li&gt;
&lt;li&gt;implement collision tail growth with dynamic prefix shrink&lt;/li&gt;
&lt;li&gt;ship fixture tests with occupied-directory scenarios&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That last point is non-negotiable. Most alias bugs only appear under collision pressure.&lt;/p&gt;
&lt;h2 id=&#34;closing-scene&#34;&gt;Closing scene&lt;/h2&gt;
&lt;p&gt;Our weekend story ends around 01:03 on Sunday. The final verification pass prints green across every directory. The whiteboard still looks chaotic. The room still smells like old plastic and instant coffee. But now the behavior is explainable.&lt;/p&gt;
&lt;p&gt;Long names can still be expressive. Short names can still be strict. The bridge between them does not need magic. It needs documented rules and testable translation.&lt;/p&gt;
&lt;p&gt;In DOS-era engineering, that is usually the whole game: reduce mystery, increase repeatability, and let simple tools carry serious work.&lt;/p&gt;
&lt;p&gt;Related reading:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/retro/dos/deterministic-dir-output-as-an-operational-contract/&#34;&gt;Deterministic DIR Output as an Operational Contract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/retro/dos/batch-file-wizardry/&#34;&gt;Batch File Wizardry&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://turbovision.in6-addr.net/retro/dos/tp/turbo-pascal-units-as-architecture/&#34;&gt;Turbo Pascal Units as Architecture&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
  </channel>
</rss>
