Diamond: A Compression and Disk Layout Tool
Ben Slivka (bens)

8/5/94	1.00.21	(Build 507) /SIZE parameter only affects INF file contents; Remove FailOnMissingSource and DefaultFileSize variables
8/1/94	1.00.20	(Build 506) Added CabinetName[n], DiskDirectory[n], allow generation of leading blanks on .InfWrite[Disk|Cabinet] directive; Added chapter on Optimization and Tuning.
7/13/94	1.00.19	(Build 504) Added InfHeader[n], InfFooter[n], file# parameter, InfDisk/Cabinet/FileLineFormat[n], DoNoCopyFiles, self-extracting cabinet files
6/6/94	1.00.18	(Build 500) Added Quantum Compression: CompressionType, CompressionLevel, and CompressionMemory variables; .Option Explicit/.Define to catch variable name typos; .Dump to display entire variable table; .InfDateFormat to control date format in the INF file
5/2/94	1.00.17	Add Unified vs. Relational INF generation (gives total control over both file layout and INF layout); Removed .Group/.Filler directives; Removed resolved issues and discussion rendered obsolete by the new customizable INF features.
4/21/94	1.00.16	Refer to STATUS.XLS for list of implemented features; Added MaxDiskSizeN; removed DiskSizeList; misc reviewer comment updates; make FreezeLayout variable high priority (per dfrumin)
3/28/94	1.00.15	Provide examples in overview section to illustrate Diamond applications; update Cryptosystem/RESERVE notes & documentation
2/18/94	1.00.14	Change %d back to * for disk number replacement; Jessev review comments; latest directives & variables
1/13/94	1.00.13	Finish adding notes from 12/21/93 ACME/Chicago design review.
1/12/94	1.00.12	Remove MaxDiskSizeNext/MaxCabinetSizeNext/MinFolderSizeNext; completed variable definitions; added more directive examples.
1/11/94	1.00.11	Add details on integrating encryption from 1/10/94 Ali Baba meeting.
1/05/94	1.00.10	Add details on cabinet/folder/file construction, notes from 12/21/93 design review.
9/21/93	1.00.09	Add more details on EXTRACT.EXE in response to Qs from DavidRe.
9/03/93	1.00.08	Add named groups, DIST file system discussion, upper bound on gains by using Cabinet Files.
8/19/93	1.00.07	Add DiskSizeList=, CompressedFileExtenstionChar=, FailOnMissingSource=, MaxDiskSize=, MaxDiskSizeNext=, DiskDirectoryTemplate=
8/17/93	1.00.06	Add .Filler directives; algorithm for performing file layout
8/16/93	1.00.05	Add .Group and .OnDisk directives
8/15/93	1.00.04	Changed names to DIAMOND.EXE and EXTRACT.EXE
8/11/93	1.00.03	Most of Directive File syntax is .SET var=value
7/09/93	1.00.02	Added details on COMPRESS.EXE directives, FC Format
6/14/93	1.00.01	Initial version


Table of Contents
1. Overview
1.1. Case 1: Diamond for Setup Programs
Characteristics of a Setup Program
Diamond Application
1.2. Case 2: Diamond for a 200Mb Source Code Archive
Characteristics of a Source Code Archive
Diamond Application
1.3. Case 3: Diamond as a PKZIP/PKUNZIP Replacement
1.4. Case 4: Self-extracting Cabinet File(s)
1.5. Diamond Deliverables
1.6. Diamond Goals
2. Diamond Optimizing and Tuning
2.1. How Data Compression Works
2.2. Comparison of MSZIP and Quantum
2.3. Saving Diskettes
2.4. Tuning Access Time vs. Compression Ratio
2.5. Piecemeal DDFs for Localization and Different Disk Sizes
3. Diamond Concepts
3.1. Decoupling File Layout and INF Layout
4. DIAMOND.EXE
4.1. DIAMOND.EXE Syntax
4.2. DIAMOND.EXE Directive File Syntax
4.2.1. Command Summary
4.2.2. Variable Summary
4.2.3. InfDisk/Cabinet/FileLineFormat Syntax and Semantics
4.2.4. INF Parameters
4.2.5. Command Details
4.2.6. Variable Details
5. EXTRACT.EXE
6. Issues
6.1. Diamond does not do file type sorting
6.2. Diamond does not empty output directories
6.3. How should MCI deal with incompressible blocks?
6.4. No .Include directive
6.5. No accounting for subdirectories on disks
6.6. Drag and Drop Extraction from Cabinet Files
7. Diamond File Cabinet Format (FCF)
7.1. FCF Details
7.2. Data Integrity Strategy
8. Implementation/Design Notes
8.1. Folders always begin and end at a file boundary
8.2. Allow compressed blocks to span Cabinet File boundaries
8.3. Byte Ordering & Source Code Portability
8.4. Feedback, Reporting, & Logging
9. Future Enhancement Thoughts
9.1. Add FreezeLayout=file variable
9.2. Add DiskID INF Parameter
9.3. Add .If/.Else/.ElseIf/.Endif directives
9.4. Add .LET directive to do math
9.5. Allow wild-cards in File Copy Commands
9.6. Exclude certain files
9.7. Allow directory recurse in copy specification
9.8. Have option to check for invalid CD-ROM directory/file names
10. Memory Compression Interface (MCI)
11. File Compression Interface (FCI)
12. Encryption Support in Diamond
12.1. Motivation
12.2. Overview of Solution
12.2.1. Encryption is optional
12.2.2. An encrypted cabinet will have an FCRESERVE structure in the FCHEADER
12.2.3. Encryption context is assumed to be reset at folder boundaries
12.2.4. Each data block may have an optional header
12.2.5. DIACRYPT.EXE applies encryption after DIAMOND.EXE
12.2.6. Standard Encryption Interface for DIACRYPT.EXE
12.2.7. Decryption is plugged into the File Decompression Interface
12.3. Design notes for setup programs
12.3.1. Crypto and non-crypto versions of ACME
13. Distribution Media Format (DMF) Disks
14. Compression Gain with Cabinet Files
15. DIAMOND.EXE Output Report Formats

ATTENTION:	If you are an ACME (Windows Setup Toolkit) customer, please see JenC or AlanR for details on how to use Diamond with ACME 1.1 or later.

1	Overview
Diamond is a lossless data compression tool that can be used for a wide variety of purposes.  Although it was originally designed for use by setup programs, it can also be used in almost any situation where lossless data compression is required and slow compression time (in exchange for better compression) is OK.

Diamond has three key features: 1) storing multiple files together in a single cabinet file, 2) compressing across file boundaries, and 3) permitting files to span across cabinets.  Existing products like PKZIP, LHARC, ARJ, etc. support some of these features, but combining them all does not seem to be a common practice.  Diamond also supports self-extracting archives, by simply concatenating EXTRACT.EXE with a cabinet file.

Depending upon how many files are to be compressed, and what kind of access patterns are expected (sequential versus random access; most of the files will be read versus only a small portion), you will make different choices about how you tell Diamond to build your cabinet files.  One key concept in Diamond is the folder.  A folder is a collection of one or more files that are compressed together, as a single entity.  The most important property of a folder is that to access a particular file in the folder, any preceding files in the folder must be read and decompressed.  For example, if you have 100 files in a folder, and they compress down from 3Mb to 1Mb, and you want to extract the last file in the folder, you must read the entire 1Mb of compressed data and decompress all of the files in the folder.

There are two other technologies that are related to Diamond: A new core compressor called Quantum, and a new read-only format for 3.5" floppy disks called Distribution Media Format (DMF).  A quick summary of each of these is presented below.

Quantum is a new compression technology that Microsoft obtained an unrestricted license to in early May, 1994.  It achieves compressed file sizes 10-15% smaller than MSZIP, and Quantum will be the preferred compressor (possibly the only one) supported by Diamond.  In order to achieve these impressive results, Quantum can require a fair bit of memory (up to 12Mb) at compress time, and even at decompress time (configurable from 1K to 2Mb), and Quantum gets its best results on large data streams.  For this reason, cabinet files and Quantum are a great fit, because cabinet files with large folders ensure that Quantum is always compressing big blocks of data.  The decompression memory requirement for Quantum is tunable in the Diamond directive file (see the CompressionMemory variable).

DMF is a special read-only format for 3.5" floppy disks that permits storing 1.68Mb of data (a 17.7% increase over the standard 1.44Mb format).  This is achieved by reducing the inter-sector gap so that we can add 3 sectors per track.  This does not affect the ability of arbitrary floppy drives to read the diskette, because we have not changed the magnetic recording density.  With this reduced inter-sector gap, however, there is not enough room between sectors to allow a floppy drive to reliably write to a DMF diskette.  There are tools to create DMF disk images, and we have verified that the disk duplicating machines (Trace and Rimage) used by Microsoft and our key duplicators will correctly and efficiently duplicate these disks.  One limitation of the DMF format is that the root directory only holds 16 entries, and the cluster size is 2K.  For this reason, using cabinet files on DMF is ideal, since the root directory size will not be exceeded, and with only one cabinet file per DMF disk, the 2K cluster allocation granularity does not cause any wasted space.

The combination of Diamond, Quantum, and DMF should yield a 20-30% reduction in the number of disks in a product, compared with ACME1+MSZIP.  Your actual mileage may vary, but in measurements on MS Office 4.2 for Windows, the 25 x 3.5" disks used by ACME1+MSZIP were reduced to 18 x 3.5" disks by Diamond + Quantum + DMF -- a 28% savings!

The following sections provide case studies of several possible ways that Diamond might be used.  These are only provided to stimulate your imagination -- they are not the only ways that Diamond can be used!
1.1	Case 1: Diamond for Setup Programs
Diamond was designed with setup programs in mind, so it has a great deal of power and flexibility to trade off compressed size for speed of random access to files. The primary impact of Diamond is to minimize the number of diskettes in a Microsoft product, thereby minimizing the Cost of Goods Sold (COGS) - one 1.44Mb disk costs one dollar!

Prior to Diamond, the files of a product were compressed individually, and then someone (a program manager or a builder, usually) used an Excel macro to optimally pack files onto the distribution disks to avoid wasting space.  This was time consuming, tedious, and error-prone.

With Diamond, you simply write a Diamond directive file, specifying the list of files in a product, and any constraints on where those files should be located on disks, and Diamond builds the disk images.  The same Diamond Directive File can even be used for all the various localized versions of a product, since directive files support parameterization.
Characteristics of a Setup Program
1) Minimizing disk usage is very, very important; an extra disk is an extra dollar!
2) Files are accessed sequentially
3) Most files are accessed
Diamond Application
The distribution disks for a typical application product like Microsoft Excel produced by Diamond would look something like this:

Figure 1: Distribution disk layout


SETUP.EXE is the setup program, and SETUP.INF is a file that guides the operation of the setup program (what files are needed for what options, and on which disk and in which cabinet file the file is contained).  All of the remaining product files are contained in the cabinet files EXCEL.1 through EXCEL.N (N might be 7, for example).

To produce this disk layout with Diamond is very simple.  A Diamond Directive File (DDF) is prepared which lists all of the files for Microsoft Excel, along with some optional Diamond settings to control things like: 1) what size disks are being used, 2) what are the cabinet names to be, 3) what are the visible (user-readable) labels on each disk, 4) how much random access is desired for files within a cabinet.  The following is an example of a DDF that might be appropriate for Microsoft Excel:

;*** EXCEL Diamond Directive file example
;
.OPTION EXPLICIT							; Generate errors on variable typos

.Set DiskLabel1=Setup					; Label of first disk
.Set DiskLabel2=Program      			; Label of second disk
.Set DiskLabel3="Program Continued"	; Label of third disk
.Set CabinetNameTemplate=EXCEL.* 	; EXCEL.1, EXCEL.2, etc.
.set DiskDirectoryTemplate=Disk*		; disk1, disk2, etc.
.Set MaxDiskSize=1.44M	       		; 3.5" disks

;** Setup.exe and setup.inf are placed uncompressed in the first disk
.Set Cabinet=off
.Set Compress=off
.Set InfAttr=							; Turn off read-only, etc. attrs
bin\setup.exe		       			; Just copy SETUP.EXE as is
bin\setup.inf		       			; Just copy SETUP.INF as is

;** The rest of the files are stored, compressed, in cabinet files
.Set Cabinet=on
.Set Compress=on
bin\excel.exe		       			; Big EXE, will span cabinets
bin\excel.hlp
bin\olecli.dll
bin\olesrv.dll
;...										; Many more files
;*** <the end>							; That's it

Now, you run Diamond to create the disk layout:
diamond /f excel.ddf

Diamond will create directories Disk1, Disk2, etc. to hold the files for each disk, and will copy uncompressed files or create cabinet files (as appropriate) in each directory.  The file SETUP.RPT will be written to the current directory (this can be overridden) with a summary of what Diamond did, and the file SETUP.INF will contain details on every disk and cabinet created, including a list of where each file was placed.
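
The name templates in the DDF above (CabinetNameTemplate=EXCEL.* and DiskDirectoryTemplate=Disk*) replace the * with a disk or cabinet number.  The following is a minimal sketch of that substitution, not Diamond's actual implementation; the function name expand_template is hypothetical and is used only for illustration.

```python
def expand_template(template: str, number: int) -> str:
    """Replace each '*' in a Diamond-style name template with a number.

    Hypothetical helper for illustration; Diamond's own logic may differ.
    """
    return template.replace("*", str(number))


# Names for the first three cabinets and disk directories.
cabinets = [expand_template("EXCEL.*", n) for n in range(1, 4)]
directories = [expand_template("Disk*", n) for n in range(1, 4)]

print(cabinets)      # ['EXCEL.1', 'EXCEL.2', 'EXCEL.3']
print(directories)   # ['Disk1', 'Disk2', 'Disk3']
```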
1.2	Case 2: Diamond for a 200Mb Source Code Archive
The Microsoft Developers Network (MSDN) CD wants to ship 200Mb of source code on their CD.  While this is only one third of the CD when uncompressed, that is still too much space, so they want to compress it very tightly.  This is slightly different from the Setup case, however, since they have a front-end tool that allows users to select sample programs and expand them onto their hard disk.
Characteristics of a Source Code Archive
1) Minimizing space usage is slightly less important
2) Files are accessed somewhat randomly, though in groups
3) Only a small portion of the files will be accessed at any one time
Diamond Application
The cabinet files produced for the source archive need to be big enough to get good compression, but not so big that they sacrifice random access speed.  So the challenge is to get a good tradeoff between compression and access time.

;*** MSDN Sample Source Code Diamond Directive file example
;
.OPTION EXPLICIT							; Generate errors on variable typos

.Set CabinetNameTemplate=MSDN.* 		; MSDN.1, MSDN.2, etc.
.set DiskDirectoryTemplate=CDROM		; All cabinets go in a single directory
.Set MaxDiskFileCount=1000      		; Limit file count per cabinet, so that
											; scanning is not too slow
.Set FolderSizeThreshold=200000		; Aim for ~200K per folder
.Set CompressionType=Quantum			; Use the best compressor
.Set CompressionLevel=7				; Most intensive searching algorithm
.Set CompressionMemory=800000			; 4 times larger than folder limit,
											; since this is an uncompressed size,
											; whereas folder limit is a compressed
											; size, and we expect a compression
											; ratio of 3:1 to 4:1.

;** All files are compressed in cabinet files
.Set Cabinet=on
.Set Compress=on
foo.c
foo.h
....
;*** <the end>							; That's it

1.3	Case 3: Diamond as a PKZIP/PKUNZIP Replacement
DIAMOND.EXE and EXTRACT.EXE have fairly simple command-line syntax, and so can be used in much the same way as PKZIP and PKUNZIP to archive and distribute files.  The only difference is that, unlike PKZIP, DIAMOND cannot add files to an existing cabinet.  Rather, it has to start from a set of files and build up the cabinet from scratch.
1.4	Case 4: Self-extracting Cabinet File(s)
EXTRACT.EXE has a small bit of cleverness to recognize when it has been copied to the front of a cabinet file, and in that case it automatically extracts the files in that cabinet file (and any continuation cabinet files).  Here is what you do:

1) Create a cabinet file (set)
2) Prepend EXTRACT.EXE to the first cabinet file
3) Distribute the self-extracting cabinet (and any subsequent cabinets)

Example:
diamond /f self.ddf			; Build cabinet file set self1.cab, self2.cab
copy /b extract.exe+self1.cab self.exe  ; self.exe is self-extracting
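
The copy /b step above is a plain binary concatenation: the bytes of EXTRACT.EXE followed by the bytes of the first cabinet.  As a rough sketch for build environments where copy /b is not available, the same concatenation could be written as below.  The function name make_self_extracting is hypothetical, and the sketch assumes nothing about how EXTRACT.EXE locates the appended cabinet.

```python
import shutil


def make_self_extracting(stub_path, cabinet_path, output_path):
    """Write the stub's bytes followed by the cabinet's bytes.

    Equivalent in effect to: copy /b stub+cabinet output
    """
    with open(output_path, "wb") as out:
        for part in (stub_path, cabinet_path):
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
```

For the example above this would be called as make_self_extracting("extract.exe", "self1.cab", "self.exe").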

1.5	Diamond Deliverables
The following table is a list of all the libraries and programs that are part of Diamond:

[6/6/94 bens: At this time, we have versions of diamond.exe, extract.exe, and fdi.lib for both 16-bit MS-DOS/Windows and Win32 on x86.  Alpha and Mips versions are built (thank you!) by the ACME group, as they have the machines and compilers.]

File          Contents
DIAMOND.EXE   Command-line tool to perform disk layout (uses FCI.LIB)
FDI.LIB       File Decompression Interface library (uses MDI.LIB)
EXTRACT.EXE   Command-line tool to expand files (uses FDI.LIB)
DDUMP.EXE     Tool to dump internal format of a Diamond cabinet file
FCI.LIB       File Compression Interface library (uses MCI.LIB)
MCI.LIB       MSZIP Memory Compression Interface library
MDI.LIB       MSZIP Memory Decompression Interface library
QCI.LIB       Quantum Memory Compression Interface library (32-bit ONLY)
QDI.LIB       Quantum Memory Decompression Interface library
1.6	Diamond Goals
o	Provide world-class compression ratio and decompression speed
o	Simplify production of disk layouts for MS products
o	Provide command line tools and link libraries for all Microsoft platforms
2	Diamond Optimizing and Tuning
2.1	How Data Compression Works
There are many fine books that discuss this topic in gory detail, providing theoretical background as well as source code.  If you want this kind of detail go to the library!  I'm just going to give you a very high-level overview to help you build some intuition about the MSZIP and Quantum compressors supported by Diamond.

MSZIP (derived from the very popular PKZIP) and Quantum both perform lossless data compression -- you get the same bits when you decompress that you originally put into the compressor.  This is in contrast to schemes like JPEG and MPEG that are lossy schemes -- they are tuned for audio/video information and are able to discard a great deal of information at compression time because they can "recreate" the original image by interpolating (faking) the tossed information.  The resulting uncompressed audio/video is not exactly the same as the original, but it is pretty close, and the compression ratio can be 20:1 or higher (compared to the 2 or 3:1 ratio that MSZIP and Quantum achieve on average).

Both MSZIP and Quantum are in the LZ77 family of compressors.  An LZ77 compressor takes a block of bytes, and makes one pass over it, looking for duplicated byte strings.  The compressed output of an LZ77 compressor includes raw bytes that were not matched, and match tokens that refer back to data previously encountered in the data block.  These match tokens contain a match offset and a match length.  The match offset tells the decompressor how many bytes back in the input to start copying bytes from, and the match length tells the decompressor how many bytes to copy.

For example, consider the following sentence:

	         1         2         3         4
	12345678901234567890123456789012345678901234
	The rain in Spain falls mainly on the plain.

To compress this sentence, we first identify the repeated sequences, which we write as <offset, length>.  The offset is the number of bytes to the left where the match starts, and the length is the number of bytes matched.  Note that while we are using ASCII characters in this example, this technique works for arbitrary, binary data.

	The rain <3,3>Sp<9,4>falls m<11,3>ly o<16,2>t<34,3>pl<15,3>.
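
To check the tokens above, here is a minimal decoder sketch.  It is only an illustration of the <offset, length> idea, not the MSZIP or Quantum decoder; the token representation (strings for raw bytes, tuples for matches) and the name lz77_decode are assumptions made for this example.

```python
def lz77_decode(tokens):
    """Expand a list of literals (str) and (offset, length) match tokens."""
    out = []
    for token in tokens:
        if isinstance(token, str):
            out.extend(token)               # raw, unmatched characters
        else:
            offset, length = token
            start = len(out) - offset       # where the match begins
            for i in range(length):
                out.append(out[start + i])  # copy one character at a time
    return "".join(out)


tokens = ["The rain ", (3, 3), "Sp", (9, 4), "falls m", (11, 3),
          "ly o", (16, 2), "t", (34, 3), "pl", (15, 3), "."]
decoded = lz77_decode(tokens)
print(decoded)   # The rain in Spain falls mainly on the plain.
```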

Looks simple, doesn't it!  Well, there are two tricky parts: 1) how do you find the matches?, and 2) how do you encode both the unmatched bytes as well as the match tokens compactly so that you actually use as little space as possible?  Well, we have reached the technical limits of what I'm going to explore in this document, so I won't tell you how MSZIP and Quantum accomplish either of the above tricky parts.  However, I do want to discuss some essential differences between the two algorithms, so that you can intelligently choose between them.

Fundamentally, the better your technique for finding matches, the slower your compressor will be.  By contrast, decompression time is almost entirely a function of the encoding scheme.  So even if you select the most intense Quantum compression settings, your decompression rate (bytes/sec) will be the same as for the least intense Quantum compression settings.

Finding matches can be very fast or very slow, depending on how good a job you do finding matches, and how far back in the data block you are willing to look.  MSZIP only looks back 32K, and it examines all possible matches and selects the longest one.  Quantum, by contrast, is configurable (see CompressionMemory) -- you can specify that it look back anywhere from 1K to 2M!  Furthermore, Quantum also has a setting that controls how much it tries to optimize the compression (see CompressionLevel) -- at the highest settings Quantum will spend more time experimenting with alternatives for both the matches it selects and the encoding scheme it uses, to find the combination which produces the smallest compressed data output.  So, as you can see in the table below, while Quantum can compress much more tightly than MSZIP, it can also be much slower!
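
To build some intuition for the look-back distance, the following is a deliberately naive sketch of a longest-match search with a configurable window.  It is not how MSZIP or Quantum actually find matches (the document does not describe that), and the names find_longest_match, lz77_encode, and lz77_decode, the window values, and the minimum match length of 2 are all assumptions for illustration.

```python
def find_longest_match(data, pos, window, min_len=2):
    """Return (offset, length) of the longest earlier match, or None."""
    best = None
    for start in range(max(0, pos - window), pos):
        length = 0
        while (pos + length < len(data)
               and data[start + length] == data[pos + length]):
            length += 1
        if length >= min_len and (best is None or length > best[1]):
            best = (pos - start, length)
    return best


def lz77_encode(data, window):
    """Greedy encoder: emit a match token when one exists, else a literal."""
    tokens, pos = [], 0
    while pos < len(data):
        match = find_longest_match(data, pos, window)
        if match:
            tokens.append(match)
            pos += match[1]
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens


def lz77_decode(tokens):
    """Inverse of lz77_encode, used here to check the round trip."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            for _ in range(length):
                out.append(out[-offset])
        else:
            out.append(token)
    return "".join(out)


text = "The rain in Spain falls mainly on the plain."
for window in (8, 64):
    tokens = lz77_encode(text, window)
    matches = [t for t in tokens if isinstance(t, tuple)]
    print(window, len(tokens), len(matches))
```

With the small window, matches that lie further back than 8 characters (such as the second "he ") cannot be found, which is the same effect, in miniature, as choosing a lower CompressionMemory.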
2.2	Comparison of MSZIP and Quantum
The following table compares MSZIP and Quantum (at various settings) as applied to the Chicago Build 106 disk layout (1240 files, 51,596,547 bytes).  These results were obtained with DIAMOND.EXE build 500 (6/6/94) running on a Compaq ProSignia Pentium 60MHz with 16Mb RAM under Windows NT 3.5 beta 1 (build 612).

Observations:
1. Decompression time is a constant multiple of the compressed data size!  Hence, using a more intensive Quantum compression setting (higher level and/or more memory) will speed up decompression, even though compression will slow down.

    T  Lvl  Mem         cbCmp     Csecs   Mb/hr  Dsecs  KB/sec   Fdmf  Idmf    %msz
    Q  1    32Kb   22,018,180    984.79  179.88    527   95.61  12.83    13  101.5%
    M              21,702,390   1167.52  151.73    139  362.50  12.65    13  100.0%
    Q  2    32Kb   21,328,113   1405.55  126.03    579   87.02  12.43    13   98.3%
    Q  2    256Kb  20,182,323   1900.73   93.20    501  100.57  11.76    12   93.0%
    Q  3    256Kb  20,055,201   3230.59   54.83    490  102.83  11.69    12   92.4%
    Q  3    1Mb    19,592,176   5018.31   35.30    477  105.63  11.42    12   90.3%
    Q  4    256Kb  19,549,098  11685.62   15.16    471  106.98  11.39    12   90.1%
    Q  7    256Kb  19,464,304  44937.93    3.94    465  108.36  11.34    12   89.7%
    Q  5    1Mb    19,038,668  20920.93    8.47    482  104.54  11.09    12   87.7%
    Q  7    1Mb    18,970,494  52973.25    3.34    476  105.86  11.05    12   87.4%

Column explanation
	T	Compression type (M=MSZIP, Q=Quantum)
	Lvl	Quantum compression level (1=lowest, 7=highest)
	Mem	Quantum compression memory
	cbCmp	Compressed data size (uncompressed size is 51,596,547)
	Csecs	Time to compress in seconds
	Mb/hr	Megabytes of uncompressed data compressed per hour
	Dsecs	Time to decompress in seconds
	KB/sec	Kilobytes of uncompressed data decompressed per second
	Fdmf	Number of DMF (1.68Mb) disks needed
	Idmf	Frac DMF rounded up to a whole disk number
	%msz	Comparison of compressed size to MSZIP compressed size (smaller is better)
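
The derived columns can be recomputed from the raw sizes and times.  The sketch below assumes that Mb means 2^20 bytes and KB means 1024 bytes; that reading is an inference (it reproduces the published figures for the rows checked), not something the table states.  The function names are hypothetical.

```python
UNCOMPRESSED = 51_596_547      # bytes, from the column explanation
MSZIP_CBCMP = 21_702_390       # bytes, cbCmp of the MSZIP row


def mb_per_hour(csecs):
    """Uncompressed megabytes (2**20 bytes, assumed) compressed per hour."""
    return UNCOMPRESSED / csecs * 3600 / 2**20


def kb_per_sec(dsecs):
    """Uncompressed kilobytes (1024 bytes, assumed) decompressed per second."""
    return UNCOMPRESSED / dsecs / 1024


def percent_of_mszip(cb_cmp):
    """Compressed size as a percentage of the MSZIP compressed size."""
    return 100.0 * cb_cmp / MSZIP_CBCMP


# Quantum level 1, 32Kb row: cbCmp=22,018,180  Csecs=984.79  Dsecs=527
print(round(mb_per_hour(984.79), 2))          # 179.88
print(round(kb_per_sec(527), 2))              # 95.61
print(round(percent_of_mszip(22_018_180), 1)) # 101.5
```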
2.3	Saving Diskettes
For a product shipped on floppy disks, it is very important to minimize the number of disks shipped per product!  If it costs Microsoft one dollar per disk, and we ship one million units, then an extra disk costs us US$1 million!  The following pseudo-code suggests a process you might follow as you strive to keep your Cost of Goods Sold (COGS) to a minimum:

	get initial product files;
	while (have not yet shipped)
		//** Figure out smallest possible size
		Compress file set using:
			CompressionType=Quantum
			CompressionLevel=7
			CompressionMemory=21
		If near a disk boundary
			Consider tossing files to save a disk (especially clipart & samples!)
		If near shipping
			Relax CompressionMemory and FolderSizeThreshold and CompressionLevel to
				improve access time at decompress, as well as to speed up compress time.
	end-while
	Ship it!

NOTE:	As the above table shows, high settings of CompressionLevel cause Diamond to run very slowly!  The appropriate build group process is probably to produce regular disk layouts at lower CompressionLevel  settings (like 3 or 4) and CompressionMemory settings (like 18 == 256K), and then kick off the CompressionLevel=7, CompressionMemory=21 layout to run overnight or over a weekend.  This keeps the slow layout process from delaying the product group, but still lets the program manager/product manager responsible for COGS stay on top of how the product is shaping up!
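
The NOTE above equates a CompressionMemory setting of 18 with 256K, which suggests that such settings may be read as a power-of-two exponent.  Assuming that reading (it is an inference from "18 == 256K" and from the 1K to 2M range mentioned earlier, not a statement of how Diamond parses the variable), the sizes work out as follows.  The function name is hypothetical.

```python
def compression_memory_bytes(setting: int) -> int:
    """Window size in bytes for an exponent-style setting (assumed reading)."""
    return 1 << setting


for setting in (10, 18, 21):
    print(setting, compression_memory_bytes(setting))
# 10 1024       (1K)
# 18 262144     (256K)
# 21 2097152    (2M)
```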

[8/1/94: ACME 1.1 does not currently support disk 1 being a DMF disk, so you will have to use a 1.44M disk for disk 1, and then any remaining disks can be DMF.]
2.4	Tuning Access Time vs. Compression Ratio
Diamond introduces the concept of a folder to refer to a contiguous set of compressed bytes.  To decompress a file from a cabinet, FDI.LIB (called by SETUP.EXE and EXTRACT.EXE) finds the folder that the file starts in, and then must read and decompress all the bytes in that folder from the start up through and including the desired file.  

For example, if the file FOO.EXE is at the end of a 1.44Mb folder on a 1.44M diskette, then FDI.LIB must read the entire diskette and decompress all the data.  This is about the worst access time possible, but (assuming you are using Quantum with CompressionMemory=21) will yield the highest possible compression.  By contrast, if FOO.EXE was at the start of a folder (regardless of how large the folder is), then it would be read and decompressed with no extra overhead.
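
The cost described above can be sketched as a simple sum: to extract a file, everything from the start of its folder up through that file must be decompressed.  The helper below is hypothetical and counts uncompressed bytes only; the file sizes are made-up round numbers for illustration.

```python
def bytes_to_decompress(file_sizes, index):
    """Uncompressed bytes produced to extract file `index` from a folder.

    file_sizes lists the uncompressed sizes of the files in folder order.
    """
    return sum(file_sizes[: index + 1])


# A folder of 100 files of 30,000 bytes each (about 3Mb in total).
sizes = [30_000] * 100
print(bytes_to_decompress(sizes, 0))    # 30000: first file, no extra work
print(bytes_to_decompress(sizes, 99))   # 3000000: last file, whole folder
```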

So, why don't you always Set FolderFileCountThreshold=1?  Because then the compression history would be reset after every file, and your compression ratio would be quite poor!  Since Diamond has no idea what your compression ratio versus access time needs are, Diamond provides several variables and directives to give you very fine control over these issues:

Variable/Directive          More Compression;     Less Compression;
                            Slower Access Time    Faster Access Time
CompressionMemory           Bigger numbers        Lower numbers
CabinetFileCountThreshold   Bigger numbers        Lower numbers
FolderFileCountThreshold    Bigger numbers        Lower numbers
FolderSizeThreshold         Bigger numbers        Lower numbers
MaxCabinetSize              Bigger numbers        Lower numbers
.New Folder                 Don't use             Use often
.New Cabinet                Don't use             Use often

The Diamond defaults are configured for a floppy disk layout, with the assumption that the most common scenario is a full setup that will extract most of the files, so these are the settings:

Variable/Directive          Value
CompressionMemory           256K
CabinetFileCountThreshold   Unlimited
FolderFileCountThreshold    Unlimited
FolderSizeThreshold         Same as MaxCabinetSize
MaxCabinetSize              Same as MaxDiskSize

For the MSDN source archive (>200Mb of sample source code, >30,000 files) that ships on a CD-ROM, the following values might be a reasonable tradeoff between compression and access time (NOTE: I've not tested these settings!):

Variable/Directive          Value
CompressionMemory           800K (Needs to be FolderSizeThreshold times the average compression ratio, which for source files we will assume is no more than 4:1.)
CabinetFileCountThreshold   2000 (Since we have to call FDICopy() on a cabinet and walk through all the FILE headers, we want this small enough so that isn't too much overhead, but large enough to keep the number of cabinets down.)
FolderFileCountThreshold    Unlimited (Let FolderSizeThreshold control folder size!)
FolderSizeThreshold         200K (Represents 600K-800K of source, assuming a 3:1 or 4:1 compression ratio.)
MaxCabinetSize              Unlimited (Let CabinetFileCountThreshold control the cabinet size!)

Of course, if you are tight for space on your CD-ROM, you'll probably boost the FolderSizeThreshold and CompressionMemory settings!
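
The sizing rule for the MSDN settings (CompressionMemory should cover the uncompressed data behind one folder) is simple arithmetic.  A small sketch, using the 200K folder threshold and the assumed 3:1 to 4:1 compression ratios from the settings above; the function name is hypothetical.

```python
def compression_memory_for(folder_size_threshold, compression_ratio):
    """Uncompressed bytes represented by one full folder.

    folder_size_threshold is a compressed size; multiplying by the expected
    compression ratio gives the uncompressed span the compressor should see.
    """
    return folder_size_threshold * compression_ratio


folder = 200_000
print(compression_memory_for(folder, 3))   # 600000
print(compression_memory_for(folder, 4))   # 800000, the value suggested above
```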
2.5	Piecemeal DDFs for Localization and Different Disk Sizes
DIAMOND.EXE was designed to minimize the amount of duplicate information needed to generate product layouts for different languages and disk sizes.  A key feature is the ability to specify more than one DDF on the DIAMOND.EXE command line.  Using ACME, for example, the following set of DDFs might make sense:

acme.ddf      Standard ACME definitions to control the format of the output INF file
lang.ddf      Sets language-specific settings (SourceDir, for example)
disk.ddf      Sets the diskette sizes (CDROM, 1.2M, 1.44M, DMF168, etc.)
product.ddf   Lists all the files in the product, and uses variables set in the previous DDFs to customize its operation

You would use the following command line to process this set of DDFs:

	diamond /f acme.ddf /f lang.ddf /f disk.ddf /f product.ddf
3	Diamond Concepts
The key feature of Diamond is that it takes a set of files and produces a disk layout while at the same time attempting to minimize the number of disks required.  To understand how Diamond does this, you need to understand the following terms: cabinet, folder, and file.  Essentially, Diamond takes all of your files, lays the bytes down as one continuous byte stream, compresses the entire stream, chopping it up into folders as appropriate, and then fills up one or more cabinets with the folders.

Cabinet	A normal file that contains pieces of one or more files, usually compressed.

Folder	A decompression boundary.  Large folders enable higher compression, because the compressor can refer back to more data in finding patterns.  However, to retrieve a file at the end of a folder, the entire folder must be decompressed.  So there is a tradeoff between achieved compression and the quickness of random access to individual files.
File	A file to be placed in the layout.
3.1	Decoupling File Layout and INF Layout
Diamond has two "modes" for generating the INF file: "unified" mode and "relational" mode.  In "unified" mode, the INF file is generated as file copy commands are processed.  This is the default, and minimizes the amount of effort needed to construct a DDF file.  However, this forces the INF file to list the files in the layout in exactly the same order as they are placed on disks/cabinets.  So, in "relational" mode the DDF has file copy commands to specify the disk layout, and file reference lines to specify the exact placement of file info lines, including the ability to list the same file more than once.  This ability is important for INF structures which use section headers (e.g., "[clipart]", "[screen savers]", etc.) to identify sets of files for particular functionality, and for which the same file may need to be included in more than one section.

Notes:
(1)	In "relational" mode, only the last setting of a particular InfXxx default parameter variable (both standard parameters like InfDate, InfTime, etc. and custom parameters) in the layout portion of the DDF is respected.
	Example:
	If you did ".set InfDate=12/05/92" at the start of the layout portion, and then did ".set InfDate=01/01/94" in the middle of the layout portion, the latter value would be used for the entire INF file.

(2) Any parameters on a reference line will override parameters on the corresponding file copy line.
	Example:
		;* layout portion
		foo bar /x=1
		...
		;* INF portion
		bar /x=2            ; INF file will have value 2

(3)	In "relational" mode, each file copy command in the layout portion of the DDF must be referenced at least once in a reference command in the INF portion of the DDF.  Any files that are not referenced will cause an error during pass 1.  The /inf=no parameter must be specified on any file copy commands for files which are going to be omitted from the INF file.

(4)	In "relational" mode, UniqueFiles must be ON, because the destination file name is used in the INF portion of the DDF to refer back to file information.
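
Notes (2) and (3) can be sketched as a small check-and-merge step.  This is only an illustration of the rules as stated above, not Diamond's implementation; the function name build_inf_entries and the dictionary representation of the /name=value parameters are hypothetical.

```python
def build_inf_entries(layout, references):
    """Apply the relational-mode rules to a parsed DDF.

    layout:     dict mapping destination file name -> parameters given on
                the file copy line (layout portion).
    references: list of (destination file name, parameters) pairs from the
                reference lines (INF portion); a file may appear many times.
    """
    referenced = {name for name, _ in references}
    # Note (3): every copied file must be referenced unless it has /inf=no.
    unreferenced = [
        name for name, params in layout.items()
        if params.get("inf", "yes").lower() != "no" and name not in referenced
    ]
    if unreferenced:
        raise ValueError("not referenced in INF portion: "
                         + ", ".join(unreferenced))
    # Note (2): reference-line parameters override copy-line parameters.
    return [(name, {**layout[name], **overrides})
            for name, overrides in references]


layout = {
    "setup.exe": {"inf": "NO"},   # omitted from the INF file
    "bar": {"x": "1"},            # foo bar /x=1
}
references = [("bar", {"x": "2"}), ("bar", {})]
entries = build_inf_entries(layout, references)
print(entries)   # [('bar', {'x': '2'}), ('bar', {'x': '1'})]
```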

Example of a Relational DDF:

    ;** Set up INF formats before we do the disk layout, because Diamond
    ;   writes Disk and Cabinet information out as it is generated.
    .OPTION EXPLICIT                      ; Generate errors for undefined variables

    .Set InfDiskHeader="[disk list]"
    .Set InfDiskHeader1=";<disk number>,<disk label>"
    .Set InfDiskLineFormat="*disk#*,*label*"

    .Set InfCabinetHeader="[cabinet list]"
    .Set InfCabinetHeader1=";<cabinet number>,<disk number>,<cabinet file name>"
    .Set InfCabinetLineFormat="*cab#*,*disk#*,*cabfile*"

    .Set InfFileHeader=";*** File List ***"
    .Set InfFileHeader1=";<disk number>,<cabinet number>,<filename>,<size>"
    .Set InfFileHeader2=";Note: File is not in a cabinet if cab# is 0"
    .Set InfFileHeader3=""
    .Set InfFileLineFormat="*disk#*,*cab#*,*file*,*date*,*size*"


    .set GenerateInf=OFF        ; Do disk layout first

    ;** Setup files.  These don't need to be in the INF file, so we put
    ;   /inf=NO on these lines so that Diamond won't generate an error when
    ;   it finds that these files are not mentioned in the INF portion of
    ;   the DDF.

    .set Compress=OFF
    .set Cabinet=OFF
    setup.exe /inf=NO           ; This file doesn't show up in INF
    setup.inf /inf=NO           ; This file doesn't show up in INF

    ;** Files in cabinets
    ;
    .set Compress=ON
    .set Cabinet=ON

    ;* Put all bitmaps together to help compression
    a1.bmp                      ; Bitmap for client1.exe
    b1.bmp                      ; Bitmap for client1.exe
    c1.bmp                      ; Bitmap for client1.exe
    d1.bmp                      ; Bitmap for client1.exe
    a2.bmp                      ; Bitmap for client2.exe
    b2.bmp                      ; Bitmap for client2.exe
    c2.bmp                      ; Bitmap for client2.exe
    d2.bmp                      ; Bitmap for client2.exe
    shared.dll  /date=10/12/93  ; File needed by client1.exe and client2.exe
    client1.exe                 ; needs shared.dll
    client2.exe                 ; needs shared.dll


    .set GenerateInf=ON         ; OK, now we're doing the INF layout

    ;** Feature One files
    .InfBegin File
    [feature One]
    ;Files for feature one
    .InfEnd
    client1.exe
    shared.dll  /date=04/01/94  ; Override date
    a1.bmp
    b1.bmp
    c1.bmp
    d1.bmp

    ;** Feature Two files
    .InfBegin File

    [feature Two]
    ;Files for feature Two
    ;Note that shared.dll is also required by Feature One
    .InfEnd
    client2.exe
    shared.dll
    a2.bmp
    b2.bmp
    c2.bmp
    d2.bmp

    ;*** The End

The generated INF file would look something like this:
    [disk list]
    ;<disk number>,<disk label>
    1,"Disk 1"

    [cabinet list]
    ;<cabinet number>,<disk number>,<cabinet file name>
    1,1,cabinet.1

    ;*** File List ***
    ;<disk number>,<cabinet number>,<filename>,<size>
    ;Note: File is not in a cabinet if cab# is 0

    [feature One]
    ;Files for feature one
    1,1,client1.exe,12/12/93,1234
    1,1,shared.dll,04/01/94,1234
    1,1,a1.bmp,12/12/93,573
    1,1,b1.bmp,12/12/93,573
    1,1,c1.bmp,12/12/93,573
    1,1,d1.bmp,12/12/93,573

    [feature Two]
    ;Files for feature Two
    ;Note that shared.dll is also required by Feature One
    1,1,client2.exe,12/12/93,1234
    1,1,shared.dll,10/12/93,1234
    1,1,a2.bmp,12/12/93,643
    1,1,b2.bmp,12/12/93,643
    1,1,c2.bmp,12/12/93,643
    1,1,d2.bmp,12/12/93,643
4	DIAMOND.EXE
DIAMOND.EXE is designed to produce the final distribution files and cabinets for an entire product in a single run.  By contrast, the old Apps and Systems COMPRESS.EXE operated on a single file at a time.  The most common way to use DIAMOND.EXE is to supply a directives file that controls how files are compressed and stored into one or more cabinets.  
4.1	DIAMOND.EXE Syntax
There are two primary forms of DIAMOND.EXE usage.  The first is used for compressing a single file, while the second is used for compressing multiple files.

DIAMOND	[/Vn] [/D variable=value ...] [/L directory] source [destination]
DIAMOND	[/Vn] [/D variable=value ...] /F directives_file [/F directives_file ...]
	
The parameters are described below:

Parameter
Description
source
A file to be compressed.
destination
The name of the file to receive the compressed version of the source file.  If not supplied, a default destination name is constructed from the source file name according to the rules defined by the CompressedFileExtensionChar variable (see section 4.2.6, Variable Details).  You can use /D CompressedFileExtensionChar=c on the command line to change the character used.
/D variable=value
Set variable to be equal to value.  Equivalent to using the .Set command in the directives file.  For example, a single directive file could be used to produce layouts for different disk sizes by running Diamond once per disk size, each time with a different value of MaxDiskSize (e.g., /D MaxDiskSize=1.44M).  Both standard Diamond variables and custom variables may be defined in this way.  If .Option Explicit is specified in a directive file, then a custom variable must be defined with a .Define command in a directive file.
/L directory
Specifies an output directory where the compressed file will be placed (most useful when destination is not supplied).
/F directives_file
A file containing commands for DIAMOND.EXE to execute.  If more than one directive file is specified (/F file1 /F file2 ...), they are processed in the order (left to right) specified on the command line.  Variable settings, open cabinets, open disks, etc. are all carried forward from one directive file to the next (just as if all of the files had been concatenated together and presented as a single file to Diamond).  For example, this is intended to simplify the work for a product shipped in multiple languages.  There would be a short, language-specific directives file, and then a single, large master directives file that covers the bulk of the product.
/Vn
Set debugging verbosity level (0=none,...,3=full)
4.2	DIAMOND.EXE Directive File Syntax
Before diving into the details of the syntax of the directives file, here is an example of what the Excel directives file might look like, just to give you a chance to build up some intuition:

;*** EXCEL DIAMOND Directive file
;
.Set DiskLabel1=Setup					; Label of first disk
.Set DiskLabel2=Program      			; Label of second disk
.Set DiskLabel3="Program Continued"	; Label of third disk
.Set CabinetNameTemplate=EXCEL*.CAB 	; EXCEL1.CAB, EXCEL2.CAB, etc.
.Set MaxDiskSize=1.44M	       		; 3.5" disks

;** Setup.exe and setup.inf are placed uncompressed in the first disk
.Set Cabinet=off
.Set Compress=off
bin\setup.exe		       			; Just copy SETUP.EXE as is
bin\setup.inf		       			; Just copy SETUP.INF as is
;** The rest of the files are stored, compressed, in cabinet files
.Set Cabinet=on
.Set Compress=on
bin\excel.exe		       			; Big EXE, will span cabinets
bin\excel.hlp
bin\olecli.dll
bin\olesrv.dll
...

Here are some additional notes on the general syntax and behavior of Diamond Directive Files:
1. Diamond will place files on disks (and in cabinets) in the order they are specified in the directive file(s).
2. Whenever a filename or directory is called for, you may supply either a relative (e.g., foo\bar, ..\foo) or an absolute (e.g., c:\banana, x:\slm\src\bin) path.
3. Optimal compression is achieved when files with similar types of data are grouped together.
4. Diamond is controlled in large part by setting variables.  Diamond has many predefined variables, all of which have default values chosen to represent the most common case.  You can modify these variables, and you can define your own variables as well.
5. The value of a variable is retrieved by enclosing the variable name in percent (%) signs.  If the variable is not defined, an error is generated.  If you want an explicit percent sign, use two adjacent percent signs (%%).  Diamond will collapse this to a single percent sign (%).
6. Variable substitution is only done once.  For example, .Set A=One (A is "One"); .Set B=%%A%% (B is "%A%"); .Set C=%B% (C is "%A%", not "One").
7. Variable substitution is done before any other line parsing, so variables can be used anywhere.
8. Variable values may include blanks.  Quote (") or apostrophe (') marks may be used in .Set statements to capture blanks.  If you want an explicit quote (") or apostrophe ('), you can intermix these two marks (use one for bracketing so that you may specify the other), or, as with the percent sign above, you can specify two adjacent marks ("") and Diamond will collapse this to a single mark (").
9. All sizes are specified in bytes.
10. There are a few special values for common disk sizes (CDROM, DMF168, 1.44M, 1.2M, 720K, 360K) that can be used for any of the predefined Diamond variables that describe the attributes of a disk (MaxDiskSize, ClusterSize, MaxDiskFileCount).  Diamond has built-in knowledge about the correct values of these attributes for these common disk sizes.
11. Diamond does not check for 8.3 filename limitations directly, but rather depends upon the underlying operating system to do filename validity checking (this will allow Diamond to work with Long File Names, for example, on FAT, HPFS, NTFS, or OFS).
12. Diamond makes two passes over the directive file(s).  On the first pass, Diamond checks for syntax errors and makes sure that all of the files can be found.  This is very fast, and reduces the chance that the second pass, where the actual data compression occurs, will have any problems.  This is important because compression is very time consuming, so Diamond wants to avoid, for example, spending an hour compressing files only to find that a file toward the end of the directive file(s) cannot be found.
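
The substitution rules in notes 5 and 6 can be sketched in a few lines of Python.  This is a minimal illustration, not Diamond's parser: the substitute function is hypothetical, variable names are matched exactly as typed, and a lone unpaired percent sign is simply left alone, which may differ from what Diamond does.

```python
import re


def substitute(line, variables):
    """Expand %name% references once; collapse %% to a single %."""
    def repl(match):
        name = match.group(1)
        if name == "":              # "%%" is an escaped percent sign
            return "%"
        if name not in variables:   # note 5: undefined variables are errors
            raise KeyError("undefined variable: " + name)
        return variables[name]      # inserted as-is, never rescanned
    return re.sub(r"%([^%]*)%", repl, line)


variables = {"lang": "ENGLISH", "country": "USA"}
variables["A"] = substitute("One", variables)
variables["B"] = substitute("%%A%%", variables)
variables["C"] = substitute("%B%", variables)
print(variables["B"])   # %A%
print(variables["C"])   # %A%
```

Because the replacement text is never scanned again, C ends up as the literal text %A% and not One, which matches the example in note 6.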
4.2.1	Command Summary
The following table provides a summary of the Diamond Directive File syntax.  Directives begin with a period ("."), followed by a command name, and possibly by blank delimited arguments.  Note that a File Copy command is distinguished from a File Reference command by the setting of the GenerateInf variable.
Syntax                                                Description
;                                                     Comment (anywhere on a DDF line)
src [dest] [/inf=yes|no] [/unique=yes|no] [/x=y ...]  File Copy command
dest [/x=y ...]                                       File Reference command
.Define variable=[value]                              Define variable to be equal to value (see .Option Explicit)
.Delete variable                                      Delete a variable definition
.Dump                                                 Display all variable definitions
.InfBegin Disk | Cabinet | File                       Copy lines to specified INF file section
.InfEnd                                               End an .InfBegin section
.InfWrite string                                      Write "string" to file section of INF file
.InfWriteCabinet string                               Write "string" to cabinet section of INF file
.InfWriteDisk string                                  Write "string" to disk section of INF file
.New Disk | Cabinet | Folder                          Start a new Disk, Cabinet, or Folder
.Option Explicit                                      Require .Define first time for user-defined variables
.Set variable=[value]                                 Set variable to be equal to value
%variable%                                            Substitute value of variable
<blank line>                                          Blank lines are ignored
4.2.2	Variable Summary
Standard Variables                      Description
Cabinet=ON | OFF                        Turns Cabinet Mode on or off
CabinetFileCountThreshold=count         Threshold count of files per Cabinet
CabinetNamen=filename                   Cabinet file name for cabinet number n
CabinetNameTemplate=template            Cabinet file name template; * is replaced by Cabinet number
ClusterSize=bytesPerCluster             Cluster size on diskette (default is 512 bytes)
Compress=ON | OFF                       Turns compression on or off
CompressedFileExtensionChar=char        Last character of the file extension for compressed files
CompressionLevel=1 | 2 | ... | 7        Quantum compression level (1 least compression; 7 most)
CompressionMemory=10 | 11 | ... | 21    Quantum decompression memory requirement (2^n bytes can be used)
CompressionType=MSZIP | QUANTUM         Compression engine
DestinationDir=path                     Default path for destination files (stored in cabinet file)
DiskDirectoryn=directory                Output directory name for disk n
DiskDirectoryTemplate=template          Output directory name template; * is replaced by disk number
DiskLabeln=label                        Printed disk label name for disk n
DiskLabelTemplate=template              Printed disk label name template; * is replaced by disk number
DoNotCopyFiles=ON | OFF                 Controls whether files are actually copied (ACME ADMIN.INF)
FolderFileCountThreshold=count          Threshold count of files per Folder
FolderSizeThreshold=size                Threshold folder size for current folder
GenerateInf=ON | OFF                    Control Unified vs. Relational INF generation mode
InfXxx=string                           Set default value for INF Parameter Xxx
InfCabinetHeader[n]=string              INF cabinet section header text
InfCabinetLineFormat[n]=format string   INF cabinet section detail line format
InfCommentString=string                 INF comment string
InfDateFormat=yyyy-mm-dd | mm/dd/yy     INF date format
InfDiskHeader[n]=string                 INF disk section header text
InfDiskLineFormat[n]=format string      INF disk section detail line format
InfFileHeader[n]=string                 INF file section header text
InfFileLineFormat[n]=format string      INF file section detail line format
InfFileName=filename                    Name of INF file
InfFooter[n]=string                     INF footer text
InfHeader[n]=string                     INF header text
InfSectionOrder=[D | C | F]*            INF section order (disk, cabinet, file)
MaxCabinetSize=size                     Maximum cabinet file size for current cabinet
MaxDiskFileCount=count                  Maximum count of files per Disk
MaxDiskSize[n]=size                     Maximum disk size
MaxErrors=count                         Maximum errors allowed before pass 1 terminates
ReservePerCabinetSize=size              Base amount of space to reserve for FCRESERVE data
ReservePerDataBlockSize=size            Amount of space to reserve in each data block
ReservePerFolderSize=size               Amount of additional space in FCRESERVE for each folder
RptFileName=filename                    Name of RPT file
SourceDir=path                          Default path for source files
UniqueFiles=ON | OFF                    Control whether duplicate destination file names are allowed
4.2.3	InfDisk/Cabinet/FileLineFormat Syntax and Semantics
The InfDiskLineFormat, InfCabinetLineFormat, and InfFileLineFormat variables are used to control the formatting of the "detail" lines in the INF file.  The syntax of the values assigned to these variables is as follows:

1)	The "*" character is used to bracket replaceable parameters.
2)	Two "*" characters in a row ("**") are replaced by a single "*".
3)	A replaceable parameter name may be one of the standard ones defined by Diamond, or it may be a custom parameter.  The value used for a parameter is found in the following order:
	a)	If a parameter is specified on a File Copy or File Reference command, the specified value is used.
	b)	If a variable InfXxxx is defined for this parameter, its value is used.
	c)	Otherwise, the parameter must be a standard parameter, and its defined value is used.
4)	Braces "{}" may be used to indicate portions of text plus exactly one parameter that are omitted if the parameter value is blank.  For example, "{*id*,}*file*,*size*" will generate the following strings, depending upon the values of id, file, and size:

id       file      size   Output String
(blank)  foo.dat   23     foo.dat,23
17       foo.dat   23     17,foo.dat,23
17       (blank)   23     17,,23
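
The formatting rules above can be sketched in Python as follows.  This is an illustration under stated assumptions, not Diamond's code: the function names are hypothetical, brace groups are assumed not to nest, InfXxx variable names are matched without regard to case, and the "standard" values are made up for the example.

```python
import re

TOKEN = re.compile(r"\*\*|\*([^*]+)\*")   # "**" or "*name*"


def lookup(name, line_params, variables, standard):
    """Resolve one parameter using the a), b), c) order from rule 3."""
    if name in line_params:                 # a) value given on the command
        return line_params[name]
    inf_name = "inf" + name.lower()         # b) InfXxx default variable
    for var, value in variables.items():
        if var.lower() == inf_name:
            return value
    return standard[name]                   # c) standard parameter value


def expand(text, resolve):
    """Expand *name* tokens; return the text and the values that were used."""
    used = []

    def repl(match):
        if match.group(1) is None:          # "**" is a literal asterisk
            return "*"
        used.append(resolve(match.group(1)))
        return used[-1]

    return TOKEN.sub(repl, text), used


def format_line(fmt, resolve):
    """Build one detail line; a brace group vanishes if its parameter is blank."""
    pieces = []
    for part in re.split(r"(\{[^{}]*\})", fmt):
        if part.startswith("{") and part.endswith("}"):
            text, used = expand(part[1:-1], resolve)
            if used and used[0] == "":
                continue                    # drop the whole group
            pieces.append(text)
        else:
            pieces.append(expand(part, resolve)[0])
    return "".join(pieces)


def detail(line_params):
    standard = {"file": "foo.dat", "size": "23"}   # made-up standard values
    variables = {"InfId": ""}                       # default for custom "id"
    return format_line("{*id*,}*file*,*size*",
                       lambda n: lookup(n, line_params, variables, standard))


print(detail({}))                          # foo.dat,23
print(detail({"id": "17"}))                # 17,foo.dat,23
print(detail({"id": "17", "file": ""}))    # 17,,23
```

The three calls reproduce the three rows of the table: the "{*id*,}" group disappears only when id is blank, while a blank file outside any braces still leaves its separating commas in place.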
4.2.4	INF Parameters
The following table lists the standard parameters that may be specified in INF line formats and on File Copy and File Reference commands.  The Disk, Cab, and File columns indicate which parameters are supported in the InfDiskLineFormat, InfCabinetLineFormat, and InfFileLineFormat, respectively.  In addition, the File column also indicates which parameters may be specified on the File Copy and File Reference commands.

Parameter  Disk  Cab  File  Description
attr       -     -    Yes   File attributes (A=archive, R=read-only, H=hidden, S=system)
cab#       -     Yes  Yes   Cabinet number (0 means not in cabinet, 1 or higher is cabinet number)
cabfile    -     Yes  -     Cabinet file name
csum       -     -    Yes   Checksum
date       -     -    Yes   File date (mm/dd/yy or yyyy-mm-dd, depending upon InfDateFormat)
disk#      Yes   Yes  Yes   Disk number (1-based)
file       -     -    Yes   Destination file name in layout (in cabinet or on a disk)
file#      -     -    Yes   Destination file number in layout (first file is 1, second file is 2, ...); the order of File Copy Commands controls the file number, so in relational INF mode the order of File Reference Commands has no effect on the file number.
label      Yes   -    -     Disk user-readable label (value comes from DiskLabeln, if defined, and otherwise is constructed from DiskLabelTemplate).
lang       -     -    Yes   Language (i.e., VER.DLL info) in base 10, blank separated if multiple values
size       -     -    Yes   File size (only affects value written to INF file)
time       -     -    Yes   File time (hh:mm:ss[a|p])
ver        -     -    Yes   Binary File version (n.n.n.n base 10 format)
vers       -     -    Yes   String File version -- can be different from ver!

Just as custom INF parameters can be defined by using the .Define and .Set commands (e.g., .Set InfCustom=default value), the .Set command can also be used to override the values of the standard parameters.  This is most obviously useful for the date and time parameters, as it provides a simple way to "date stamp" all the files in a layout.  For the attr parameter, it provides a way to force a consistent set of file attributes (commonly used to clear the read-only and archive attribute bits).
4.2.5	Command Details
;
A comment line.  
A comment may appear anywhere in a directive file.  In addition, any line may include a comment at the end.  Any text on the line following the semicolon is ignored.
source [destination] [/INF= YES | NO] [/UNIQUE=YES | NO] [/x=y [/x=y ...]]
A File Copy Command; specifies a file to be placed onto a disk or cabinet.  If GenerateInf is OFF, then lines without leading periods are interpreted as File Copy Commands.

source is a file name, and may include a relative or absolute path specification.  The SourceDir variable is applied first, if specified.

destination is the name to store in the cabinet file (if Cabinet is On), or the name for the destination file (if Cabinet is Off).  The DestinationDir variable is used as a prefix.

/INF=YES | NO controls whether destination must be specified in a Reference command in the INF section of the DDF.  If YES is specified (the default), then destination must be specified in at least one Reference command.  If NO is specified, then destination does not have to be specified in any Reference command.  This parameter is used only if Relational INF mode is selected (see the GenerateInf variable), as Unified mode does not support Reference commands.

/UNIQUE=YES | NO controls whether destination must be unique throughout the layout.  Specifying this parameter on the file copy command overrides the default setting controlled by the UniqueFiles variable (which defaults to ON).  If Relational INF mode is selected (see the GenerateInf variable), then UniqueFiles must be ON.

/x=y permits standard and custom INF parameters to be applied to a file copy command.  These parameters are carried along with the file by Diamond and used to format file detail lines in the INF file.  In addition, the /Date, /Time, and /Attr parameters also control the values that are placed in the cabinet files or on the disk layout (for files outside of a cabinet).  This permits a great deal of flexibility in customizing the INF file format.  A parameter "x" is defined to have the value "y" (which may be empty).  Quotes can be used in "y" to include blanks or other special characters.  If a parameter "x" is also defined on a File Reference command, that setting overrides any setting for "x" specified on the referred-to File Copy command.  See "INF Parameters" (section 4.2.4) for a list of standard parameters.

NOTE:	You must define a variable InfX if you are going to use /X=y on a File Copy (or File Reference) command.  If no such variable is defined, then /X=y will generate an error.  This behavior ensures that there is a default value for every parameter, and makes it easier to catch inadvertent typing errors.

If the destination is not specified, its default value depends upon the Cabinet and Compress variables, as indicated by the following table, using BIN\EXCEL.EXE as a sample source file name.  Note that the variable CompressedFileExtensionChar controls the actual character used to indicate a compressed file.  Note also that the DestinationDir variable is prefixed to the destination name before it is stored in the cabinet file.


Cabinet   Compress   Default destination
OFF       OFF        EXCEL.EXE -- uncompressed, not in a cabinet.
OFF       ON         EXCEL.EX_ -- compressed, not in cabinet (actually, this is a cabinet with a single file!)2
ON        OFF        EXCEL.EXE -- uncompressed, in a cabinet.
ON        ON         EXCEL.EXE -- compressed, in a cabinet.
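
The default-name rule in the table can be sketched in Python.  The default_destination function is a hypothetical helper for illustration only; it ignores the DestinationDir prefix, and it leaves names without an extension unchanged because this section does not say what Diamond does in that case.

```python
import ntpath


def default_destination(source, cabinet, compress, ext_char="_"):
    """Return the default destination name for a File Copy command."""
    name = ntpath.basename(source)          # drop the source path
    root, ext = ntpath.splitext(name)
    if compress and not cabinet and len(ext) > 1:
        # Replace the last character of the extension (".EXE" -> ".EX_").
        name = root + ext[:-1] + ext_char
    return name


print(default_destination(r"BIN\EXCEL.EXE", cabinet=False, compress=False))  # EXCEL.EXE
print(default_destination(r"BIN\EXCEL.EXE", cabinet=False, compress=True))   # EXCEL.EX_
print(default_destination(r"BIN\EXCEL.EXE", cabinet=True, compress=True))    # EXCEL.EXE
```

The ext_char argument stands in for the CompressedFileExtensionChar variable; only the Cabinet=OFF, Compress=ON combination changes the name.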

Examples:
.Set Compress=OFF			; Turn off compression
.Set Cabinet=OFF			; No cabinet file
setup.exe /inf=no			; Setup is put on disk 1, won't be in INF
setup.inf				; Classic chicken & the egg problem

.Set Compress=ON			; Turn compression on
readme.txt				; Placed on disk 1 as README.TX_
.Set Cabinet=ON			; Turn cabinet file creation on
bin\excel.exe			; Placed in cabinet as EXCEL.EXE
msdraw.exe msapps\msdraw.exe  ; Placed in cabinet as MSAPPS\MSDRAW.EXE
a.txt dup.txt /unique=no	; Another dup.txt is allowed
b.txt dup.txt /unique=no	; And here it is
destination [/x=y [/x=y ...]]
A File Reference Command; specifies that information for a file (previously specified in a File Copy command) is to be written to the File section of the INF file.  This command is only supported in Relational INF mode. If GenerateInf is ON, then lines without leading periods are interpreted as File Reference Commands.

destination is the name of a file previously specified in a File Copy command as the destination in the layout (not the source!).  Therefore, UniqueFiles is required to be ON.

/x=y permits standard and custom INF parameters to be applied to a file reference command.  These parameters are merged with any parameters specified on the referenced File Copy command, with parameters on the File Reference command taking precedence.

A parameter "x" is defined to have the value "y" (which may be empty).  Quotes can be used in "y" to include blanks or other special characters.  See "INF Parameters" (section 4.2.4) for a list of standard parameters.

NOTE:	You must define a variable InfX if you are going to use /X=y on a File Reference (or File Copy) command.  If no such variable is defined, then /X=y will generate an error.  This behavior ensures that there is a default value for every parameter, and makes it easier to catch inadvertent typing errors.

Examples:
.Set GenerateInf=OFF		; Relational INF mode; file layout
setup.exe /inf=no			; Setup is put on disk 1, won't be in INF
readme.txt
shared.dll /special=yes		; Custom parameter

.Set GenerateInf=ON		; INF section of DDF
.InfWrite [Common]
readme.txt
.InfWrite [One]
shared.dll /special=no		; Override parm on file copy command
.InfWrite [Two]
shared.dll				; Use /special value from file copy
.Define variable=[value]
Define variable to be equal to value.

To use variable, surround it with percent signs (%) -- %variable%.
Using an undefined variable is an error, and will cause Diamond to stop before pass 2.
value may include references to other variables.
Leading and trailing blanks in value are discarded.
Blanks may be enclosed in quote (") or apostrophe (') marks.
Explicit percent signs (%), quotes ("), or apostrophes (') must be specified twice.

NOTE:	If .Option Explicit is specified, then you must first use .Define to define any user-defined variables before you can use .Set to modify them.  For standard Diamond variables, .Define is not permitted, and only .Set may be used.  If .Option Explicit is not specified, then .Define is equivalent to .Set.

Examples:
.Define lang=ENGLISH			; Set language
.Define country=USA			; Set country
.Define SourceDir=%lang%\%country%	; SourceDir = [ENGLISH\USA]
.Define join=%lang%%country%		; join = [ENGLISHUSA]
.Define success=100%%			; success = [100%]
.Define SourceDir=			; SourceDir = []
.Define contraction="don't"		; contraction = [don't]
.Define contraction=don''t		; contraction = [don't]
.Define someSpaces=  hi there		; someSpaces = [hi there]
.Define someMore="  blue dog  "	; someMore = [  blue dog  ]
.Delete variable
Delete a variable definition.

You may only delete variables that have been created by .Define or .Set commands.  Standard Diamond variables may not be deleted.

Examples:
.Set myVariable=raisin
.Delete myVariable		; Delete myVariable
.Dump
Display the entire Diamond variable table.

This command can be used to aid debugging of complicated (or not so complicated) Diamond directive files.  Note that the dump will be displayed during pass 1 and again during pass 2.

Examples:
.Dump					; Dump variable table to stdout
.InfBegin DISK | CABINET | FILE
Start a block of one or more lines to write to the specified area of the INF file.

The lines in the block will be copied unmodified to the specified section of the INF file, so no Diamond variable substitution will be performed.  Similarly, Diamond will not strip comments.

Use .InfWrite, .InfWriteCabinet, or .InfWriteDisk if you need variable substitution.

Examples:
.InfBegin disk			; Text for disk section of INF file
;This is a comment for the disk section.  Diamond will not process
;this line, so, for example, %var% will not be substituted.
.InfEnd
.InfEnd
Terminate an .InfBegin block.

Examples:
.InfEnd				; Close an .InfBegin block
.InfWrite  string
Write string to the file area of the INF file.

Note that lines will have Diamond comments removed and variable values substituted.  If you want to avoid this processing, use the .InfBegin File command.  Leading whitespace is normally removed, but you can override this by placing whitespace in quotes (see examples below).

Examples:
.InfWrite [A Section Header]	; Text for file section, this comment
					; 	will not appear.

.InfWrite ;<disk>,<file>	; Diamond strips off the comments, so this
					; 	command just writes a blank line!

.InfWrite ";<disk>,<file>"	; Get that comment in the INF file

.InfWrite "  "%someVar%		; Get leading space on the INF line
.InfWriteCabinet  string
Write string to the cabinet area of the INF file.

Note that lines will have Diamond comments removed and variable values substituted.  If you want to avoid this processing, use the .InfBegin Cabinet command.

Examples:
.InfWriteCabinet 40%% off your favorite furniture ; %% collapse down to
					; one %, because Diamond does variable
					; substitution on the string.
.InfWriteDisk  string
Write string to the disk area of the INF file.

Note that lines will have Diamond comments removed and variable values substituted.  If you want to avoid this processing, use the .InfBegin Disk command.

Examples:
.InfWriteDisk The Rain in Spain falls Mainly on the Plain
.New Disk | Cabinet | Folder
Force a disk, cabinet, or folder break.

This is used to complete the current disk, cabinet, or folder, and start a new one.

Examples:
.New Disk			; Start a new disk
.New Cabinet		; Start a new cabinet
.New Folder			; Start a new folder
.Set variable=value
Set variable to be equal to value.

To use variable, surround it with percent signs (%) -- %variable%.
Using an undefined variable is an error, and will cause Diamond to stop before pass 2.
value may include references to other variables.
value may be empty, in which case variable is set to the empty string.
Leading and trailing blanks in value are discarded.
Blanks may be enclosed in quote (") or apostrophe (') marks.
Explicit percent signs (%), quotes ("), or apostrophes (') must be specified twice.

NOTE:	If .Option Explicit is specified, then you must first use .Define to define any user-defined variables before you can use .Set to modify them.  For standard Diamond variables, .Define is not permitted, and only .Set may be used.

Examples:
.Set lang=ENGLISH				; Set language
.Set country=USA				; Set country
.Set SourceDir=%lang%\%country%	; SourceDir = [ENGLISH\USA]
.Set join=%lang%%country%		; join = [ENGLISHUSA]
.Set success=100%%			; success = [100%]
.Set SourceDir=				; SourceDir = []
.Set contraction="don't"		; contraction = [don't]
.Set contraction=don''t			; contraction = [don't]
.Set someSpaces=  hi there		; someSpaces = [hi there]
.Set someMore="  blue dog  "		; someMore = [  blue dog  ]
4.2.6	Variable Details
The standard Diamond variables are listed below.  These variables are predefined, and each has a default value, which is used until you set the variable, either from the command line (/D var=value) or with a .Set command in a directive file.
You can create your own variables as well, using the .Define command if you specify .Option Explicit, and the .Set command otherwise.
Cabinet=On | Off
Turns cabinet mode on or off.
Default:	.Set Cabinet=On		; Cabinet mode is ON

When cabinet mode is On, the following applies:
1)	Files are stored in a cabinet, whose name is taken from the CabinetNameTemplate variable
2)	If the compressed size of a file would cause the current Cabinet to exceed the current MaxCabinetSize variable, then as much of the compressed file as possible is stored in the current Cabinet, that Cabinet is closed, and a new Cabinet is created.  Note that it is possible for a large file to span multiple Cabinets!
3)	If the compressed size of a file (or set of files, if the files are small) would cause the current Folder to exceed the current MinFolderSize variable, these files are the last ones added to the current Folder, and a new Folder is started for any subsequent files.3  Note that if the current Folder cannot fit in the current Cabinet, as much as possible of the Folder is stored in the current Cabinet, and the remainder of the Folder is stored in the next Cabinet.  This means that it is possible for several files to be continued from one Cabinet file to the next Cabinet file!

When cabinet mode is Off, the following applies:
1)	Files are stored in individual files
2)	If the destination file is not supplied, the default name is controlled by the compression mode (see the Compress variable)

Examples
.Set Cabinet=OFF			; Files not in cabinets...
.Set Compress=OFF			; ...and no compression.
setup.exe				; Setup program is simply copied to disk.
.Set Cabinet=ON			; Use a cabinet...
.SET Compress=ON			; ...and compress remaining files.
CabinetFileCountThreshold=count
Sets a goal for the maximum number of files in a cabinet.
Default:	.Set CabinetFileCountThreshold=0	; Default is no threshold

count is a threshold for the number of files to store in a cabinet.  Once this count has been reached, Diamond will close the current cabinet as soon as possible.  Due to the blocking of files for compression purposes, it is possible that the cabinet will contain more files than specified by this variable.

If count is 0, then there is no limit on the number of files per cabinet.

Examples:
.Set CabinetFileCountThreshold=100	; Shoot for 100 files per cabinet
CabinetNamen=filename
The cabinet file name for the specified cabinet.
Default:			; By default none of these variables are defined

If this variable is not defined for a particular cabinet, then Diamond uses the CabinetNameTemplate to construct the cabinet name.

Examples:
.Set CabinetName1=one.cab
CabinetNameTemplate=template
Sets the cabinet file name template.
Default:	.Set CabinetNameTemplate=*.CAB	; 1.CAB, 2.CAB, ...

This template is used to construct the file name of each cabinet.  The "*" in this template is replaced by the cabinet number (1, 2, etc.). This variable is used only if no variable CabinetNamen exists for cabinet n.

NOTE:	Be sure that the expanded cabinet name does not exceed the limits for your file system!  For example, if you used "CABINET*.CAB", and Diamond had to create 10 or more cabinets, then you would have cabinet names like CABINET10.CAB, which is 9.3 and therefore an invalid name in the FAT file system.  Unfortunately, Diamond would not detect this until it had already created 9 cabinets!

Examples:
.Set CabinetNameTemplate=EXCEL*.DIA ; EXCEL1.DIA, EXCEL2.DIA, etc.
.Set CabinetNameTemplate=*.         ; 1, 2, 3, etc.
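The template expansion and the 8.3 pitfall described in the NOTE above can be sketched as follows.  This is an illustrative Python sketch, not Diamond's actual code: the function names are hypothetical, and it assumes every "*" in the template is replaced by the cabinet number.

```python
def expand_template(template: str, number: int) -> str:
    """Replace "*" in the template with the cabinet number (assumed rule)."""
    return template.replace("*", str(number))


def fits_8_3(name: str) -> bool:
    """True if name fits the FAT 8.3 limit: base <= 8 chars, extension <= 3."""
    base, _, ext = name.partition(".")
    return 1 <= len(base) <= 8 and len(ext) <= 3 and "." not in ext


def first_invalid_cabinet(template: str, cabinet_count: int):
    """Return the first cabinet number whose expanded name is not 8.3, or None."""
    for number in range(1, cabinet_count + 1):
        if not fits_8_3(expand_template(template, number)):
            return number
    return None
```

Running such a check before a build would flag a template like CABINET*.CAB at cabinet 10, rather than after nine cabinets have already been written.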
ClusterSize=bytesPerCluster
The cluster size of the distribution media.
Default:	.Set ClusterSize=512	; 1.44M and 1.2M floppies have 512-byte clusters

This is used by Diamond to round up the sizes of files and cabinets to a cluster boundary, so it can determine when to switch to the next disk.

You can use a standard disk size from the following list, and Diamond will supply the known cluster size for that disk size:

	1.68M (same as DMF168)
	1.44M
	1.25M (Japanese NEC 3.5" drive capacity)
	1.2M
	720K
	360K
	DMF168 (same as 1.68M)
	CDROM

Examples:
.Set ClusterSize=1.44M 			; Use known 1.44M floppy info
.Set ClusterSize=DMF168			; Use known 1.68M DMF floppy info
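The rounding to a cluster boundary described above can be sketched in Python as follows.  The function name is hypothetical and the sketch only illustrates the arithmetic; it is not Diamond's implementation.

```python
def round_up_to_cluster(size_bytes: int, cluster_size: int = 512) -> int:
    """Round a file or cabinet size up to the next cluster boundary."""
    if size_bytes < 0 or cluster_size <= 0:
        raise ValueError("size must be >= 0 and cluster size must be > 0")
    # Ceiling division, then scale back up to bytes.
    clusters = (size_bytes + cluster_size - 1) // cluster_size
    return clusters * cluster_size
```

With the default 512-byte cluster, a 513-byte file occupies two clusters (1024 bytes) of disk space for layout purposes.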
Compress=ON | OFF
Turn file compression on or off.
Default:	.Set Compress=On	; Compression is on

While compression is usually on, you generally turn it off for the first few files on disk 1 (SETUP.EXE, for example).  This applies regardless of the Cabinet setting, so it is valid to store one or more uncompressed files in a Cabinet File.

Examples:
.Set Cabinet=OFF			; Files not in cabinets...
.Set Compress=OFF			; ...and no compression.
setup.exe				; Setup program is simply copied to disk.
.Set Cabinet=ON			; Use a cabinet...
.SET Compress=ON			; ...and compress remaining files.
CompressedFileExtensionChar=char
Last character in file name used when compressing an individual file.
Default:	.Set CompressedFileExtensionChar=_	; Default is an underscore ("_")

If Cabinet=OFF and Compress=ON, then Diamond will compress each file individually.  Although the compressed file is stored in a Cabinet File, that cabinet contains only a single file.  To maintain some consistency with existing setup compression products, the default compressed file name is constructed by taking the source file name and replacing the last character of the file extension with the setting of this variable.  If the extension has fewer than three characters, the character is appended instead, as the examples below show.

Examples:
.Set CompressedFileExtensionChar=$	; SAMPLE.EXE => SAMPLE.EX$
						; SAMPLE.EX  => SAMPLE.EX$
						; SAMPLE.E   => SAMPLE.E$
						; SAMPLE.    => SAMPLE.$
						; SAMPLE     => SAMPLE.$
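The naming rule implied by the five examples above can be sketched in Python.  The function name is hypothetical, and the rule (replace the last character of a three-character extension, otherwise append) is inferred from the examples rather than taken from Diamond's source; the sketch also assumes a bare file name with no directory part.

```python
def compressed_name(source_name: str, ext_char: str = "_") -> str:
    """Derive the default compressed file name for a bare file name."""
    base, dot, ext = source_name.rpartition(".")
    if not dot:
        # No extension at all: SAMPLE -> SAMPLE.$
        return source_name + "." + ext_char
    if len(ext) >= 3:
        # Full extension: replace its last character (SAMPLE.EXE -> SAMPLE.EX$)
        return base + "." + ext[:-1] + ext_char
    # Short or empty extension: append the character (SAMPLE.E -> SAMPLE.E$)
    return base + "." + ext + ext_char
```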
CompressionLevel=1 | 2 | ... | 7
Selects the Quantum compression level.
Default:	.Set CompressionLevel=2		; Default is level 2

The lowest setting is 1, and Quantum yields the least compression and runs the most quickly (for a given value of CompressionMemory).  The highest setting is 7, and Quantum yields the highest compression, but runs for quite a long time.

See the table in the CompressionType variable definition for a comparison of various settings.

This variable is ignored if Compress is OFF or if CompressionType is not set to Quantum.

Examples:
.Set CompressionLevel=7		; Set maximum compression level
CompressionMemory=10 | 11 | ... | 21 | 1024 | ... | 2097152
Selects the amount of memory required by Quantum at decompress time.
Default:	.Set CompressionMemory=18	; Default is 18 (256K)

The value can be specified as either a power of two exponent, or as a byte count.  If a number in the range 10 to 21 is specified, Diamond raises 2 to this power, yielding memory sizes in the range 1024 (1Kb) to 2,097,152 (2Mb) bytes.  If the number is in the range 1024 to 2097152, it is treated as a byte count, and is rounded up to a power of 2 (if it isn't already).

See the table in the CompressionType variable definition for a comparison of various settings.

This variable is ignored if Compress is OFF or if CompressionType is not set to Quantum.

Examples:
.Set CompressionMemory=21	; Set maximum compression memory
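The two ways of specifying CompressionMemory can be sketched in Python as follows.  The function name is hypothetical, and rejecting values outside both ranges with an error is an assumption; the text above does not say how Diamond reacts to them.

```python
def compression_memory_bytes(value: int) -> int:
    """Interpret a CompressionMemory setting and return a byte count."""
    if 10 <= value <= 21:
        # Power-of-two exponent: 10 -> 1024 bytes, 21 -> 2097152 bytes.
        return 1 << value
    if 1024 <= value <= 2097152:
        # Byte count: round up to the next power of two if necessary.
        size = 1024
        while size < value:
            size <<= 1
        return size
    raise ValueError("CompressionMemory out of range: %d" % value)
```

For instance, the default setting of 18 corresponds to 262144 bytes (256K), and a byte count of 300000 rounds up to 524288.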
CompressionType=MSZIP | QUANTUM
Select compression engine.
Default:	.Set CompressionType=MSZIP	; Default is MSZIP compressor

Diamond supports two different compression engines: Quantum and MSZIP.

Quantum is the more flexible of the two, and can achieve compressed file sizes 10% to 15% smaller than MSZIP.  The CompressionLevel and CompressionMemory variables control how much compression is achieved and how much memory is required at decompression time.  See those variables for more details.  In general, Quantum is the compression engine of choice for most setup applications, since the Cost Of Goods Sold (COGS) savings can be substantial for any product that currently ships on more than six disks!

MSZIP is a PKZIP-compatible compression engine, achieving compressed file sizes almost identical to PKZIP v2.04g with the -ex switch.  MSZIP is generally faster at compressing than Quantum, but yields larger compressed file sizes.  MSZIP may be appropriate for applications where compression time is paramount.

A table comparing Quantum and MSZIP performance appears elsewhere in this document.

Examples:
.Set CompressionType=Quantum	; Quantum compressor
DestinationDir=path
Path prefix to store in cabinet file for each file in the cabinet.
Default:	.Set DestinationDir=		; Default is no path prefix

path is concatenated with a path separator ("\") and the target file name on File Copy Commands to produce the file name that is stored in the cabinet file.  EXTRACT.EXE will use this file name as the default name when the file is extracted.

Examples:
.Set DestinationDir=SYSTEM	; Following files get SYSTEM prefix
bin\ARIAL.TTF			; Name in cabinet is SYSTEM\ARIAL.TTF
.Set DestinationDir=		; No prefix
bin\ARIAL.TTF			; Name in cabinet is ARIAL.TTF
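The way the stored name is built can be sketched in Python.  The function name is hypothetical; the sketch assumes, as the examples above suggest, that when no target name is given the target defaults to the source file name without its directory.

```python
def name_in_cabinet(source: str, destination_dir: str = "", target=None) -> str:
    """Build the file name stored in the cabinet for one File Copy Command."""
    if target is None:
        # Assumed default: the source file name with its directory stripped.
        target = source.rsplit("\\", 1)[-1]
    if destination_dir:
        return destination_dir + "\\" + target
    return target
```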
DiskDirectoryn=directory
The output directory name for the specified disk.
Default:			; By default none of these variables are defined

If this variable is not defined for a particular disk, then Diamond uses the DiskDirectoryTemplate to construct the disk directory.

Examples:
.Set DiskDirectory1=disk.one
DiskDirectoryTemplate=template
Set the output directory name template.  One directory is created for each disk of the layout.
Default:	.Set DiskDirectoryTemplate=DISK* ; Default is DISK1, DISK2, etc.

As Diamond processes a directive file, it will create one or more disk "images".  Rather than using some specific disk format, however, Diamond simply creates one subdirectory for each disk and places the files for each disk in the appropriate directory.  If a "*" exists in this variable, then it is replaced with the disk number.  If no "*" is specified, then all files are placed in the single directory specified by this variable.

This variable is used only if no variable DiskDirectoryn exists for disk n.

Examples:
.Set DiskDirectoryTemplate=C:\EXCEL6\DISK*  ; Put files in separate dirs
.Set DiskDirectoryTemplate=C:\EXCEL6	; Put all files in C:\EXCEL6
.Set DiskDirectoryTemplate=			; Put all files in current dir
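The precedence between DiskDirectoryn and DiskDirectoryTemplate can be sketched in Python.  The function name and the dictionary of per-disk overrides are hypothetical stand-ins for Diamond's variable table.

```python
def disk_directory(disk_number: int, template: str = "DISK*", overrides=None) -> str:
    """Pick the output directory for a disk.

    overrides maps a disk number to its DiskDirectoryn value, if one is defined.
    """
    overrides = overrides or {}
    if disk_number in overrides:
        # A specific DiskDirectoryn variable wins over the template.
        return overrides[disk_number]
    # Otherwise expand the template; with no "*" every disk shares one directory.
    return template.replace("*", str(disk_number))
```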
DiskLabeln=label
The user-readable text string for the specified disk.
Default:			; By default none of these variables are defined

This label is stored in cabinet files that contain files that are split across disk boundaries, to simplify prompting for the appropriate disk to insert into the drive.  For example, if EXCEL.EXE started in 1.CAB and finished in 2.CAB, and a user asked to extract EXCEL.EXE from 2.CAB, EXTRACT.EXE can retrieve the printed label for the disk containing 1.CAB (say, Excel Program Disk 1) and tell the user to insert that disk and try again.

If this variable is not defined for a particular disk, then Diamond uses the DiskLabelTemplate to construct the disk label.

Examples:
.Set DiskLabel1="Excel Setup Disk 1"
.Set DiskLabel2="Excel Setup Disk 2"
DiskLabelTemplate=template
Set the printed disk label.  Used if individual DiskLabeln variables are not defined.
Default:	.Set DiskLabelTemplate="Disk *" ; Default is "Disk 1", "Disk 2", etc.

Sets the default user-readable disk label.  If a "*" exists in this variable, then it is replaced with the disk number.  This variable is used only if no variable DiskLabeln exists for disk n.

Examples:
.Set DiskLabelTemplate="Excel Disk *"
DoNotCopyFiles=On | Off
Controls whether File Copy Commands actually copy files.
Default:	.Set DoNotCopyFiles=Off		; Files are copied

This option is intended to be used when Cabinet is OFF and Compress is OFF, as a means of generating an INF file very quickly.  It has no effect when Cabinet is ON or Compress is ON.

Examples:
.Set DoNotCopyFiles=ON		; Make Diamond create the INF file quickly
FolderFileCountThreshold=count
Set the threshold on the number of files to store in a folder.
Default:	.Set FolderFileCountThreshold=0	; Default to no limit on count of files in a folder

Sets the threshold file count for the current folder.  When this threshold is exceeded, then the current folder is closed.  If any more files are to be processed, they will go into a new folder.

If Cabinet is OFF, this variable is ignored.

If count is 0, then there is no limit on the count of files in a folder.

Examples:
.Set FolderFileCountThreshold=50	; No more than 50 files per folder
FolderSizeThreshold=size
Set the threshold size for the current folder.
Default:	.Set FolderSizeThreshold=0	; Default to the maximum cabinet size

Sets the threshold size for the current folder.  When this threshold is exceeded, then the current folder is closed.  If any more files are to be processed, they will go into a new folder.  Diamond attempts to limit folders to the size specified by this variable, but in most cases folders will be a bit larger than this threshold.

If Cabinet is OFF, this variable is ignored.

If size is 0, then the threshold is the same as the maximum cabinet size.

Folders are compression/encryption boundaries.  The state of the compressor and cryptosystem are reset at folder boundaries.  To access a file in a folder, the folder must be decrypted and decompressed starting from the front of the folder and continuing through to the desired file.  Thus, smaller folder thresholds are appropriate for a layout where a small number of files needs to be randomly accessed quickly from a cabinet.  On the other hand, larger folder thresholds permit the compressor to examine more data, and so generally yield better compression results.  For a layout where the files will be accessed sequentially and most of the files will be accessed, a larger folder threshold is best.

Examples:
.Set FolderSizeThreshold=1M	; Aim for 1Mb folders
GenerateInf=ON | OFF
Controls Unified vs. Relational INF generation mode.
Default:    .Set GenerateInf=ON     ; Default to "unified" INF mode

If GenerateInf is ON when the first file copy command is encountered, then Unified INF mode is selected.  In this mode, file detail lines are written to the INF file as file copy commands are processed, so the order of file lines in the INF is exactly the same as the order of the files in the layout.

If GenerateInf is OFF when the first file copy command is encountered, then Relational INF mode is selected.  In this mode, file copy commands are processed, but INF file generation is delayed until GenerateInf is set to ON, and File Reference commands are used to select information on files in the layout to be placed in the INF file.

Unified mode is easier to use, since each file is specified only once, and is most appropriate for quick usage of Diamond.

Relational mode is more complicated, since each file must be specified (at least) twice, but it provides very fine control of both the disk layout and the format of the INF file.  In particular, some INF files want to have sections to list the files associated with a certain feature, there may be many such sections, and some files may be required in more than one section.  Unified mode does not provide any method to generate such an INF file, but Relational mode does via the File Reference command.  

By separating the disk layout order from the INF file order, Diamond permits optimization of the file layout for compression vs. access time.  The layout section of the DDF contains file copy commands that control precisely where files are in the layout.  The INF section of the DDF contains INF formatting information, including File Reference commands to pull in information about specific files from earlier File Copy commands in the layout section.

Notes:
(1)	Once GenerateInf is set to ON and at least one File Copy command has been processed, GenerateInf may not be set to OFF (i.e., in Relational Mode, all File Copy commands must be processed before any File Reference commands)

Examples:
;** Layout section - File Copy commands
.Set GenerateInf=OFF
foo.exe
bar.exe other.exe
foo.exe foo1.exe
....

;** INF section -- File Reference commands
.Set GenerateInf=ON
.WriteInf "[a section]"
foo.exe
other.exe
foo1.exe /rename=sys\foo.exe	; pass custom parameter
....
InfXxx=string
Sets the default value for an INF parameter.
Default:	[Not applicable]

Variables of this form (other than the standard ones in this list) can be used for two purposes:
a)	To override the usual value of a standard INF parameter (like date, time, attr, etc.) for all the files (or a set of files) in the layout.
b)	To define a custom INF parameter, and specify its default value.

Notes:
(1)	When in Relational INF mode, only the last value for a particular InfXxx variable will be carried over from the layout section to the INF section of the DDF.  In the following example:
	;** Layout section - File Copy commands
	.Set GenerateInf=OFF	; Select Relational INF
	.Set InfCustom=apple
	file.1
	.Set InfCustom=pear
	file.2
	;** INF section - File Reference commands
	.Set GenerateInf=ON
	file.1			; *custom* value is "pear", not "apple"!
	file.2

Examples:
.Set InfDate=05/02/94		; Date stamp all files
.Set InfTime=06:00:00a		; Time stamp all files
.Set InfAttr=			; Turn off all attributes (esp. read-only)
.Set InfCustom=yes		; Define custom INF parameter
InfCabinetHeader[n]=string
Sets the header text for the cabinet section of the INF file.
Default:	.Set InfCabinetHeader="[cabinet list]"

This string is written to the INF prior to any cabinet detail lines. Diamond will also use any variables of the form InfCabinetHeadern where n is an integer with no leading zeros (0).  These additional lines will be printed out in increasing order after the InfCabinetHeader line.  Any .InfBegin Cabinet/.InfEnd lines will be printed as they are encountered, but in any event after all of these header lines.

Examples:
.Set InfCabinetHeader=";Lots o' cabinets"

.Set InfCabinetHeader=		; No cabinet header

.Set InfCabinetHeader=";Line 1 of cabinets"
.Set InfCabinetHeader1=";Line 2 of cabinets"
.Set InfCabinetHeader2=";Line 3 of cabinets"
InfCabinetLineFormat[n]=format string
Sets the detail line format for the cabinet section of the INF file.
Default:	.Set InfCabinetLineFormat=*cab#*,*disk#*,*cabfile*

This format is used to generate a line in the "cabinet" section of the INF.  If a numeric suffix n is specified in the variable name, then the specified format is used for cabinet number n.  If no such cabinet number-specific format is defined, then the value of the InfCabinetLineFormat variable is used.
The format string syntax and the list of allowed parameter names are described elsewhere in this document.
InfCommentString=string
Sets the line comment string for the INF file.
Default:	.Set InfCommentString=";"

This is the string Diamond will use to prefix comment lines that it generates in the INF (the autogenerated Diamond version/date/time lines, for example).
InfDateFormat=YYYY-MM-DD | MM/DD/YY
Sets the date format used for dates written to the INF file.
Default:	.Set InfDateFormat=MM/DD/YY	; Default to normal US convention

This format is used to format the date parameter for the InfFileLineFormat used to write file detail lines to the INF file.

Examples:
.Set InfDateFormat=YYYY-MM-DD		; Use the preferred ACME format
InfDiskHeader[n]=string
Sets the header text for the disk section of the INF file.
Default:	.Set InfDiskHeader="[disk list]"

This string is written to the INF prior to any disk detail lines. Diamond will also use any variables of the form InfDiskHeadern where n is an integer with no leading zeros (0).  These additional lines will be printed out in increasing order after the InfDiskHeader line.  Any .InfBegin Disk/.InfEnd lines will be printed as they are encountered, but in any event after all of these header lines.

Examples:
.Set InfDiskHeader=";Lots o' Disks"

.Set InfDiskHeader=		; No Disk header

.Set InfDiskHeader=";Line 1 of Disks"
.Set InfDiskHeader1=";Line 2 of Disks"
.Set InfDiskHeader2=";Line 3 of Disks"
InfDiskLineFormat[n]=format string
Sets the detail line format for the disk section of the INF file.
Default:	.Set InfDiskLineFormat=*disk#*,*label*

This format is used to generate a line in the "disks" section of the INF.  If a numeric suffix n is specified in the variable name, then the specified format is used for disk number n.  If no such disk number-specific format is defined, then the value of the InfDiskLineFormat variable is used.
The format string syntax and the list of allowed parameter names are described elsewhere in this document.
InfFileHeader[n]=string
Sets the header text for the file section of the INF file.
Default:	.Set InfFileHeader="[file list]"

This string is written to the INF prior to any file detail lines. Diamond will also use any variables of the form InfFileHeadern where n is an integer with no leading zeros (0).  These additional lines will be printed out in increasing order after the InfFileHeader line.  Any .InfBegin File/.InfEnd lines will be printed as they are encountered, but in any event after all of these header lines.
InfFileLineFormat[n]=format string
Sets the detail line format for the file section of the INF file.
Default:	.Set InfFileLineFormat=*disk#*,*cab#*,*file*,*size*

This format is used to generate a line in the "file" section of the INF. If a numeric suffix n is specified in the variable name, then the specified format is used for file number n (file numbers start at 1, and are based on the File Copy Commands, not the File Reference Commands).  If no such file number-specific format is defined, then the value of the InfFileLineFormat variable is used.
The format string syntax and the list of allowed parameter names are described elsewhere in this document.
InfFileName=filename
Sets the name of the INF output file.
Default:	.Set InfFileName=SETUP.INF	; Default file name is SETUP.INF

Defines the file name for the INF file.  This file has disk, cabinet, and file information that is intended for use by a setup program during the setup process.

Examples:
.Set InfFileName=EXCEL.INF
InfFooter[n]=string
Sets the footer text for the end of the INF file.
Default:	// Run Diamond and use the .Dump command to see the default footer

These strings are written to the INF file after all other information.  To disable this footer text, set InfFooter to the empty string (.Set InfFooter=).  Diamond will also use any variables of the form InfFootern where n is an integer with no leading zeros (0).  These additional lines will be printed out in increasing order after the InfFooter line, starting with InfFooter1.
The following special strings may be specified in InfFooter[n] values (note that the two percent signs are required, so that Diamond does not interpret these as variable references):

String	Description
%%1	The comment string -- each InfFooter[n] line should probably start with %%1.
%%2	The date and time Diamond was run to produce the INF file.
%%3	The version of Diamond used to produce the INF file.

Examples:
.Set InfFooter=			; Disable INF footer text
.Set InfFooter="%%1 %%2 %%3"	; Short footer
.Set InfFooter="%%1*****"	; Long footer
.Set InfFooter1="%%1* %%2"	; Long footer continued
.Set InfFooter2="%%1* %%3"	; Long footer continued
.Set InfFooter3="%%1*****"	; Long footer continued
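The substitution of the special strings in InfHeader[n] and InfFooter[n] values can be sketched in Python.  The function name is hypothetical, and the date/time and version values passed in are placeholders; the actual text Diamond substitutes for %%2 and %%3 is not specified here.

```python
def expand_inf_text(line: str, comment: str, run_stamp: str, version: str) -> str:
    """Replace the %%1, %%2 and %%3 markers in one header or footer line."""
    return (line.replace("%%1", comment)      # comment string (InfCommentString)
                .replace("%%2", run_stamp)    # date and time Diamond was run
                .replace("%%3", version))     # Diamond version
```

For example, with a comment string of ";", the line "%%1*****" expands to ";*****", which an INF reader will treat as a comment.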
InfHeader[n]=string
Sets the header text for the beginning of the INF file.
Default:	// Run Diamond and use the .Dump command to see the default header.

These strings are written to the INF file prior to any other information.  To disable this header text, set InfHeader to the empty string (.Set InfHeader=).  Diamond will also use any variables of the form InfHeadern where n is an integer with no leading zeros (0).  These additional lines will be printed out in increasing order after the InfHeader line, starting with InfHeader1.
The following special strings may be specified in InfHeader[n] values (note that the two percent signs are required, so that Diamond does not interpret these as variable references):

String	Description
%%1	The comment string -- each InfHeader[n] line should probably start with %%1.
%%2	The date and time Diamond was run to produce the INF file.
%%3	The version of Diamond used to produce the INF file.

Examples:
.Set InfHeader=			; Disable INF header text
.Set InfHeader="%%1 %%2 %%3"	; Short header
.Set InfHeader="%%1*****"	; Long header
.Set InfHeader1="%%1* %%2"	; Long header continued
.Set InfHeader2="%%1* %%3"	; Long header continued
.Set InfHeader3="%%1*****"	; Long header continued
InfSectionOrder=[D | C | F]*
Set the generation and relative order of the Disk, Cabinet, and File sections in the INF file.
Default:	.Set InfSectionOrder=DCF  ; Disk, then Cabinet, and then File

This variable controls what sections of the INF file are generated, and the order in which they appear.  Each of the letters "C" (cabinet), "D" (disk), and "F" (file) may be used at most once.  Any or all of these letters may be omitted, and the corresponding section of the INF file will not be generated.

Examples:
.Set InfSectionOrder=DF	; Disks, then files, omit the cabinet section
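The rules for InfSectionOrder can be sketched as a small Python validator.  The function name is hypothetical, and accepting lower-case letters and raising an error on invalid input are assumptions made for illustration.

```python
def parse_section_order(value: str) -> list:
    """Turn an InfSectionOrder value into an ordered list of section names."""
    names = {"D": "disk", "C": "cabinet", "F": "file"}
    order = []
    for letter in value.upper():  # case-insensitivity is an assumption
        if letter not in names:
            raise ValueError("unknown section letter: " + letter)
        if names[letter] in order:
            raise ValueError("section letter used more than once: " + letter)
        order.append(names[letter])
    return order
```

An empty value yields an empty list, which corresponds to generating none of the three sections.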
MaxCabinetSize=size
Set the maximum size for the current cabinet.
Default:	.Set MaxCabinetSize=0		; No limit, except MaxDiskSize

size is the maximum size for the current cabinet. If Cabinet is ON when this maximum is exceeded, then the current folder being processed will be split between the current cabinet and the next cabinet.  If Cabinet is OFF, then this variable is ignored.

Note that MaxDiskSize (or MaxDiskSizen, if specified) takes precedence over this variable.  Diamond never splits a cabinet file across a disk boundary, so a cabinet file will be no larger than the amount of free space available on the disk at the time the cabinet is created, even if this size is less than MaxCabinetSize.

If size is 0, then the cabinet size is limited only by the disk size (MaxDiskSize or MaxDiskSizen).

Examples:
.Set MaxCabinetSize=0		; Use disk size as limit
MaxDiskFileCount=count
Sets the maximum number of files that can be stored on a disk.
Default:	.Set MaxDiskFileCount=0	; Default is no limit

count is the maximum number of files to store on a disk.  Once this count has been reached, Diamond will close the current disk, even if space remains on the disk.  This variable is most useful when cabinet files are not being used (say, to simulate the old style setup where each file is individually compressed), and Diamond needs to understand the limit of the number of files that can be stored in the root directory of a floppy.

If count is 0, then there is no limit on the number of files per disk.

You can use a standard disk size from the following list, and Diamond will supply the known FAT root directory limits for that disk size:

	1.68M (same as DMF168)
	1.44M
	1.25M (Japanese NEC 3.5" drive capacity)
	1.2M
	720K
	360K
	DMF168 (same as 1.68M)
	CDROM

The file count does not include any files inside cabinets.  Each cabinet counts as a single file for purposes of this count.

Examples:
.Set MaxDiskFileCount=256	; Limit of 256 files per disk
.Set MaxDiskFileCount=1.44M	; Use limit for 1.44M FAT floppy disk
MaxDiskSize[n]=size
Set the maximum default size for a disk.
Default:	.Set MaxDiskSize=1.44M		; Default is 1.44M floppy

size is the maximum default size for a disk.  This variable is used only for disks for which a variable MaxDiskSizen is not defined.

If Cabinet is OFF, and the next file to be laid out cannot fit on the current disk, then Diamond will move to the next disk.  If Cabinet is ON, then the current cabinet will use as much space on the current disk as possible.

If size is 0, then the disk size is unlimited.

You can use a standard disk size from the following list, and Diamond will use the correct disk size, down to the byte:

	1.68M (same as DMF168)
	1.44M
	1.25M (Japanese NEC 3.5" drive capacity)
	1.2M
	720K
	360K
	DMF168 (same as 1.68M)
	CDROM

Examples:
.Set MaxDiskSize=0		; No limit
.Set MaxDiskSize=CDROM		; All files are being placed on a CD-ROM

.Set MaxDiskSize1=720K		; First disk is 720K
.Set MaxDiskSize=1.44M		; ... rest are 1.44M
MaxErrors=count
Set the maximum number of errors allowed before pass 1 terminates.
Default:	.Set MaxErrors=20		; Default is 20 errors

count is the maximum number of errors to permit before terminating pass 1.

If count is 0, then an unlimited number of errors is allowed.

Examples:
.Set MaxErrors=0		; No limit
.Set MaxErrors=5		; Limit to just a few
ReservePerCabinetSize=size
Sets a fixed size to reserve in a cabinet for the FCRESERVE structure.
Default:	.Set ReservePerCabinetSize=0	; Default is to reserve no space

size is the amount of space to reserve in a cabinet for the FCRESERVE structure.  The total size of the FCRESERVE structure is the value of this variable plus the number of folders in the cabinet times the value of the ReservePerFolderSize variable.

size must be a multiple of 4 (to ensure memory alignment on certain systems).

A common use for this variable is to reserve space to store per-cabinet cryptosystem information, in the case where the cabinet is encrypted.  For example, some sort of checksum value might be stored here to permit validation that the key being used to decrypt the cabinet is actually the one that was used to encrypt the cabinet.

Diamond fills this reserved section with zeros.

Examples:
.Set ReservePerCabinetSize=8	; For use as a cryptosystem key checksum
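The total FCRESERVE size described above can be sketched in Python.  The function name is hypothetical, and raising an error for sizes that are not a multiple of 4 is an assumption about how such input would be handled.

```python
def fcreserve_size(per_cabinet: int, per_folder: int, folder_count: int) -> int:
    """Total FCRESERVE size: ReservePerCabinetSize + folders * ReservePerFolderSize."""
    for name, size in (("ReservePerCabinetSize", per_cabinet),
                       ("ReservePerFolderSize", per_folder)):
        if size < 0 or size % 4 != 0:
            raise ValueError(name + " must be a non-negative multiple of 4")
    return per_cabinet + folder_count * per_folder
```

For instance, an 8-byte per-cabinet reserve plus 8 bytes for each of 3 folders gives a 32-byte FCRESERVE structure.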
ReservePerDataBlockSize=size
Sets the amount of space to reserve in each Data Block header.
Default:	.Set ReservePerDataBlockSize=0	; Default is to reserve no space

size is the amount of space to reserve in each Data Block header.  This space is located after the standard Data Block header and before the data for the data block.

size must be a multiple of 4 (to ensure memory alignment on certain systems).

One possible use for this variable is to reserve space to store per-data block cryptosystem information, in the case where the cabinet is encrypted.

Diamond fills this reserved section with zeros.

Examples:
.Set ReservePerDataBlockSize=4	; Reserve 4 bytes per data block
ReservePerFolderSize=size
Sets the amount of additional space to reserve in the FCRESERVE structure for each folder in the cabinet.
Default:	.Set ReservePerFolderSize=0	; Default is to reserve no space

size is the amount of space to reserve in the FCRESERVE structure for each folder in the cabinet.  The total size of the FCRESERVE structure is the value of this variable times the value of the number of folders in the cabinet, plus the value of the ReservePerCabinetSize variable.

size must be a multiple of 4 (to ensure memory alignment on certain systems).

A common use for this variable is to reserve space to store a per-folder cryptosystem key, in the case where the cabinet is encrypted.

Diamond fills this reserved section with zeros.

Examples:
.Set ReservePerFolderSize=8	; Size of an RC4 cryptosystem key
RptFileName=filename
Sets the name of the RPT output file.
Default:	.Set RptFileName=SETUP.RPT	; Default file name is SETUP.RPT

Defines the file name for the RPT file.  This file has summary information on the Diamond run.

Examples:
.Set RptFileName=EXCEL.RPT
SourceDir=path
The default path used to locate source files specified in File Copy Commands.
Default:	.Set SourceDir=		; Default is to look in the current directory

path  is concatenated with a path separator (&#65533;\&#65533;) and the source file name on the File Copy Command to produce the file name used to find the source file.

If path is empty, then the source file name specified on the File Copy Command is not modified.

Examples:
.Set SourceDir=C:\PROJECT  	; Find all source files in c:\project
UniqueFiles=ON | OFF
Controls whether destination file names in a layout must be unique.
Default:	.Set UniqueFiles="ON"	; File names must be unique

If UniqueFiles is ON, Diamond checks that all destination file names (names stored on disks or in cabinets) are unique, and generates an error (during pass 1) if they are not.  ON is the default, since using the same filename twice usually means that the same file was accidentally included twice, and this would be a waste of disk space.

If UniqueFiles is OFF, Diamond permits duplicate destination file names.

The /UNIQUE parameter may be specified on individual File Copy commands to override the value of UniqueFiles.

If the GenerateInf variable is used to select Relational INF generation, then UniqueFiles must always be ON, since Diamond uses the destination filename as the unique key to link File Reference commands back to File Copy commands.
5	EXTRACT.EXE
Extract supports command-line extraction of files and copying of single files.  The latter feature permits copying of cabinet (and other) files off of DMF disks on operating systems (like Windows 3.x) that do not support reading DMF disks directly.

Extract does not support any other compression system (i.e., old apps or systems compression).


extract [/y] [/A] [/D | /E] [/L location] [/R] cabinet_file [file_spec ...]
extract [/y] compressed_file [destination_file]
extract [/y] /C source destination

Switches:
/A	Process all files in a cabinet set, starting with the cabinet_file.
/C	Copy source file to destination file or directory.
/D	Only produce a directory listing (do not extract).
/E	Force extraction.
/L	Use the directory specified by location, instead of the current directory, as the default location to place extracted files.
/R	Show RESERVED sections of cabinet file(s).  NOTE: This is undocumented in the command-line help!
/Y	Overwrite destination without prompting.  The default is to prompt if the destination file already exists, and allow the customer to: a) overwrite the file, b) skip the file, c) overwrite this file and all subsequent files that may already exist, or d) exit.

Parameters:
compressed_file	This is a cabinet file that contains a single file (example, FOO.EX_ containing FOO.EXE).  If destination_file is not specified, then the file is extracted and given its original name in the current directory.
destination_file	This can be either a relative path (".", "..", "c:foo", etc.) or a fully qualified path, and may specify either a file (or files, if wild cards are included) or a directory.  If a directory is specified, then the file name stored in the cabinet is used.  Otherwise, destination_file is used as the complete file name for the extracted file.
cabinet_file	This is a cabinet file that contains two or more files.  If no file_spec parameter is specified, then a list of the files in the cabinet is displayed.  If one or more file_spec parameters are specified, then these are used to select which files are to be extracted from the cabinet (or cabinets).  Wild cards are allowed to specify multiple cabinets.
location	Specifies the directory where extracted files should be placed.
file_spec	Specifies files to be extracted from the cabinet(s).  May contain ? and * wild cards.  Multiple file_specs may be supplied.

Examples:

Command	Behavior
EXTRACT foo.ex_	Assuming foo.ex_ contained just the single file foo.exe, then foo.exe would be extracted and placed in the current directory.
EXTRACT foo.ex_ bar.exe	Assuming foo.ex_ contained just the single file foo.exe, then foo.exe would be extracted and placed in the current directory in the file bar.exe.
EXTRACT cabinet.1	Assuming cabinet.1 contains multiple files, then a list of the files stored in the cabinet would be displayed.
EXTRACT cabinet.1 *.exe	Extract all *.EXE files from cabinet.1 and place them in the current directory.
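
As a rough illustration of how file_spec parameters select files from a cabinet listing, here is a minimal Python sketch.  The helper name select_files and the sample listing are hypothetical, matching is assumed to be case-insensitive (as on FAT), and Python's fnmatch also accepts [...] character classes, which EXTRACT is not stated to support.

```python
from fnmatch import fnmatchcase

def select_files(cabinet_listing, file_specs):
    """Return the names in cabinet_listing that match any file_spec.

    Hypothetical helper: names and specs are upper-cased first so the
    result does not depend on the case typed on the command line.
    """
    specs = [spec.upper() for spec in file_specs]
    return [name for name in cabinet_listing
            if any(fnmatchcase(name.upper(), spec) for spec in specs)]

listing = ["FOO.EXE", "BAR.EXE", "README.TXT", "SETUP.INF"]
print(select_files(listing, ["*.exe"]))            # like: EXTRACT cabinet.1 *.exe
print(select_files(listing, ["*.inf", "R*.T?T"]))  # multiple file_specs
```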
6	Issues
6.1	Diamond does not do file type sorting
Compression is optimal when like data is grouped together.  However, integrating sorting into Diamond would complicate matters.  It would be better to have a separate, simple tool that will process a set of files and produce output suitable for a Diamond Directive File that has the files appropriately sorted.
BENS: I've got the starts of this tool already underway: FILETYPE.EXE
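
A minimal sketch of what such a pre-sorting step might look like, in Python.  This is not FILETYPE.EXE; it simply assumes that the file extension is a good enough proxy for "like data", and the function name and sample file list are hypothetical.

```python
import os

def ddf_lines_sorted_by_type(file_names):
    """Order file names by extension, then by name, one per DDF line."""
    def sort_key(name):
        stem, ext = os.path.splitext(name.upper())
        return (ext, stem)
    return sorted(file_names, key=sort_key)

files = ["setup.exe", "readme.txt", "arial.ttf", "app.exe", "notes.txt"]
for line in ddf_lines_sorted_by_type(files):
    print(line)   # each line could be pasted into a Diamond Directive File
```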
6.2	Diamond does not empty output directories
[bens 4/21/94] Latest idea is to have Diamond complain if an output directory is not empty; and have options to turn this warning off and on, and also to delete any files in the output directories.
Might look something like this:

.Set EmptyOutputDirectory= IGNORE | WARN | ERROR | DELETE

IGNORE	Diamond will not mind if an output directory already has files the first time Diamond writes to it.
WARN	Diamond will warn if an output directory has files in it (list the files, I guess), but continue to run.
ERROR	Diamond will halt if any output directory is not empty.  (Hopefully on pass 1: assume that no more than 20 disks will be generated, and test all the directories?  Otherwise we have to do the test on pass 2, only when we switch to a new disk and the directory changes!)
DELETE	Diamond will delete any files it finds in an output directory the first time it writes to an output directory.

Now, the only issue that remains is what the default should be.  Either WARN or ERROR, and I lean to WARN.  DELETE is kind of a scary thing to have as a default.
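
The four proposed settings could behave roughly as in the following Python sketch.  It is only an illustration of the proposal above, not Diamond's implementation; the function name check_output_directory is hypothetical, and the default is set to WARN as suggested.

```python
import os
import tempfile

POLICIES = ("IGNORE", "WARN", "ERROR", "DELETE")

def check_output_directory(path, policy="WARN"):
    """Apply a proposed EmptyOutputDirectory policy before the first write.

    Returns the list of files that were found in the directory.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown EmptyOutputDirectory value: {policy}")
    existing = [name for name in sorted(os.listdir(path))
                if os.path.isfile(os.path.join(path, name))]
    if not existing or policy == "IGNORE":
        return existing
    if policy == "WARN":
        print(f"warning: {path} is not empty: {', '.join(existing)}")
    elif policy == "ERROR":
        raise RuntimeError(f"{path} is not empty: {', '.join(existing)}")
    else:  # DELETE
        for name in existing:
            os.remove(os.path.join(path, name))
    return existing

# Demonstration in a scratch directory holding one stale file.
with tempfile.TemporaryDirectory() as disk1:
    open(os.path.join(disk1, "OLD.CAB"), "w").close()
    check_output_directory(disk1, "WARN")     # prints a warning, continues
    check_output_directory(disk1, "DELETE")   # removes OLD.CAB
    print(os.listdir(disk1))                  # now empty
```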
6.3	How should MCI deal with incompressible blocks?
Either it should always "compress" them, and just return a slightly larger block than the source block, or it should return an indication of incompressibility.  The key issue here is whether compression context should be maintained across an incompressible data chunk.
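
The second option (returning an indication of incompressibility) can be sketched in Python as follows.  zlib is used purely as a stand-in for the MCI compressor, and the function names are hypothetical.  Each block is compressed independently here, so the sketch does not address the open question of maintaining compression context across an incompressible chunk.

```python
import os
import zlib

def compress_block(block):
    """Return (is_compressed, payload) for one data block.

    zlib stands in for the real compressor.  If compression does not
    shrink the block, the original bytes are returned with a flag so
    the caller can store them as-is.
    """
    packed = zlib.compress(block)
    if len(packed) < len(block):
        return True, packed
    return False, block

def decompress_block(is_compressed, payload):
    """Undo compress_block, honoring the incompressible flag."""
    return zlib.decompress(payload) if is_compressed else payload

text_block = b"Diamond " * 512      # highly repetitive, compresses well
random_block = os.urandom(4096)     # effectively incompressible
for block in (text_block, random_block):
    flag, payload = compress_block(block)
    assert decompress_block(flag, payload) == block
    print(flag, len(block), "->", len(payload))
```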
6.4	No .Include directive
This would be a bit of work, since the parser would have to be reentrant.  I think it is close, but we'll leave this until clients ask for it.
6.5	No accounting for subdirectories on disks
We do not intend for Diamond to support MaxDiskSize correctly if uncompressed files are placed in subdirectories in the output disk directories, since this would force Diamond to have to keep track of how big the subdirectories are and when they get full.
6.6	Drag and Drop Extraction from Cabinet Files
The main point here is that it would be nice to have a GUI method of extracting files from cabinets, as PSS would presumably be able to guide a customer through this process much more quickly than if the EXTRACT.EXE tool were used.
The ideal solution is to have a DLL that plugs into the Chicago Explorer, and allows you to tunnel into a cabinet.
We'll pursue this as time permits.
7	Diamond File Cabinet Format (FCF)
The File Cabinet Format (FCF or FC Format) is read and written by the File Compression Interface (FCI) library, and supported by both COMPRESS.EXE and EXPAND.EXE (which use FCI to do most of their work).

FCF is simple enough to support storing a single file without undue overhead, while also supporting multiple files very efficiently.

FCF is designed to be very space efficient on disk, and to be as fast as possible to decompress.  It is not designed to be easy for random file update operations.
7.1	FCF Details
The following table summarizes the layout of a file cabinet:

Name	Description
FCHEADER	Cabinet description
FCRESERVE	Reserved data area (optional, intended for encryption info)
FCFOLDER(s)	Folder description
FCFILE(s)	File description
Data Block 0	First compressed data block
Data Block 1	Second data block
...
Data Block N	Last compressed data block
FCITAIL	Tail signature

<<< See cabinet.h source file for full details >>>
7.2	Data Integrity Strategy
The FC Format must have some built-in integrity checks, since it is possible for customers to have damaged diskettes, or for accidental or malicious damage to occur.  Rather than computing a single checksum over the entire cabinet file (which would have a dramatic impact on the speed of installation from floppy disk, since the entire file would need to be read), we will have per-component checksums, and compute and check them as we read the various components of the file:
1)	Checksum FCHEADER
2)	Store cabinet file length in FCHEADER (to detect file truncation)
3)	Checksum entire set of FCFOLDER structures
4)	Checksum entire set of FCFILE structures
5)	Checksum each (compressed) data block independently

This approach allows us to avoid reading unnecessary parts of the file cabinet (though reading all of FCFOLDER and FCFILE structures would otherwise not be required in all cases), while still providing adequate integrity checking.
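
The per-component strategy can be sketched in Python as follows.  This is a toy model only: CRC-32 from zlib stands in for whatever checksum the real format uses, the cabinet is a dictionary rather than the real FCF layout, the function names are hypothetical, and the file-length check of step 2 is omitted.

```python
import zlib

def checksum(data):
    """Stand-in checksum (CRC-32); the real FCF algorithm may differ."""
    return zlib.crc32(data) & 0xFFFFFFFF

def seal(header, folders, files, blocks):
    """Compute the per-component checksums for a toy cabinet."""
    return {
        "header": header, "header_sum": checksum(header),
        "folders": folders, "folders_sum": checksum(folders),
        "files": files, "files_sum": checksum(files),
        "blocks": [(checksum(block), block) for block in blocks],
    }

def read_block(cabinet, index):
    """Verify only the components needed to read one data block."""
    for part in ("header", "folders", "files"):
        if checksum(cabinet[part]) != cabinet[part + "_sum"]:
            raise ValueError(f"{part} checksum mismatch")
    stored_sum, data = cabinet["blocks"][index]
    if checksum(data) != stored_sum:
        raise ValueError(f"data block {index} checksum mismatch")
    return data

cab = seal(b"header", b"folders", b"files", [b"block zero", b"block one"])
# Damage block 1 only; block 0 can still be read without touching block 1.
cab["blocks"][1] = (cab["blocks"][1][0], b"block onf")
print(read_block(cab, 0))
try:
    read_block(cab, 1)
except ValueError as err:
    print(err)
```

Because each data block carries its own checksum, damage to one block is detected when that block is read, and undamaged blocks elsewhere in the cabinet remain readable.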
8	Implementation/Design Notes
8.1	Folders always begin and end at a file boundary
The folder concept exists precisely to permit random access to decompression of individual files.  To achieve maximum compression, you can configure Diamond to use a single folder for an entire set of cabinet files.  The downside to this approach, however, is that to extract any single file from the cabinet set, you must read and decompress from the beginning of the entire set of cabinet files!
So, a folder marks the beginning of a new compression/decompression boundary, where the compressor starts with no history.
Hence, a folder may cross a cabinet or disk boundary.
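
The trade-off between compression and random access can be illustrated with a small Python sketch.  zlib stands in for the Diamond compressor, the helper names are hypothetical, and for simplicity extract decompresses the whole folder rather than stopping at the desired file.

```python
import zlib

def build_folders(files, files_per_folder):
    """Compress files into folders; each folder starts with no history.

    files is a list of (name, data) pairs.  Returns a list of
    (compressed_bytes, index) pairs, where index maps a file name to its
    (offset, size) within the folder's uncompressed stream.
    """
    folders = []
    for start in range(0, len(files), files_per_folder):
        index, stream = {}, b""
        for name, data in files[start:start + files_per_folder]:
            index[name] = (len(stream), len(data))
            stream += data
        folders.append((zlib.compress(stream), index))
    return folders

def extract(folders, name):
    """Decompress only the folder that holds name, then slice the file out."""
    for compressed, index in folders:
        if name in index:
            offset, size = index[name]
            return zlib.decompress(compressed)[offset:offset + size]
    raise KeyError(name)

files = [(f"FILE{i}.TXT", b"the quick brown fox " * 50 + bytes([65 + i]))
         for i in range(8)]
one_folder = build_folders(files, 8)    # best compression, slowest access
many_folders = build_folders(files, 1)  # one folder per file, fastest access
print(sum(len(c) for c, _ in one_folder), sum(len(c) for c, _ in many_folders))
print(extract(many_folders, "FILE3.TXT") == files[3][1])
```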
8.2	Allow compressed blocks to span Cabinet File boundaries
To achieve maximum compression, there is no reason we should not permit a compressed block to be split across two Cabinet Files!  This means that a Folder can span two (or more) Cabinet Files!
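
A minimal Python sketch of the idea, assuming cabinets are plain byte containers of a fixed capacity.  It ignores the header, folder, and file structures and any bookkeeping the real format needs to record a split block; the function name pack_blocks is hypothetical.

```python
def pack_blocks(blocks, cabinet_capacity):
    """Fill cabinets to capacity, splitting a block when it does not fit."""
    cabinets, current = [], b""
    for block in blocks:
        while block:
            room = cabinet_capacity - len(current)
            current += block[:room]      # as much of the block as fits
            block = block[room:]         # remainder goes to the next cabinet
            if len(current) == cabinet_capacity:
                cabinets.append(current)
                current = b""
    if current:
        cabinets.append(current)
    return cabinets

# The second block is split: 4 bytes end cabinet 1, 2 bytes start cabinet 2.
print(pack_blocks([b"A" * 6, b"B" * 6, b"C" * 3], 10))
```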
8.3	Byte Ordering & Source Code Portability
Diamond is portable to 32-bit and 16-bit platforms, but there is no provision in the code to support interchange between little-endian (x86, RISC) and big-endian (68K) machines, i.e., you must generate the cabinet files on a machine with the same endian property as the target machine.
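
To see why interchange fails, consider how a 32-bit header field is laid out in each byte order.  The Python sketch below uses a made-up field value; it only illustrates the general problem and says nothing about the actual cabinet header fields.

```python
import struct

cabinet_length = 0x0012D687  # sample 32-bit value (hypothetical)

little = struct.pack("<I", cabinet_length)  # byte order written on x86
big = struct.pack(">I", cabinet_length)     # byte order written on 68K

print(little.hex())  # 87d61200
print(big.hex())     # 0012d687

# Reading little-endian bytes as big-endian yields a different number.
misread = struct.unpack(">I", little)[0]
print(hex(misread))  # 0x87d61200
```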
8.4	Feedback, Reporting, & Logging
1) Error reporting
2) Information on quality of compression, especially for use by a "Layout Optimizer"
9	Future Enhancement Thoughts
1/13/94 bens: These are features we have considered, but do not intend to implement immediately.  Based on feedback from clients (Chicago and ACME) and also our tester (JesseV, who is going to write a DDF file for MS-DOS 6.2), we may implement some of these.  We also may add other, as yet unknown directives.
9.1	Add FreezeLayout=file variable
HIGH PRIORITY.
[4/21/94 - per dfrumin]  Toward the end of a product cycle, and if silent updates need to be made, minimizing the number of disks that change in the product can really help minimize the testing effort required.  By default, Diamond would force you to regenerate all the cabinets, and from the point where the first file changed, it would be very likely that that cabinet and all following cabinets would change.

Dan suggested a good idea: we reserve a little bit of space per cabinet file to allow for growth (amount is user configurable), and then we write out a control file that has information on exactly where file/folder boundaries are and -- most importantly -- where CFDATA blocks were split across cabinet boundaries.

When this control file is specified in a DDF or on the command line, Diamond would use this "break" information to force folder/cabinet splits at precise points, thus limiting changes to only the cabinets where files actually changed.

If there is space on the last disk (or the first disk, for that matter), another approach to doing a silent update would be to put the new version of the file (or files), compressed, as a separate cabinet file, and modify the INF file.  This only works if the changed file(s) is(are) small.  If we incorporated "patch" technology into the setup program, then we could use this approach with extra space, but put patch files there, and modify the INF file to apply the patches after extracting the files.
9.2	Add DiskID INF Parameter
HIGH PRIORITY.
From 12/21/93 review: A new directive file variable (InfDiskID[n]) is defined to control the value of the INF parameter "diskid", which can be specified in InfDiskLineFormat. This value can be used by a setup program to determine that the desired disk is (or is not) in the drive.  The default is to use the name of the first (cabinet) file placed on disk n.
9.3	Add .If/.Else/.ElseIf/.Endif directives
LOW PRIORITY.
These directives are needed only if clients of Diamond need to do conditional stuff based on dynamic state during a layout run.  For example, if we supplied read-only variables to indicate BytesLeftOnDisk, BytesLeftInCabinet, CabinetNumber, DiskNumber, etc., then a directive file could do fancy things.
However, this seems like a pretty remote usage, at least at present.
Also, a client can get conditional execution by using the C preprocessor.

NOTE:	We would need to support expressions (at least logical operators like <, >, =(=), !=, <=, >=) if we implement these conditional directives.  We may also need to consider implementing some simple math expressions (+, -, *, /) on long integers.
9.4	Add .LET directive to do math
LOW PRIORITY.
This would simplify things like reserving space on a disk, e.g.:
	.Set MaxDiskSize1 = %MaxDiskSize% - 30000	; Save room for INF file
9.5	Allow wild-cards in File Copy Commands
LOW PRIORITY.
This can shorten the directives file substantially.  For example:
	FOO\*.EXE
	FOO\*.TXT
	FOO\*.BMP
This is an easy way to get similar types of file together, which should also improve compression!
9.6	Exclude certain files
LOW PRIORITY.
Assuming we support wild cards in File Copy Commands, this can shorten the directives file substantially.  For example:
	.Exclude *.OBJ
	.Exclude *.MAP
9.7	Allow directory recurse in copy specification
LOW PRIORITY.
This would be useful for a product like VC++ with all of its sample code.  On the other hand, it would be nice to make it easy to get all the C files, then all the H files, then all the MAKEFILEs, as that might improve the compression ratio!

	src	dst	/RECURSE[=level]
9.8	Have option to check for invalid CD-ROM directory/file names
LOW PRIORITY.
The ISO CD-ROM file naming standards are more limited than the FAT file system.  For example, "!" is not allowed in file names, and directories cannot have ".".  Should check that files not stored in cabinets -- and cabinet file names -- obey these rules.  Names inside cabinets are OK, since it is the cabinet that is stored on the CD-ROM.
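
Such a check might look like the following Python sketch.  It assumes ISO 9660 Level 1 rules (only A-Z, 0-9, and "_"; 8.3 names for files; up to 8 characters and no "." for directories); the function name is hypothetical, and names are upper-cased before checking.

```python
import re

# Assumption: ISO 9660 Level 1 interchange rules.
_FILE_RE = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{1,3})?")
_DIR_RE = re.compile(r"[A-Z0-9_]{1,8}")

def is_valid_cdrom_name(name, is_directory=False):
    """Return True if name is acceptable under the assumed rules."""
    pattern = _DIR_RE if is_directory else _FILE_RE
    return pattern.fullmatch(name.upper()) is not None

for name, is_dir in [("SETUP.EXE", False), ("READ!ME.TXT", False),
                     ("DISK1", True), ("DISK.1", True)]:
    print(name, is_valid_cdrom_name(name, is_dir))
```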
10	Memory Compression Interface (MCI)
<<< See MCI.H source file >>>
11	File Compression Interface (FCI)
<<< See FCI.H source file >>>
12	Encryption Support in Diamond
At this time [4/21/94], I have implemented the crypto hooks (reserved space in cabinet files, crypto APIs in FDI.LIB), and miguelc has started writing DIACRYPT.EXE and using FDI.LIB.
12.1	Motivation
"Ali Baba" is the project to distribute multiple MS products on a single CD-ROM.  One or more of these products is "locked" via a public key encryption system, and a transaction over a telephone line is needed to acquire a key used to unlock each product.

Since data compression must be applied before encryption5, we need a way to integrate encryption with Diamond.
12.2	Overview of Solution
We add a few new predefined variables to the Diamond Directive File that permit zero-filled space to be reserved in the cabinet file.  A post-processing tool is supplied that can encrypt a cabinet file that has had space reserved.  The File Decompression Interface (FDI) accepts a decryption function and key parameters, and calls the decryption function with the specified key when FDI encounters an encrypted cabinet.
12.2.1	Encryption is optional
The fchdrRESERVE_PRESENT flag in the Cabinet Header indicates whether the cabinet has reserved space for encryption6.  If a cabinet is encrypted, then all of the data blocks in the cabinet are encrypted.
12.2.2	An encrypted cabinet will have an FCRESERVE structure in the FCHEADER
The FCRESERVE structure immediately follows the fixed portion of the FCHEADER structure if the fchdrRESERVE_PRESENT flag is set in the FCHEADER.flags field.  This structure contains the values of the following three variables:

	ReservePerCabinetSize=n
	ReservePerFolderSize=m
	ReservePerDataBlockSize=m

If ReservePerCabinetSize is non-zero, then there are that many bytes of zero-filled space immediately following the FCRESERVE structure (just before the variable-length strings portion of the FCHEADER structure).

The per-folder and per-data block reserved areas are stored with each folder and datablock.

One way to use these fields is as follows.  Each folder is encrypted with a different (stream cipher) key, and the set of keys for the entire cabinet is then encrypted with a master cabinet key.  The encrypted folder keys are stored with each folder, and the master key is also used to encrypt a known plaintext that is then stored in the per-cabinet reserved area.  This known plaintext can be used to verify that the user-supplied master key is indeed the one that was used to encrypt the cabinet.
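
The key-wrapping and known-plaintext check described above can be sketched in Python.  Everything here is an assumption made for illustration: the cipher is a toy keystream built from SHA-256 and must not be used for real protection, the marker value and function names are hypothetical, and the dictionary merely models the per-cabinet and per-folder reserved areas.

```python
import hashlib
import os

KNOWN_PLAINTEXT = b"DIAMOND-KEY-CHECK"  # hypothetical marker value

def keystream_xor(key, label, data):
    """Toy stream cipher (SHA-256 in counter mode); NOT for real use.

    label separates the keystreams used for different reserved areas.
    """
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + label + counter.to_bytes(4, "little")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

def seal_cabinet(master_key, folder_keys):
    """Return the contents of the per-cabinet and per-folder reserved areas."""
    return {
        "cabinet_reserve": keystream_xor(master_key, b"check", KNOWN_PLAINTEXT),
        "folder_reserves": [keystream_xor(master_key, b"folder%d" % i, key)
                            for i, key in enumerate(folder_keys)],
    }

def unlock(reserves, candidate_key):
    """Return the folder keys, or None if candidate_key is not the master key."""
    check = keystream_xor(candidate_key, b"check", reserves["cabinet_reserve"])
    if check != KNOWN_PLAINTEXT:
        return None
    return [keystream_xor(candidate_key, b"folder%d" % i, blob)
            for i, blob in enumerate(reserves["folder_reserves"])]

master = os.urandom(16)
folder_keys = [os.urandom(16) for _ in range(3)]
reserves = seal_cabinet(master, folder_keys)
print(unlock(reserves, master) == folder_keys)   # correct master key
print(unlock(reserves, os.urandom(16)))          # wrong key is rejected
```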
12.2.3	Encryption context is assumed to be reset at folder boundaries
An important property of cabinet files is that the compression context is reset at the start of a folder -- indeed, this is the definition of a folder!  To extract a file from a folder, all the data from the beginning of the folder up to the desired file must be decompressed.  This property permits the author of a Diamond Directive File (by modifying the FolderSizeThreshold variable) to trade off compression ratio (better with larger folder sizes) against random file access speed (faster with smaller folder sizes).

To maintain this ability to skip over entire folders, a folder must also, by definition, be an encryption context boundary.  Thus, the tool that applies encryption to a folder must recognize folder boundaries and reset the cryptosystem state.  Similarly, FDI will assume the decryption state is reset at folder boundaries (though, in practice, this is managed by the decryption function, so FDI does not actually pay attention to this issue.)
12.2.4	Each data block may have an optional header
Some cryptosystems may need to expand the size of a data block by some fixed amount.  The directive file variable:
	ReservePerDataBlockSize=n
is used to reserve space in the data block, just after the normal data block header, but before the actual data.  The data block checksum will be computed by FDI on the reserved space and the data itself, so the encryption tool will have to update the data block checksums.
12.2.5	DIACRYPT.EXE applies encryption after DIAMOND.EXE
In Ali Baba, a set of disk "images" (sets of files and cabinets, actually) will be prepared, and then several masters will be created, each with a different key.  Depending on the volume of CD-ROMs to be produced, there could easily be twenty or more different CD images (differing only in the key(s) used to encrypt them).  Since Diamond layout can be very slow (due to the compression algorithm), having a separate tool, DIACRYPT.EXE, allows creating masters with different keys to be as fast as possible.  DIACRYPT could either produce one encryption at a time, or, if multiple keys were provided, it could produce multiple output images (assuming there is enough disk space!).
12.2.6	Standard Encryption Interface for DIACRYPT.EXE
There will be a standard interface that DIACRYPT.EXE uses to apply encryption to a set of cabinet files.  The Ali Baba group will supply encryption as a static link library that supports this interface.  [bens 4/21/94 The Ali Baba group are writing DIACRYPT.EXE, so this interface does not need to be a standard.]
12.2.7	Decryption is plugged into the File Decompression Interface
In contrast to the encryption step, a setup program must have access to the decryption code at the time setup is performed.  So, when FDI is initialized, a decryption function pointer and a key pointer are passed to FDI and stored in the FDI context.  The decryption function has a defined calling sequence, and includes Init, Decrypt, and End functions.  Decrypt is called on each data block in turn, and includes the data block number and the folder number.  The cryptosystem can use the data block number and/or folder number to select the appropriate key(s) from the KeyList block.

See FDI.H for details on the crypto interfaces for FDI.
12.3	Design notes for setup programs
12.3.1	Crypto and non-crypto versions of ACME
The apps setup toolkit SETUP.EXE should come in two versions, one linked with the chosen cryptosystem for Ali Baba, and another one without a cryptosystem for floppy-based setup.  The other choice would be to have a separate DECRYPT.DLL, but that would make hacking simpler than it needs to be.
13	Distribution Media Format (DMF) Disks
Mike Sliger has come up with a read-only disk format for 1.44Mb floppies (2.0Mb unformatted capacity) that permits us to add 3 sectors per track, yielding a 17.7% increase to a capacity of 1.68Mb.
History:	25-Aug-1993 bens Initial version

At this time [6/6/94] we have done testing on 100 machines internally at Microsoft, and conducted a 380 site beta test (culled from the Chicago beta test list) and received information on over 800 separate machines.  To work around problems in some BIOSes, we install an INT 13h ROM BIOS hook to split up any I/O request that crosses the sector 18 to sector 19 boundary.  This hook will not be required on Chicago or Windows NT 3.5, which will recognize and read DMF disks correctly.  IBM OS/2 2.x has been tested and found to read DMF disks correctly.
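
The splitting arithmetic can be sketched as follows.  The real hook is a real-mode interrupt handler; this Python sketch only models the sector math, assumes the request stays on a single track, and uses a hypothetical function name.  Sectors are numbered from 1, as in INT 13h CHS addressing.

```python
BOUNDARY = 18  # last sector of a standard 1.44Mb track

def split_request(first_sector, count):
    """Split a single-track request so no piece crosses sector 18 -> 19.

    Returns a list of (first_sector, count) pieces.
    """
    last_sector = first_sector + count - 1
    if first_sector <= BOUNDARY < last_sector:
        head = BOUNDARY - first_sector + 1   # sectors up to and including 18
        return [(first_sector, head), (BOUNDARY + 1, count - head)]
    return [(first_sector, count)]

print(split_request(1, 21))   # [(1, 18), (19, 3)]
print(split_request(17, 4))   # [(17, 2), (19, 2)]
print(split_request(1, 18))   # [(1, 18)]
```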

[More details are contained in Mike Sliger's DMF spec.]
14	Compression Gain with Cabinet Files
History: 16-Aug-1993	bens	Initial version
While Cabinet Files increase the achieved compression ratio over compressing files individually, they also have the drawback that the customer must use EXTRACT.EXE to see what files are contained in the cabinet file.  To help you evaluate the compression savings of using Cabinet Files, I copied the Windows 3.1 NETSETUP directory (473 files) from \\products2\release, and then compressed it with the ACME/Diamond COMPRESS.EXE in two ways.  First, I compressed each file individually.  Second, I copied all of these files together to produce One Big File, and then compressed that.  The second method simulates the highest possible savings to be gained by using a Cabinet file instead of compressing individual files.7

To maximize compression efficiencies, the files were sorted by file extension (.EXE, .TTF, etc.) before being copied together to produce the big file.

NOTE: All timings are on a Compaq LTE 386s/20 w/6Mb RAM, 512K disk cache

Individual Files	One Big File	Difference	Description
14,448,164	14,448,164		Uncompressed Size
6,503,936	6,112,213	391,723	Compressed Size
45.02%	42.30%	2.72%	cbComp/cbUncomp
105m 39s	98m 3s		Time to Compress
n/a	15m 5s		Time to Decompress

So in this example, compressing the files together netted an additional savings of 2.72% of the total uncompressed file size.  In other tests, the benefit from cabinet files (with MSZIP compression) has been as much as 6%.
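
The ratios in the table can be rechecked from the byte counts with a few lines of Python.  Note that dividing the byte difference directly by the uncompressed size gives about 2.71%; the 2.72% figure corresponds to the difference between the two rounded ratios (45.02% - 42.30%).

```python
uncompressed = 14_448_164
individual = 6_503_936     # each file compressed separately
one_big_file = 6_112_213   # all files compressed together

ratio_individual = 100 * individual / uncompressed
ratio_big = 100 * one_big_file / uncompressed
saved_bytes = individual - one_big_file

print(f"{ratio_individual:.2f}%")                   # 45.02%
print(f"{ratio_big:.2f}%")                          # 42.30%
print(saved_bytes)                                  # 391723
print(f"{100 * saved_bytes / uncompressed:.2f}%")   # 2.71%
```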
15	DIAMOND.EXE Output Report Formats
History: 29-Aug-1993	bens	Initial version

NOTE: This section is obsolete, but remains for reference.  It will be deleted shortly.
	[bens 5/2/94]

1) LAYOUT.TXT - This is a nicely formatted report that shows what files went where.  It has information about on-disk or in-cabinet sizes and compression ratios (which are estimates in the case where files are compressed together in folders in cabinets).

In the following list, "+" indicates info that is also output in LAYOUT.INF (etc.) for use by a Setup program, and "*" indicates info that is only displayed in LAYOUT.TXT.

		   Includes:
		   o Summary information
		     + Number of disks/cabinets
		     * Number of folders
		     * Number of files split across disks/cabinets
		     * Average file size
		     * Min and Max file size
		     * Average compression ratio
		     * Min and Max compression ratio
		     * Space used on each disk
		     * Start time, end time, total time
		   o Disk information
		     + Number of files on disk (including in cabinets)
		     + List of cabinet numbers on disk
		     + Total number of uncompressed bytes
		     + Total number of compressed bytes
		   o Cabinet information
		     + Cabinet number
		     + Cabinet name
		     + Disk number containing cabinet
		     + Number of files
		     + Total number of uncompressed bytes
		     + Total number of compressed bytes
