What this is:
	This is a tool/file-format/library intended for read-write access to compressed data. The purpose is not to compete with traditional archivers, which are intended primarily for storage and transmission of files, but rather to be effective in cases where traditional archivers are ineffective (for example, persistent stores, or collections of hetrogenous data, or generic file-formats based around collections of distinct objects potentially too large to be effectively rewritten on each save).
	Part of the name, and part of the original design inspiration, comes from a few different formats: ZIP, WAD2 (used in Quake and Half-Life for storage of textures and a few other uses), and from traditional filesystems.


Design Tradoffs:

	Stream vs Graph based File-Structure. In my case, I chose a graph structure primarily because it was in-line with my design goals. Given data may be relocated, it is useful to be able to do this easily. Stream-structured file formats (ZIP, TAR, ...) are particularly bad at this. Formats like WAD, WAD2, and PACK allow relocation of contents (albeit as implemented these formats are usually read-only, or have limited write abilities, for example, appending new data to the end of the file).

	Byte vs Block based layout. A byte based layout was chosen primarily because the data may be compressed, and as such will rarely fit nicely into fixed-size blocks. Likewise, the individual items are expected to be small (typically single or double digits of kB), so a block structure would fall between reducing the effectiveness of compression to significantly expanding space requirements.

	Space Management Metadata. There is currently a lack of metadata related to management of free space. The reason is to allow more simplistic implementation of a reader/writer to be possible without the added complexity of needing to maintain large amounts of metadata. Instead it is assumed that it is reasonable to rebuild the needed data at the time the archive is loaded. Eventually it may make sense to add such metadata, but as a more or less optional feature.

	Fixed length names and the 32 char name limit. As the format is not intended for traditional archiving, this limit is likely acceptable IMO. It is assumed that nearly all filenames will fall below this limit, and those that don't will be mangled into a presumably unique form (if this mangling is consistent, the file may still be accessible, however, it will not be possible to restore its full name).
	The reason for the fixed size is that it allows more readily accessing directory contents, and does not necissarily require a parsing step. Likewise, with a single central directory, it is important that free items can be located quickly.

	Names without paths. The reason for this is that for a lot of processing teqniques, it is necessary to process the names seperate from the containing directory, or to effectively work in terms of a particular directory and not the files. For many tasks, it would be needed to rebuild this style of representation anyways. IMO, it made more sense to do it this way (if albeit it does require a recursive operation to work with files...).

	Central Directory. This allows unifying metadata and simplyifying the implementation. It is assumed for now, however, that the directory is small enough to reasonably fit in memory. Since the directory tree is built around a linked-list structure, it is assumed that individual directories are small enough to be reasonably accessed in this way.
	One possible feature here could be to representing larger directories in terms of a tree structure which maps back to the central directory.


Space Management:

	Space is presently managed via an AVL tree. The offsets of each span are used as the keys into this tree, and each leaf also holds the size of the span.
	AVL trees are balanced binary trees. Each side of the tree can have a height difference of at most 1 from the other side. Inserts are performed via stepping down the tree until a leaf is found, either replacing the leaf (in the case where the keys match) or spliting the leaf (in the case where the keys differ). During unwinding, balance is checked and the tree is rebalanced if the sub-trees have become sufficiently unbalanced.


Directory Index Trees

Directory Index Trees would be used for optimizing directory lookups. Several designs are possible here, eg, AVL trees or B-Trees. An AVL tree would be easier to implement, if albeit not as fast.

In this case, Left and right values < 0 are leaves, and map to directory entries. Values > 0 map to nodes, and 0 is a reserved value.
The middle value is different in that it always points directly to a directory entry (the entry in question is used as a pivot, and does not appear in any of the sub-trees).

DINode {
u32 left;		//values less than pivot
u32 right;		//values greater than pivot
u32 mid;		//pivot
byte ldepth;	//depth of left side
byte rdepth;	//depth of right side
byte _resv[2];
}

Node 0 is special.



String to Index Trees

	If needed eventually, I am imagining that string to index trees could be implemented as a variation of balanced binary trees. The idea is that the tree is accessed using a string, which is mapped to an index, which presumably maps back to the string. As a result, it is not needed to store the full string in the tree, but only the index.

SINode {
u32 left;		//values less than pivot
u32 right;		//values greater than or equal to pivot
u32 mid;		//middle value
byte ldepth;	//depth of left side
byte rdepth;	//depth of right side
byte _resv[2];
}

Values < 0 on either side will be leaves.

On insertion, stepping and balancing will be similar to that of AVL trees.
