09-29-2009, 02:23 PM
(This post was last modified: 09-29-2009, 02:26 PM by Endymion.)
Hi! Is there any way to choose the destination for the compressed file when using the LinuzIso plugin? If not, it would be a nice addition and shouldn't be hard to implement...
Also, can I use a 3rd-party compressor like 7-zip to produce fully LinuzIso-compatible archives (BZ2)? If yes, are there any restrictions on the settings?
I believe 7-zip doesn't produce LinuzIso-compatible archives, unless there are specific settings in 7-zip that need adjusting.
Also, wouldn't compressing ISOs add another layer to PCSX2?
Intel E7500 @ 4.00ghz 400 fsb / Asus P5QL Pro / 4Gb Kingston RAM / PNY nvidia 9800GT 512Mb / Creative X-Fi Music 24 / Vista 64 SP2/
Compressing ISOs requires a special block-level compression that allows for efficient random access to the ISO data. Default gzip/bzip/7zip compression doesn't do that. The Linuzappz Cdvdiso plugin has built-in gzip/bzip compressors that build the necessary block-level indexes for random access, but that also means a drastic compression ratio penalty. Typically NTFS compression beats Linuzappz gzip compression ratios, and the bzip compression is buggy (I've had it crash on me).
Someday I hope to implement a 7zip compressor in the PCSX2 built-in ISO loader. I have an idea for doing it that should retain better compression ratios, but it will be noticeably slower. But I don't see much point in using anything besides 7zip, since the other compression algos typically can't beat NTFS's built-in compression options when it comes to block-level random-access compression.
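To make the block-level idea concrete, here is a minimal Python sketch. It is not the Linuzappz plugin's actual file format: the 64 KB block size, the in-memory index layout, and the function names compress_blocks and read_range are all assumptions made for illustration. Each block is compressed on its own, and an index records where each compressed block starts, so a read only has to inflate the blocks it touches.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # hypothetical block size, not the plugin's real value


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each block independently and record where it lands."""
    blob = bytearray()
    index = []  # one (offset, compressed_length) entry per block
    for start in range(0, len(data), block_size):
        packed = zlib.compress(data[start:start + block_size])
        index.append((len(blob), len(packed)))
        blob += packed
    return bytes(blob), index


def read_range(blob, index, offset, length, block_size=BLOCK_SIZE):
    """Return `length` bytes at `offset`, inflating only the blocks touched."""
    if length <= 0:
        return b""
    first = offset // block_size
    last = min((offset + length - 1) // block_size, len(index) - 1)
    out = bytearray()
    for i in range(first, last + 1):
        pos, size = index[i]
        out += zlib.decompress(blob[pos:pos + size])
    skip = offset - first * block_size
    return bytes(out[skip:skip + length])


image = bytes(range(256)) * 2000          # stand-in for ISO data
blob, index = compress_blocks(image)
assert read_range(blob, index, 65000, 2000) == image[65000:67000]
```

Because every block restarts the compressor from scratch, matches can never reach back into earlier blocks, which is where the ratio penalty comes from.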
Jake Stine (Air) - Programmer - PCSX2 Dev Team
09-29-2009, 07:48 PM
(This post was last modified: 09-29-2009, 07:49 PM by KrazyTrumpeter05.)
You know...I've always wondered. How exactly does compression work?
I've always just assumed that it uses some equation/programming function to cut down the file size, and then to decompress it just rebuilds everything using that same process in reverse. Is that a fair assumption?
09-29-2009, 08:08 PM
(This post was last modified: 09-29-2009, 08:11 PM by Air.)
Most lossless compression algos are just variations on "dictionary-based patterning." Explained:
You analyze the data, find repeated patterns of fixed or varied length, and file them into a dictionary. The dictionary is usually referenced through a bit-packed 1-to-4 byte index, so you can replace multiple occurrences of a 12-byte string with a single 2-byte index. On decompression, the decoder looks up the index in the dictionary and expands it.
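As a rough illustration of the dictionary idea, here is a small Python sketch of LZW, a textbook dictionary coder. This is a swapped-in example, not what gzip or 7zip actually do (those use sliding-window matching plus entropy coding), and the function names are just for illustration. The compressor grows a table of byte strings it has seen and emits table indexes; the decompressor rebuilds the same table as it goes, so the table itself never needs to be stored.

```python
def lzw_compress(data):
    """Replace repeated byte strings with integer codes from a growing table."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the match
        else:
            codes.append(table[current])   # emit the longest known string
            table[candidate] = len(table)  # learn the new, longer string
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the same table while expanding each code."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the entry being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[len(table)] = previous + entry[:1]
        previous = entry
    return bytes(out)


text = b"TOBEORNOTTOBEORTOBEORNOT" * 10
codes = lzw_compress(text)
assert lzw_decompress(codes) == text
print(len(text), "bytes in,", len(codes), "codes out")
```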
The main differences between compression algos like gzip, 7zip, etc. are the dictionary storage methods, the size of the dictionaries allowed, and the pattern searching methods. The latter is an implementation difference in the compressor only, which is why some apps (like 7zip) can make much smaller gzip files than other gzip-supporting apps. These files are still 100% backwards compatible with any existing gunzip utility -- they're just more efficient because the compressor used a more cpu-intensive process for pattern matching.
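The "same format, different compressor effort" point can be tried with Python's zlib module, which exposes the compression level directly. This is only a sketch: the sample data and the helper name compare_levels are made up, and the size gap between levels depends heavily on the input. What it shows is that one decompressor reads both outputs without knowing which level produced them.

```python
import zlib


def compare_levels(data):
    """Compress the same data at a fast and a thorough zlib level."""
    sizes = {}
    for level in (1, 9):
        packed = zlib.compress(data, level)
        # The same decompressor handles both outputs; it never sees the level.
        assert zlib.decompress(packed) == data
        sizes[level] = len(packed)
    return sizes


sample = b"".join(
    b"sector %06d: the quick brown fox jumps over the lazy dog\n" % (i % 500)
    for i in range(20000)
)
print(len(sample), compare_levels(sample))
```

Level 9 spends more CPU time searching for matches, so its output is typically smaller than level 1's, while the format on disk stays the same.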
Traditionally the dictionary size and pattern searching algos have been limited by computer power. Each new compression algo allows larger dictionary sizes than previous ones. 7zip, for example, has a mode that uses something like 500 megs of memory to compress files. If WinZip had a similar option, it would be able to compress a lot better too. But in addition to that, 7zip also has more complicated bitpacking methods for its dictionary, which shave a few bits off every dictionary index and entry (it adds up at the end of a couple gigs!). That too is usually a function of CPU power, since complicated bitpacking can require a lot more CPU and memory to decode back into original form. Older formats like gzip would have avoided it because they simply couldn't afford to be so aggressive.
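The effect of dictionary size can be seen with Python's lzma module, which lets you set dict_size on the LZMA2 filter. This is a sketch under assumptions: the helper name xz_size and the test data (one incompressible 256 KB chunk stored twice) are chosen purely to exaggerate the effect. With a 64 KB dictionary the second copy is out of reach, so nothing is saved; with a 1 MB dictionary the compressor can point back at the first copy.

```python
import lzma
import random


def xz_size(data, dict_size):
    """Compress with LZMA2 using an explicit dictionary size, return the size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    packed = lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)
    assert lzma.decompress(packed) == data  # round-trip check
    return len(packed)


n = 256 * 1024
# Pseudo-random bytes from a fixed seed: incompressible on their own.
chunk = random.Random(0).getrandbits(8 * n).to_bytes(n, "little")
data = chunk + chunk  # the only redundancy is a repeat 256 KB back

small = xz_size(data, 64 * 1024)     # dictionary too small to see the repeat
large = xz_size(data, 1024 * 1024)   # dictionary spans both copies
print("64 KB dict:", small, "bytes; 1 MB dict:", large, "bytes")
```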
Jake Stine (Air) - Programmer - PCSX2 Dev Team
Ok, I think I understood most of that.
Basically what happens is the compressor finds data that is repeated or whatever throughout the file, saves one copy in the archive, and replaces all instances of it with a bit of data that says "Hey, when you rebuild yourself, put X piece of data here." Right?
Yep, so if you have 111111111111111111111111111111111000000000000000000000000111111111111111111111111111 you'll get something like:
20 1s, 10 0s, and 15 1s, grouping each run together (of course this is the simplest pattern possible). So when you want to uncompress it, you read the index and see 20 1s -> 11111111111111111111, 10 0s -> 0000000000, etc.
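That run-counting scheme is usually called run-length encoding. A minimal Python sketch, with made-up function names and using the 20/10/15 run lengths from the example above:

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def rle_decode(pairs):
    """Expand each (symbol, count) pair back into a run."""
    return "".join(symbol * count for symbol, count in pairs)


bits = "1" * 20 + "0" * 10 + "1" * 15
pairs = rle_encode(bits)
print(pairs)  # [('1', 20), ('0', 10), ('1', 15)]
assert rle_decode(pairs) == bits
```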
Cool, good to know my initial assumptions were on the right track, even though the actual process is more complicated =P