Posts: 20.327
Threads: 405
Joined: Aug 2005
Reputation:
554
Location: England
Good to see you, Unknown. Those settings are certainly worth keeping in mind. I never really bothered messing with them; I just ran the batch as above and let it run, and the block size didn't even occur to me. Compatibility-wise I've had pretty good success. There have been one or two games that didn't like it, but I can't remember what they were now. Once I find out, I'll give the 16k blocks (or meg, depending on what that represents) a try and let you know if it resolves the problems.
Posts: 2
Threads: 0
Joined: Nov 2015
Reputation:
1
11-16-2015, 04:40 AM
(This post was last modified: 11-16-2015, 04:42 AM by [Unknown].)
(11-01-2015, 03:42 PM)refraction Wrote: Good to see you, Unknown. Those settings are certainly worth keeping in mind. I never really bothered messing with them; I just ran the batch as above and let it run, and the block size didn't even occur to me. Compatibility-wise I've had pretty good success. There have been one or two games that didn't like it, but I can't remember what they were now. Once I find out, I'll give the 16k blocks (or meg, depending on what that represents) a try and let you know if it resolves the problems.
Okay, thanks. One thing maxcso does is try very hard to use up all the CPU cores of your system - a *very* buggy driver might, I guess, theoretically have trouble with this. You can always try using --threads=N to specify your own thread count, such as 1. It defaults to the number of cores (also counting hyperthreading.)
I've just released maxcso v1.6.0 primarily based on seeing that batch script. You can now do the following:
Code:
@ECHO OFF
FOR %%I IN (*.iso) DO (
    maxcso --block=16384 "%%I" && del "%%I"
)
The difference is, if maxcso (for any reason, such as disk full) fails to write the cso file, this will prevent it from deleting the source file. It's just a bit safer.
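For anyone not on Windows, the same "delete only on success" idea can be sketched in Python. This is only an illustration: the `maxcso --block=16384` command line is the one from the batch script, while the function name `compress_then_delete` and the injectable `run` parameter are made up here so the logic can be tried without maxcso installed.

```python
import os
import subprocess
from typing import Callable, Sequence


def _run_real(cmd: Sequence[str]) -> int:
    """Run the command and return its exit code."""
    return subprocess.run(cmd).returncode


def compress_then_delete(iso_path: str,
                         run: Callable[[Sequence[str]], int] = _run_real) -> bool:
    """Compress iso_path with maxcso and delete it only if maxcso succeeded.

    Mirrors `maxcso ... && del ...`: a non-zero exit code (disk full,
    for example) leaves the source file untouched.
    """
    exit_code = run(["maxcso", "--block=16384", iso_path])
    if exit_code != 0:
        return False
    os.remove(iso_path)
    return True
```

Passing a fake `run` that returns 1 is an easy way to confirm the source file survives a failed compression.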
Additionally, if the source file is larger than 2GB and --block is not specified, the block size will now default to 16384 automatically. There's no good reason to use 2048 (the default) for files larger than 2GB.
The block size is in bytes, by the way. The larger block size shouldn't solve any bugs/issues, but it'll compress better and should give better performance for both compression and decompression.
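The default selection described above can be sketched roughly like this. It is not maxcso's actual code: the function name is hypothetical, and reading "2GB" as 2 GiB (2 * 1024^3 bytes) is an assumption.

```python
from typing import Optional

DEFAULT_BLOCK = 2048                   # normal default, in bytes
LARGE_BLOCK = 16384                    # default for big images, in bytes
LARGE_FILE_THRESHOLD = 2 * 1024 ** 3   # assumption: "2GB" means 2 GiB


def pick_block_size(file_size: int, requested: Optional[int] = None) -> int:
    """Return the block size to use for a source file of file_size bytes.

    An explicit --block value (requested) always wins; otherwise files
    above the threshold get the larger block size.
    """
    if requested is not None:
        return requested
    if file_size > LARGE_FILE_THRESHOLD:
        return LARGE_BLOCK
    return DEFAULT_BLOCK
```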
-[Unknown]
Posts: 1.909
Threads: 28
Joined: Jun 2010
Reputation:
75
11-16-2015, 03:53 PM
(This post was last modified: 11-16-2015, 03:53 PM by avih.)
Hmm.. wasn't this thread title "New GZIP ISO compression"?!? (now it's without "gzip")
Posts: 6.069
Threads: 68
Joined: May 2010
Reputation:
167
Location: Grenoble, France
(07-19-2015, 09:27 AM)karasuhebi Wrote: Why not? You mentioned it in the OP, giving me hope. XD Is it difficult to implement like .7z?
Personally I hope xz will be done instead. It supports blocks (so there's no need to create an index). Xz uses nearly the same algorithm as 7z. And maybe in the distant future a 4 GB cache will be doable for free (so decompression speed won't matter).
Posts: 1.909
Threads: 28
Joined: Jun 2010
Reputation:
75
11-16-2015, 07:55 PM
(This post was last modified: 11-16-2015, 07:58 PM by avih.)
(11-16-2015, 06:41 PM)gregory Wrote: Personally I hope xz will be done instead. It supports blocks (so there's no need to create an index). Xz uses nearly the same algorithm as 7z. And maybe in the distant future a 4 GB cache will be doable for free (so decompression speed won't matter).
Agreed on both.
But decompression speed will still matter, unless you intend to decompress the whole image before the game starts, which is not going to be instant however you look at it. But, if and after that happens, we can pretty much ensure zero media access time, which would be really nice.
What would be possible, though, is to allocate the entire image size in RAM, decompress on demand (whenever the game reads) into the relevant memory space, without ever deleting from the "cache", and keep a background thread running which keeps reading ahead (and filling the cache) from the last read point onwards.
This could actually be quite nice.
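A rough sketch of that idea, to make it concrete. Everything here is hypothetical: the class and method names are invented, zlib stands in for whatever codec would really be used, and the image is assumed to be pre-split into independently compressed fixed-size blocks. A real background thread would simply call `read_ahead_step()` in a loop.

```python
import threading
import zlib
from typing import List, Optional


class ReadAheadImage:
    """Whole-image RAM cache, filled on demand and by read-ahead."""

    def __init__(self, compressed_blocks: List[bytes], block_size: int):
        self._blocks = compressed_blocks
        self._block_size = block_size
        # One slot per block; nothing is ever evicted.
        self._cache: List[Optional[bytes]] = [None] * len(compressed_blocks)
        self._lock = threading.Lock()
        self._next = 0  # first block after the last read point

    def _load(self, i: int) -> bytes:
        with self._lock:
            if self._cache[i] is None:
                self._cache[i] = zlib.decompress(self._blocks[i])
            return self._cache[i]

    def read(self, offset: int, size: int) -> bytes:
        """Serve a read, decompressing only the blocks it touches."""
        first = offset // self._block_size
        last = min((offset + size - 1) // self._block_size,
                   len(self._blocks) - 1)
        data = b"".join(self._load(i) for i in range(first, last + 1))
        with self._lock:
            self._next = last + 1
        start = offset - first * self._block_size
        return data[start:start + size]

    def read_ahead_step(self) -> bool:
        """Decompress the next uncached block after the last read point.

        Returns False when there is nothing left to prefetch.
        """
        with self._lock:
            i = self._next
            while i < len(self._blocks) and self._cache[i] is not None:
                i += 1
            if i >= len(self._blocks):
                return False
            self._next = i + 1
        self._load(i)
        return True

    def cached_count(self) -> int:
        with self._lock:
            return sum(1 for b in self._cache if b is not None)
```

Holding a single lock during decompression keeps the sketch short; a real implementation would probably want finer-grained locking so game reads never wait on the prefetcher.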
Posts: 1.909
Threads: 28
Joined: Jun 2010
Reputation:
75
11-16-2015, 08:10 PM
(This post was last modified: 11-16-2015, 08:52 PM by avih.)
Though when I think of it, I'm not sure how to make it work without an index.
gzip is also block based, where the resulting gzip file has, iirc, fixed-size blocks, each of which can hold a different amount of data (depending on how compressible the data is). So in order to know which block contains a specific location of the input file, you either have to scan the blocks until you reach the one that contains it, or use an index.
I'm not sure how xz would improve that procedure, or, TBH, how the cso thingy does it without an index (assuming the index is not an inherent feature of the cso format).
[edit]
Ah. CSO does have an index built into the file: https://github.com/unknownbrackets/maxcs...DME_CSO.md
So xz, like gzip, doesn't have an index feature AFAIK, so we would still need to generate one after the file is compressed.
But the fact that xz could be block based means that it's possible (with an index). Without block compression (e.g. 7z), even an index won't help, since there's nothing to index: the entire file is conceptually a single block (as far as I understand), i.e. you can't get the last byte of the original file without decompressing everything from the beginning.
[edit 2]
Actually, xz does seem to support (have?) index built in. If it does, then it's this feature which would make it work for random access without an external index file, rather than the fact that it's block based. For random access an index must be used, and if the file format supports an index, then there's no need for an external one - http://tukaani.org/xz/format.html
[edit 3]
gzip is actually not block based. You need to decompress everything to get the last byte. So the index effectively records the decompressor's entire state at every index "checkpoint". Each such state is, iirc, 32K. So the more index checkpoints one creates, the bigger the index file becomes.
This is not possible with 7z, since the state is far too big to record at each index checkpoint, so it becomes impractical to record the states and still keep the index reasonably small.
But with formats which are block based, the index doesn't have to contain the entire state, only where each block starts and ends (in the compressed and uncompressed file - 4 numbers overall, or 3 if the blocks have identical size). This makes it possible to choose how big (or small) the blocks are without worrying about the index size, which will always be relatively very small because each "checkpoint" is only 4 numbers (compared to 32K with gzip, and huge with 7z).
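To illustrate why such an index is cheap to use, here is a minimal sketch of a lookup over "4 numbers per block". The `BlockEntry` layout and `find_block` name are made up for this example and are not taken from xz or CSO; the lookup is a plain binary search on the uncompressed start offsets.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class BlockEntry(NamedTuple):
    comp_offset: int    # where the block starts in the compressed file
    comp_size: int      # how many compressed bytes it occupies
    uncomp_offset: int  # where its data starts in the original image
    uncomp_size: int    # how many original bytes it holds


def find_block(index: List[BlockEntry], offset: int) -> BlockEntry:
    """Return the entry whose uncompressed range contains offset.

    The index must be sorted by uncomp_offset.
    """
    starts = [entry.uncomp_offset for entry in index]
    i = bisect_right(starts, offset) - 1
    if i < 0 or offset >= index[i].uncomp_offset + index[i].uncomp_size:
        raise ValueError("offset is outside the image")
    return index[i]
```

Once the entry is found, the reader seeks to `comp_offset`, reads `comp_size` bytes, decompresses that one block, and slices out the bytes it needs.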
Posts: 1.909
Threads: 28
Joined: Jun 2010
Reputation:
75
11-16-2015, 10:39 PM
(This post was last modified: 11-16-2015, 10:41 PM by avih.)
I did some tests to get rough estimates of compression ratios.
I didn't evaluate decompression speeds.
CSO and gzip at default settings, 7z at ultra.
Original (ICO PAL): 886.5 M
CSO: 744.5 M - 84%
gzip: 727.5 M (720.5M + 7M index) - 82%
xz 16K blocks: 714 M - 80.5%
xz 32K blocks: 709 M - 80%
xz 64K blocks: 705.5 M - 79.5%
xz 128K blocks: 703 M - 79.3%
xz 256K blocks: 701.3 M - 79.1%
xz 512K blocks: 698.3 M - 78.8%
xz 1M blocks: 691.7 M - 78%
xz 2M blocks: 687.5 M - 77.5%
xz 4M blocks: 685.5 M - 77.3%
xz 8M blocks: 677.5 M - 76.4%
7z ultra (for reference): 656.2 M - 74%
So there's not a huge difference between the methods. If we assume xz with 128K-256K blocks (because bigger blocks will surely slow down access time), then xz comes out about 3 percentage points better than gz and about 5 better than cso. Not very much better, but it does add up. And also, assuming xz does include the index built in, we won't need to create it ourselves.
It's worth a shot IMO, assuming the stars align. Has nice potential.
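For reference, the percentages and the "~3" and "~5" gaps can be recomputed from the sizes listed above. The sizes are the ones from the post; the helper name and dictionary labels are just for illustration.

```python
ORIGINAL_MB = 886.5  # ICO PAL, uncompressed

SIZES_MB = {
    "CSO": 744.5,
    "gzip + index": 727.5,
    "xz 128K": 703.0,
    "xz 256K": 701.3,
    "7z ultra": 656.2,
}


def ratio_percent(size_mb: float, original_mb: float = ORIGINAL_MB) -> float:
    """Compressed size as a percentage of the original size."""
    return 100.0 * size_mb / original_mb


# Gaps in percentage points between xz (128K blocks) and the others.
gap_vs_gzip = ratio_percent(SIZES_MB["gzip + index"]) - ratio_percent(SIZES_MB["xz 128K"])
gap_vs_cso = ratio_percent(SIZES_MB["CSO"]) - ratio_percent(SIZES_MB["xz 128K"])

for name, size in SIZES_MB.items():
    print(f"{name}: {ratio_percent(size):.1f}%")
```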
Posts: 6.069
Threads: 68
Joined: May 2010
Reputation:
167
Location: Grenoble, France
I don't know if the block index is recomputed, but the verbose list option gives you the full info. Here is an example with 4MB blocks. As far as I can tell, the block index is built in. The code is open source, so it would be easy to see how the API is used.
xz -l samurai_shodown.iso.xz -v
(command is only 0.004 seconds on my system)
0.00s user 0.00s system 73% cpu 0.004 total
Code:
samurai_shodown.iso.xz (1/1)
Streams: 1
Blocks: 118
Compressed size: 267.2 MiB (280,182,808 B)
Uncompressed size: 471.5 MiB (494,379,008 B)
Ratio: 0.567
Check: CRC64
Stream padding: 0 B
Streams:
Stream Blocks CompOffset UncompOffset CompSize UncompSize Ratio Check Padding
1 118 0 0 280,182,808 494,379,008 0.567 CRC64 0
Blocks:
Stream Block CompOffset UncompOffset TotalSize UncompSize Ratio Check
1 1 12 0 826,728 4,194,304 0.197 CRC64
1 2 826,740 4,194,304 1,432,708 4,194,304 0.342 CRC64
1 3 2,259,448 8,388,608 1,274,268 4,194,304 0.304 CRC64
1 4 3,533,716 12,582,912 1,258,216 4,194,304 0.300 CRC64
1 5 4,791,932 16,777,216 3,146,752 4,194,304 0.750 CRC64
1 6 7,938,684 20,971,520 3,238,596 4,194,304 0.772 CRC64
1 7 11,177,280 25,165,824 3,162,220 4,194,304 0.754 CRC64
1 8 14,339,500 29,360,128 3,103,728 4,194,304 0.740 CRC64
1 9 17,443,228 33,554,432 3,183,436 4,194,304 0.759 CRC64
1 10 20,626,664 37,748,736 3,176,432 4,194,304 0.757 CRC64
1 11 23,803,096 41,943,040 3,002,076 4,194,304 0.716 CRC64
1 12 26,805,172 46,137,344 3,076,620 4,194,304 0.734 CRC64
1 13 29,881,792 50,331,648 3,139,768 4,194,304 0.749 CRC64
1 14 33,021,560 54,525,952 3,152,960 4,194,304 0.752 CRC64
1 15 36,174,520 58,720,256 3,223,312 4,194,304 0.768 CRC64
1 16 39,397,832 62,914,560 3,286,380 4,194,304 0.784 CRC64
1 17 42,684,212 67,108,864 3,181,104 4,194,304 0.758 CRC64
1 18 45,865,316 71,303,168 3,126,760 4,194,304 0.745 CRC64
1 19 48,992,076 75,497,472 3,233,092 4,194,304 0.771 CRC64
1 20 52,225,168 79,691,776 3,226,288 4,194,304 0.769 CRC64
1 21 55,451,456 83,886,080 3,193,536 4,194,304 0.761 CRC64
1 22 58,644,992 88,080,384 3,208,496 4,194,304 0.765 CRC64
1 23 61,853,488 92,274,688 3,177,600 4,194,304 0.758 CRC64
1 24 65,031,088 96,468,992 3,098,884 4,194,304 0.739 CRC64
1 25 68,129,972 100,663,296 3,164,620 4,194,304 0.755 CRC64
1 26 71,294,592 104,857,600 3,162,408 4,194,304 0.754 CRC64
1 27 74,457,000 109,051,904 3,233,660 4,194,304 0.771 CRC64
1 28 77,690,660 113,246,208 3,231,196 4,194,304 0.770 CRC64
1 29 80,921,856 117,440,512 3,287,244 4,194,304 0.784 CRC64
1 30 84,209,100 121,634,816 3,259,448 4,194,304 0.777 CRC64
1 31 87,468,548 125,829,120 3,193,492 4,194,304 0.761 CRC64
1 32 90,662,040 130,023,424 3,249,788 4,194,304 0.775 CRC64
1 33 93,911,828 134,217,728 3,254,144 4,194,304 0.776 CRC64
1 34 97,165,972 138,412,032 3,252,232 4,194,304 0.775 CRC64
1 35 100,418,204 142,606,336 3,112,280 4,194,304 0.742 CRC64
1 36 103,530,484 146,800,640 3,016,108 4,194,304 0.719 CRC64
1 37 106,546,592 150,994,944 2,833,848 4,194,304 0.676 CRC64
1 38 109,380,440 155,189,248 2,985,240 4,194,304 0.712 CRC64
1 39 112,365,680 159,383,552 3,129,044 4,194,304 0.746 CRC64
1 40 115,494,724 163,577,856 3,296,076 4,194,304 0.786 CRC64
1 41 118,790,800 167,772,160 3,293,616 4,194,304 0.785 CRC64
1 42 122,084,416 171,966,464 3,233,924 4,194,304 0.771 CRC64
1 43 125,318,340 176,160,768 3,244,300 4,194,304 0.774 CRC64
1 44 128,562,640 180,355,072 3,078,304 4,194,304 0.734 CRC64
1 45 131,640,944 184,549,376 2,857,188 4,194,304 0.681 CRC64
1 46 134,498,132 188,743,680 2,981,416 4,194,304 0.711 CRC64
1 47 137,479,548 192,937,984 3,033,752 4,194,304 0.723 CRC64
1 48 140,513,300 197,132,288 3,325,816 4,194,304 0.793 CRC64
1 49 143,839,116 201,326,592 3,101,988 4,194,304 0.740 CRC64
1 50 146,941,104 205,520,896 3,143,560 4,194,304 0.749 CRC64
1 51 150,084,664 209,715,200 3,182,700 4,194,304 0.759 CRC64
1 52 153,267,364 213,909,504 3,152,172 4,194,304 0.752 CRC64
1 53 156,419,536 218,103,808 3,190,508 4,194,304 0.761 CRC64
1 54 159,610,044 222,298,112 3,209,988 4,194,304 0.765 CRC64
1 55 162,820,032 226,492,416 3,093,204 4,194,304 0.737 CRC64
1 56 165,913,236 230,686,720 2,258,816 4,194,304 0.539 CRC64
1 57 168,172,052 234,881,024 2,442,272 4,194,304 0.582 CRC64
1 58 170,614,324 239,075,328 3,133,204 4,194,304 0.747 CRC64
1 59 173,747,528 243,269,632 2,944,868 4,194,304 0.702 CRC64
1 60 176,692,396 247,463,936 3,242,988 4,194,304 0.773 CRC64
1 61 179,935,384 251,658,240 3,189,772 4,194,304 0.761 CRC64
1 62 183,125,156 255,852,544 3,226,088 4,194,304 0.769 CRC64
1 63 186,351,244 260,046,848 3,202,988 4,194,304 0.764 CRC64
1 64 189,554,232 264,241,152 3,095,920 4,194,304 0.738 CRC64
1 65 192,650,152 268,435,456 3,048,948 4,194,304 0.727 CRC64
1 66 195,699,100 272,629,760 3,233,480 4,194,304 0.771 CRC64
1 67 198,932,580 276,824,064 3,166,920 4,194,304 0.755 CRC64
1 68 202,099,500 281,018,368 3,256,820 4,194,304 0.776 CRC64
1 69 205,356,320 285,212,672 3,224,560 4,194,304 0.769 CRC64
1 70 208,580,880 289,406,976 3,123,812 4,194,304 0.745 CRC64
1 71 211,704,692 293,601,280 2,121,548 4,194,304 0.506 CRC64
1 72 213,826,240 297,795,584 2,502,800 4,194,304 0.597 CRC64
1 73 216,329,040 301,989,888 1,981,840 4,194,304 0.473 CRC64
1 74 218,310,880 306,184,192 2,294,236 4,194,304 0.547 CRC64
1 75 220,605,116 310,378,496 2,256,628 4,194,304 0.538 CRC64
1 76 222,861,744 314,572,800 1,604,572 4,194,304 0.383 CRC64
1 77 224,466,316 318,767,104 1,343,604 4,194,304 0.320 CRC64
1 78 225,809,920 322,961,408 2,123,544 4,194,304 0.506 CRC64
1 79 227,933,464 327,155,712 3,290,480 4,194,304 0.785 CRC64
1 80 231,223,944 331,350,016 1,832,088 4,194,304 0.437 CRC64
1 81 233,056,032 335,544,320 883,312 4,194,304 0.211 CRC64
1 82 233,939,344 339,738,624 1,055,812 4,194,304 0.252 CRC64
1 83 234,995,156 343,932,928 1,063,848 4,194,304 0.254 CRC64
1 84 236,059,004 348,127,232 931,352 4,194,304 0.222 CRC64
1 85 236,990,356 352,321,536 927,760 4,194,304 0.221 CRC64
1 86 237,918,116 356,515,840 848,308 4,194,304 0.202 CRC64
1 87 238,766,424 360,710,144 922,040 4,194,304 0.220 CRC64
1 88 239,688,464 364,904,448 800,376 4,194,304 0.191 CRC64
1 89 240,488,840 369,098,752 864,352 4,194,304 0.206 CRC64
1 90 241,353,192 373,293,056 775,836 4,194,304 0.185 CRC64
1 91 242,129,028 377,487,360 812,352 4,194,304 0.194 CRC64
1 92 242,941,380 381,681,664 725,104 4,194,304 0.173 CRC64
1 93 243,666,484 385,875,968 923,728 4,194,304 0.220 CRC64
1 94 244,590,212 390,070,272 804,912 4,194,304 0.192 CRC64
1 95 245,395,124 394,264,576 760,372 4,194,304 0.181 CRC64
1 96 246,155,496 398,458,880 1,207,096 4,194,304 0.288 CRC64
1 97 247,362,592 402,653,184 683,156 4,194,304 0.163 CRC64
1 98 248,045,748 406,847,488 446,724 4,194,304 0.107 CRC64
1 99 248,492,472 411,041,792 997,748 4,194,304 0.238 CRC64
1 100 249,490,220 415,236,096 624,248 4,194,304 0.149 CRC64
1 101 250,114,468 419,430,400 183,104 4,194,304 0.044 CRC64
1 102 250,297,572 423,624,704 122,120 4,194,304 0.029 CRC64
1 103 250,419,692 427,819,008 485,788 4,194,304 0.116 CRC64
1 104 250,905,480 432,013,312 1,729,088 4,194,304 0.412 CRC64
1 105 252,634,568 436,207,616 1,600,812 4,194,304 0.382 CRC64
1 106 254,235,380 440,401,920 1,405,776 4,194,304 0.335 CRC64
1 107 255,641,156 444,596,224 1,438,992 4,194,304 0.343 CRC64
1 108 257,080,148 448,790,528 1,022,028 4,194,304 0.244 CRC64
1 109 258,102,176 452,984,832 111,516 4,194,304 0.027 CRC64
1 110 258,213,692 457,179,136 85,244 4,194,304 0.020 CRC64
1 111 258,298,936 461,373,440 852,276 4,194,304 0.203 CRC64
1 112 259,151,212 465,567,744 3,184,004 4,194,304 0.759 CRC64
1 113 262,335,216 469,762,048 3,212,648 4,194,304 0.766 CRC64
1 114 265,547,864 473,956,352 2,377,236 4,194,304 0.567 CRC64
1 115 267,925,100 478,150,656 2,271,844 4,194,304 0.542 CRC64
1 116 270,196,944 482,344,960 3,978,324 4,194,304 0.949 CRC64
1 117 274,175,268 486,539,264 3,479,460 4,194,304 0.830 CRC64
1 118 277,654,728 490,733,568 2,527,156 3,645,440 0.693 CRC64
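Since every block in this listing except the last holds exactly 4,194,304 uncompressed bytes, finding the block for a given image offset is plain integer division. A small sketch using the numbers from the listing (the function names are made up; only the constants come from the xz output):

```python
BLOCK_SIZE = 4_194_304            # UncompSize of every full block
TOTAL_UNCOMPRESSED = 494_379_008  # "Uncompressed size" from the summary
BLOCK_COUNT = 118                 # "Blocks" from the summary


def block_number(offset: int) -> int:
    """Return the 1-based block number (as xz lists it) holding offset."""
    if not 0 <= offset < TOTAL_UNCOMPRESSED:
        raise ValueError("offset is outside the image")
    return offset // BLOCK_SIZE + 1


def block_start(number: int) -> int:
    """Return the UncompOffset of the given 1-based block."""
    return (number - 1) * BLOCK_SIZE


# Byte 100,000,000 of the ISO lives in block 24, which starts at 96,468,992.
print(block_number(100_000_000), block_start(24))
```

The compressed offset still has to come from the index, because compressed block sizes vary.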
Posts: 314
Threads: 4
Joined: Sep 2012
Reputation:
9
Location: Corte
Pinging, and wondering again about LZHAM now that we are considering xz.
7-zip plugin is here
Posts: 1.909
Threads: 28
Joined: Jun 2010
Reputation:
75
(11-29-2015, 06:53 PM)mirh Wrote: Pinging, and wondering again about LZHAM now that we are considering xz.
7-zip plugin is here
Very unlikely to happen:
1. It uses a LOT of memory during compression, by its own admission.
2. It is not widely used and is therefore likely to be immature.
3. It is poorly optimized for random access to small chunks (which is what PCSX2 needs):
Quote: Known Problems
LZHAM's decompressor is like a drag racer that needs time to get up to speed. LZHAM is not intended or optimized to be used on "small" blocks of data (less than ~10,000 bytes of *compressed* data on desktops, or around 1,000-5,000 on iOS). If your usage case involves calling the codec over and over with tiny blocks then LZMA, LZ4, Deflate, etc. are probably better choices.