Exactly. I would think a layered cache scheme could probably give the best of both worlds (a good amount of cached content on disk, and fast hits on uncompressed images from memory). But using an mmap'ed file instead of writing your own layered cache seems like less hassle. Also, you are doing decompression on a background thread, which shouldn't be much of a bottleneck on the 4S and later devices (probably, again). I need to run some tests to back up my claims, obviously :)
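For what it's worth, the mmap route can be sketched roughly like this in plain C. This is only a sketch and assumes the decompressed bitmap has already been written to a cache file; `mapped_file`, `map_file_readonly`, and `unmap_file` are made-up names for illustration, not part of any library.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper type: a read-only view of a cached bitmap file. */
typedef struct {
    const void *bytes;  /* start of the mapping, or NULL on failure */
    size_t length;      /* size of the mapping in bytes */
} mapped_file;

/* Map the whole file read-only. Pages are faulted in lazily by the kernel,
   so nothing is copied into memory until the bytes are actually touched. */
mapped_file map_file_readonly(const char *path) {
    mapped_file result = { NULL, 0 };
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return result;
    }

    struct stat info;
    if (fstat(fd, &info) == 0 && info.st_size > 0) {
        void *p = mmap(NULL, (size_t)info.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            result.bytes = p;
            result.length = (size_t)info.st_size;
        }
    }
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return result;
}

/* Release the mapping and reset the struct so it cannot be reused by accident. */
void unmap_file(mapped_file *file) {
    if (file->bytes != NULL) {
        munmap((void *)file->bytes, file->length);
        file->bytes = NULL;
        file->length = 0;
    }
}
```

The appeal is that clean read-only pages backed by a file can be dropped by the kernel under memory pressure and read back later, which is roughly what a hand-rolled memory-plus-disk cache would otherwise have to manage itself.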
If you're using libdispatch on the 4S, you might consider using a serial queue (for NSOperationQueue, just set maxConcurrentOperationCount to 1), as concurrent decoding starts to eat the CPU pretty quickly if you're trying to load a lot of images.
Even on the 5, I was surprised by how much it helped to have the images cached (no anecdotal data on the 5S, as it came out well after we started caching).