What is great about Flash is that it is essentially "dark silicon": silicon that is not actively switching transistors most of the time, which means less power consumption and therefore less heat. This is because the charge that stores each bit is held in nonvolatile floating-gate cells, which never need to be refreshed and are only read or written when a user program needs to read or modify the relevant data. Chip stacking has the unfortunate downside of insulating the inner layers, which severely diminishes the ability to cool the overall package. At 0.5 to 2 watts in aggregate (or 0 watts when inactive), Flash chips and modules are in no danger of overheating. This makes Flash the type of chip most amenable to stacking, so it is no surprise that chip stacking has only really taken off with the recent attempts of Flash manufacturers to beat Moore's Law on chip density (note that single-die Flash chips at 20 nm are only now reaching 16 GB).
Here we can see a beautiful picture of Toshiba stacking 8 Flash chips on top of each other. All told, the package is just 1.4 mm thick, including the plastic packaging!
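To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted in this post (8 dies, 1.4 mm total, 0.5 to 2 watts in aggregate). The function names are made up for illustration, and two things are assumptions rather than facts: that the aggregate power figure applies to this particular 8-die stack, and that the power is split evenly across the dies.

```python
# Figures quoted in the post.
STACK_DIES = 8
PACKAGE_THICKNESS_MM = 1.4         # includes the plastic packaging
AGGREGATE_POWER_W = (0.5, 2.0)     # assumed to apply to the whole stack

def per_die_thickness_budget_um(package_mm: float, dies: int) -> float:
    """Upper bound on the thickness available per die, in micrometres.

    The real dies must be thinner than this, because the plastic
    packaging and any spacers also take up part of the total.
    """
    return package_mm * 1000.0 / dies

def per_die_power_share_w(aggregate_w: float, dies: int) -> float:
    """Power per die, ASSUMING the aggregate is split evenly."""
    return aggregate_w / dies

budget_um = per_die_thickness_budget_um(PACKAGE_THICKNESS_MM, STACK_DIES)
low_w, high_w = (per_die_power_share_w(p, STACK_DIES) for p in AGGREGATE_POWER_W)

print(f"Thickness budget per die: at most {budget_um:.0f} micrometres")
print(f"Power per die (even split assumed): {low_w:.4f} W to {high_w:.4f} W")
```

Even at the top of the range, each layer would be dissipating only a fraction of a watt under these assumptions, which is why the insulating effect of stacking matters so little for Flash.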
Note another reason that Flash is amenable to stacking: it doesn't support high bandwidth, so pins on the edges of the chips are sufficient to handle the reads and writes. This is truly profound. In conjunction with its dark silicon characteristics, we get the dynamic duo that enables chip stacking today: stacked Flash needs neither cooling improvements nor 3D interconnection of the dies. Slam dunk.
The 3D integrated circuits of tomorrow will be better at providing high-bandwidth communication between the stacked chips (see here, here, here), but heat will continue to be an issue.