DRAM and Flash store bits in fundamentally the same way: charge (on-bit) or a lack of charge (off-bit) is stored in a capacitor (think of a very small battery), which is tested later to detect whether the charge is present. In DRAM this is a literal capacitor; in Flash the charge sits on an electrically insulated gate inside a transistor, which behaves like a capacitor that barely leaks. MLC Flash uses the same technique but varies the level of charge stored in order to get more than 1 bit per capacitor.
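To make the MLC idea concrete, here is a minimal sketch of how several charge levels translate into more than 1 bit per capacitor (cell). The function names and the evenly spaced levels are assumptions made for illustration; real chips use carefully tuned voltage thresholds and their own mapping from levels to bit patterns.

```python
def levels_needed(bits_per_cell: int) -> int:
    """Distinct charge levels a cell must hold to store this many bits."""
    return 2 ** bits_per_cell


def decode_level(charge_fraction: float, bits_per_cell: int) -> int:
    """Map a measured charge (0.0 = empty, 1.0 = full) to the nearest level index.

    Assumes evenly spaced levels, which is a simplification.
    """
    levels = levels_needed(bits_per_cell)
    index = round(charge_fraction * (levels - 1))
    # Clamp in case the measurement falls slightly outside the nominal range.
    return max(0, min(levels - 1, index))


if __name__ == "__main__":
    for bits in (1, 2, 3):
        print(f"{bits} bit(s) per cell needs {levels_needed(bits)} charge levels")
    # A 2-bit cell measured at roughly one third of full charge.
    print("decoded level:", decode_level(0.34, bits_per_cell=2))
```

The catch is visible in the numbers: every extra bit doubles the number of levels that must be told apart, so the margin for error in measuring the charge shrinks quickly.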
Flash chips hold a lot more data than DRAM. This sounds intuitive when you remember that Flash goes in solid-state drives, which have far more capacity than the memory (DRAM) of a typical computer. But the difference is more striking when the two chips are placed side-by-side, because the Flash chip is basically the same size as the DRAM chip (here's a neat picture showing how two Flash chips fit in one SD card). Case in point, I was browsing Flash integrated circuits (ICs) and stumbled upon a monster at Micron. The Micron catalog shows the MT29F256G08CUCBBH3-12 coming in at 32 gigabytes (256 gigabits). For a frame of reference, the best DRAM chips (Micron catalog) hold 512 megabytes (4 gigabits). That's a difference of 64x!
The transistors are similar sizes for the latest DRAM and Flash, so the capacity difference is achieved by putting multiple Flash layers on top of each other (this is on its way to achieving 128GB Flash chips). 3D chip stacking has a traditional problem of overheating because each layer consumes power and the layers insulate one another, causing the internal temperature to escalate. Stacked Flash largely avoids this problem because holding a stored charge consumes no power, and Flash keeps its charge for years, so most of the layers are not in use at any given moment. Thus, Flash chips are utilizing their dark silicon to achieve extreme densities.
On the other hand, Flash is slow and only supports low data transfer speeds. For the example above, the Flash chip runs at 166 MHz at one byte per cycle, whereas DDR3 DRAM achieves an effective 1600 MHz (an 800 MHz clock transferring on both edges, i.e. DDR) at one byte per transfer. So DRAM chips allow about 10x the transfer rates of Flash chips.
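The transfer-rate comparison is simple enough to check in a few lines. This is a back-of-the-envelope sketch using only the figures quoted above; the helper name and the assumption that both chips have an 8-bit data bus are mine, and it computes peak rates only, ignoring latency and command overhead.

```python
MHZ = 1_000_000


def peak_bytes_per_second(transfers_per_second: float, bus_width_bits: int = 8) -> float:
    """Peak bandwidth of a bus: transfers per second times bytes per transfer."""
    return transfers_per_second * bus_width_bits / 8


flash_bw = peak_bytes_per_second(166 * MHZ)   # 166 MHz, one byte per cycle
dram_bw = peak_bytes_per_second(1600 * MHZ)   # 800 MHz clock, double data rate
ratio = dram_bw / flash_bw

if __name__ == "__main__":
    print(f"Flash peak: {flash_bw / 1e6:.0f} MB/s")
    print(f"DRAM peak:  {dram_bw / 1e6:.0f} MB/s")
    print(f"DRAM / Flash: {ratio:.1f}x")
```

The ratio works out to a little under 10, which is where the "about 10x" figure comes from.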
Lastly, there is not much benefit in fetching data from Flash in chunks much smaller than 4 KB, meaning about 4,000 cycles over an 8-bit bus. This is why SSD speed is sometimes measured in IOPS (I/O operations per second), where a 166 MHz Flash chip operating on 4 KB blocks with an 8-bit interconnect provides about 40,000 IOPS (with a really good controller). In contrast, DDR generally has a minimum of 8 transfers per access, which, over an 8-bit connection, is 8 bytes per access. At an effective 1600 MHz this is 200,000,000 accesses per second, or about 5000x as many as the Flash chip. Thus, DRAM allows the memory bandwidth to be dedicated to many accesses of smaller chunks of data instead of only very large chunks like Flash.
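Here is the same arithmetic as a short sketch. The function name is hypothetical, and the calculation is an upper bound that only counts time spent moving data over the bus; it assumes an ideal controller and ignores the time the chip needs to locate and read a block internally.

```python
def accesses_per_second(transfers_per_second: float,
                        bytes_per_access: int,
                        bus_width_bits: int = 8) -> float:
    """Upper bound on accesses per second when the bus is the only limit."""
    bytes_per_transfer = bus_width_bits // 8
    transfers_per_access = bytes_per_access / bytes_per_transfer
    return transfers_per_second / transfers_per_access


# Flash: 166 MHz, 8-bit bus, 4 KB (4096-byte) minimum useful chunk.
flash_iops = accesses_per_second(166e6, 4096)
# DDR3: 1600 million transfers per second, 8-bit bus, burst of 8 transfers.
dram_iops = accesses_per_second(1600e6, 8)
iops_ratio = dram_iops / flash_iops

if __name__ == "__main__":
    print(f"Flash: {flash_iops:,.0f} accesses/s")
    print(f"DRAM:  {dram_iops:,.0f} accesses/s")
    print(f"DRAM / Flash: {iops_ratio:,.0f}x")
```

With exact 4096-byte blocks the ratio comes out slightly below 5000x; rounding the Flash figure down to 40,000 gives the 5000x quoted above.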
The differences between DRAM and Flash are indeed striking, with Flash providing nearly two orders of magnitude greater density (64x) and DRAM providing nearly four orders of magnitude more operations per second (about 5000x).
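For anyone who wants to check the "orders of magnitude" wording, an order of magnitude is a factor of 10, so the count is just the base-10 logarithm of the ratio. A quick sketch using the rounded figures from this post (the helper name is mine):

```python
import math


def orders_of_magnitude(ratio: float) -> float:
    """How many factors of 10 a ratio represents."""
    return math.log10(ratio)


density_ratio = (32 * 1024) / 512      # 32 GB Flash vs 512 MB DRAM, in MB
ops_ratio = 200_000_000 / 40_000       # DRAM accesses/s vs Flash IOPS

if __name__ == "__main__":
    print(f"density: {density_ratio:.0f}x = {orders_of_magnitude(density_ratio):.1f} orders")
    print(f"ops/sec: {ops_ratio:.0f}x = {orders_of_magnitude(ops_ratio):.1f} orders")
```

That gives roughly 1.8 and 3.7, so both figures are rounded up a little.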
Edit: You can see a follow-up post here.