Way back when, in early 2015, Nvidia said that it would be able to deliver Pascal GPUs with 32 GB of HBM memory on the package that delivered 1 TB/sec of bandwidth into and out of that GPU memory. What really happened was that Nvidia was only able to get 16 GB of memory on the package and only delivered 720 GB/sec of bandwidth with that on-package HBM with the Tesla P100 card. No one is making promises about the amount of GPU memory or bandwidth coming with Volta, as you can see above. What Nvidia has said, way back in 2015, is that the Volta GP100 GPUs would deliver a relative single precision floating general matrix multiply (SGEMM) of 72 gigaflops per watt, compared to 40 SGEMM gigaflops per watt for the Pascal GP100.
If you use that ratio, and then cut it in half for double precision, then a Volta GPU held at a constant 300 watts (the same as the Pascal package) would have a little over 9.5 teraflops of double precision performance, and four of them would deliver 38.2 teraflops of oomph, the vast majority of the more than 40 teraflops of performance expected in the Summit node.