The one I am a big fan of is PGAS/SPMD, which takes advantage of the very low-latency, high-bandwidth shared memory between the cores on a chip and across multiple interconnected chips. The other "nice" model is the Actor model, since each core is fully independent and the model makes use of true concurrency. The longer-term vision is true MIMD/MPMD, where isolated, fully independent programs each run on a single core or a group of cores. That is not something we plan to work on initially, though, as our first customers are primarily interested in running a single large application across all of the cores.
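To make the PGAS/SPMD idea a bit more concrete, here is a minimal sketch in Python that uses threads to stand in for cores. It is only an illustration under stated assumptions, not the chip's actual toolchain or API: `NUM_CORES`, `CHUNK`, `kernel`, and `run` are hypothetical names, and a plain Python list plays the role of the shared memory. Every "core" runs the same function with a different rank (SPMD), writes only to the slice of the global array it owns, and then reads a neighbour's slice directly through the shared address space (PGAS).

```python
import threading

NUM_CORES = 4   # hypothetical core count, for illustration only
CHUNK = 3       # number of elements owned by each core

# One logically global array, partitioned by owner:
# core r owns indices [r * CHUNK, (r + 1) * CHUNK).
global_array = [0] * (NUM_CORES * CHUNK)
neighbor_sums = [0] * NUM_CORES
barrier = threading.Barrier(NUM_CORES)


def kernel(rank):
    """The same program runs on every core; only `rank` differs (SPMD)."""
    lo = rank * CHUNK

    # Phase 1: each core writes only to the partition it owns.
    for i in range(lo, lo + CHUNK):
        global_array[i] = rank * 10 + (i - lo)

    # All local writes must finish before any core reads remote data.
    barrier.wait()

    # Phase 2: read the right-hand neighbour's partition directly,
    # with no explicit message passing.
    right = (rank + 1) % NUM_CORES
    neighbor_sums[rank] = sum(global_array[right * CHUNK:(right + 1) * CHUNK])


def run():
    """Launch one thread per simulated core and wait for all of them."""
    threads = [threading.Thread(target=kernel, args=(r,)) for r in range(NUM_CORES)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(global_array), list(neighbor_sums)


if __name__ == "__main__":
    data, sums = run()
    print(data)
    print(sums)
```

The barrier between the two phases is the important design point: because remote reads go straight to shared memory, the only coordination needed is to make sure the owner has finished writing before anyone else reads.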