Performance is reproducibly worse in applications that swap data back and forth, such as manipulating a texture currently live in video memory (which requires copying it back to system memory and then writing it back).

Can you provide an example of a reproducible benchmark I can use to test this?