Saturday, January 28, 2012

Performance profiling in Visual Studio under a Virtual Machine (Sampling vs. Instrumentation)


The bad news is that Visual Studio’s own, excellent Performance Profiler does not support Sampling under any virtual machine environment – only Instrumentation.

The profiler will not fail outright, but it will not collect any performance data if you try Sampling while running in a virtual machine (such as Parallels Desktop or VMware Player / Workstation).

Visual Studio - Branch Misprediction with Sampling
Sampling accesses the hardware (the CPU's performance counters) directly, so it is very fast and detailed: it gives you a line-level performance audit and can even report branch misprediction results if you are interested (to see the latter, open the Properties of the Performance Explorer, go to the Sampling section, switch to Performance Counter, then open the Branch Events tree node).



On the other hand, Instrumentation needs a special build: it basically injects data collection functions into your compiled code. Hence the instrumented code generally runs about an order of magnitude (10x) slower than it does under Sampling, and the results are far less detailed: they are function level only, and they only measure the time spent in each function. For instance, if you are writing a multi-threaded app, it is likely that you will spend some time in Thread.Sleep to synchronize your threads. This is idle time for the CPU, so Sampling does not really care, but it will be the biggest “problem” according to Instrumentation, which can be really misleading.
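To make the Thread.Sleep point concrete, here is a minimal sketch in Python rather than .NET. It is only an analogy, not how the Visual Studio profiler works internally: wall-clock time stands in for what Instrumentation reports (time elapsed inside a function), and process CPU time stands in for what Sampling notices (time the CPU was actually busy). The names busy_work, idle_wait and measure are made up for this illustration.

```python
import time


def busy_work(n=200_000):
    # CPU-bound loop: uses both wall-clock time and CPU time.
    total = 0
    for i in range(n):
        total += i * i
    return total


def idle_wait(seconds=0.2):
    # Stand-in for Thread.Sleep: wall-clock time passes, the CPU stays idle.
    time.sleep(seconds)


def measure(func):
    """Return (wall_seconds, cpu_seconds) for a single call of func."""
    wall_start = time.perf_counter()   # elapsed time, includes sleeping
    cpu_start = time.process_time()    # CPU time of this process, excludes sleeping
    func()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return wall_used, cpu_used


if __name__ == "__main__":
    for func in (busy_work, idle_wait):
        wall, cpu = measure(func)
        print(f"{func.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

Judged by wall-clock time, idle_wait looks expensive even though it uses almost no CPU time, which is the same trap as reading Thread.Sleep as a hotspot in an Instrumentation report.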

So, if you want to measure performance, boot directly into Windows rather than any virtual machine.

1 comment:

  1. I've tried both sampling and instrumented profiling and, besides speed of execution, the major difference that I see is that instrumented code provides absolute timings granular to .01 msec, whereas sampling will only tell you the percentage of time spent in each function. So for a quick glance at the most time-consuming functions, sampling is ok, but when the absolute length of time within a function is needed, you need to use instrumentation. As for the performance impact of instrumentation, in my CPU-bound app I see that simply adding instrumentation to your code reduces performance by a factor of 2 and turning on profiling further reduces it by a factor of 3.5, so the total impact on the code is a factor of 7, which is close to this blog post's estimate of 10x.
