Performance of C++/CLI function pointers versus .NET delegates

August 31, 2015

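For my C++/CLI project I tried to measure the cost of C++/CLI function pointers versus .NET delegates.
My expectation was that C++/CLI function pointers would be faster than .NET delegates. So my test counts how many calls are made through a .NET delegate and through a native function pointer, each over a 5-second window.
Results
The results were (and still are) shocking to me:
.NET delegate: 910M executions with result 152080413333030 in 5003ms
Function pointer: 347M executions with result 57893422166551 in 5013ms
That means using the native C++/CLI function pointer is almost 3 times slower (910M vs. 347M calls, a factor of about 2.6) than using a managed delegate from within C++/CLI code. How can that be? Should I use managed constructs whenever I need interfaces, delegates or abstract classes in performance-critical sections?
Test code
The function that is called repeatedly: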
__int64 DoIt(int n, __int64 sum)
{
   if ((n % 3) == 0)
       return sum + n;
   else
       return sum + 1;
}
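The code that calls the method uses all parameters as well as the return value, so that nothing gets optimized away (hopefully). Here is the code (for the .NET delegate):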
__int64 executions;
__int64 result;
System::Diagnostics::Stopwatch^ w = gcnew System::Diagnostics::Stopwatch();

System::Func<int, __int64, __int64>^ managedPtr = gcnew System::Func<int, __int64, __int64>(&DoIt);
w->Restart();
executions = 0;
result = 0;
while (w->ElapsedMilliseconds < 5000)
{
   for (int i=0; i < 1000000; i++)
       result += managedPtr(i, executions);
   executions++;
}
System::Console::WriteLine(".NET delegate:       {0}M executions with result {2} in {1}ms", executions, w->ElapsedMilliseconds, result);
Similar to the .NET delegate invocation, here is the version using a C++ function pointer:
typedef __int64 (* DoItMethod)(int n, __int64 sum);

DoItMethod nativePtr = DoIt;
w->Restart();
executions = 0;
result = 0;
while (w->ElapsedMilliseconds < 5000)
{
   for (int i=0; i < 1000000; i++)
       result += nativePtr(i, executions);
   executions++;
}
System::Console::WriteLine("Function pointer:    {0}M executions with result {2} in {1}ms", executions, w->ElapsedMilliseconds, result);
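The batch loop used in both tests can also be written in portable ISO C++, which makes it easy to check the arithmetic outside a /clr build. This is only a minimal sketch, not the original test code: DoItPortable, RunBatch and the count parameter are names introduced here for illustration, and std::int64_t stands in for __int64.

```cpp
#include <cassert>
#include <cstdint>

// Same arithmetic as DoIt in the question, written with a fixed-width type.
std::int64_t DoItPortable(int n, std::int64_t sum)
{
    return (n % 3 == 0) ? sum + n : sum + 1;
}

// Function-pointer type with the same shape as the DoItMethod typedef above.
using DoItMethod = std::int64_t (*)(int, std::int64_t);

// Hypothetical helper: runs one batch of `count` calls and folds every
// return value into the total, so the calls cannot be dropped as dead code.
template <typename Callable>
std::int64_t RunBatch(Callable&& call, int count, std::int64_t executions)
{
    assert(count >= 0);
    std::int64_t result = 0;
    for (int i = 0; i < count; ++i)
        result += call(i, executions);
    return result;
}
```

With count = 6 and executions = 0, a batch called through a DoItMethod pointer returns 0 + 1 + 1 + 3 + 1 + 1 = 7. Taking the callable as a template parameter lets the same loop accept a function pointer, a lambda or a functor.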
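Additional information
Compiled with Visual Studio 2012
Target framework is .NET Framework 4.5
Release build (the execution counts stay proportional in a Debug build)
Calling convention is __stdcall (__fastcall is not allowed when the project is compiled with CLR support)
All tests performed:
.NET virtual method: 1025M executions with result 171358304166325 in 5004ms
.NET delegate: 910M executions with result 152080413333030 in 5003ms
Virtual method: 336M executions with result 56056335999888 in 5006ms
Function pointer: 347M executions with result 57893422166551 in 5013ms
Function call: 1459M executions with result 244230520832847 in 5001ms
Inline function: 1385M executions with result 231791984166205 in 5000ms
The direct call to "DoIt" is represented here by "Function call". It appears to have been inlined by the compiler, since its execution count does not differ (significantly) from that of the call to the inline function.
A call to a C++ virtual method is as "slow" as a call through the function pointer. A virtual method of a managed class (ref class) is as fast as the .NET delegate.
Update: I dug a little deeper, and it seems that in the tests using unmanaged functions, a transition to native code happens on every call to the DoIt function. So I wrapped the inner loop in another function that I forced to be compiled as unmanaged: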
#pragma managed(push, off)
__int64 TestCall(__int64* executions)
{
   __int64 result = 0;
   for (int i=0; i < 1000000; i++)
       result += DoItNative(i, *executions);
   (*executions)++;
   return result;
}
#pragma managed(pop)
In addition, I tested std::function like this:
#pragma managed(push, off)
__int64 TestStdFunc(__int64* executions)
{
   __int64 result = 0;
   std::function<__int64(int, __int64)> func(DoItNative);
   for (int i=0; i < 1000000; i++)
       result += func(i, *executions);
   (*executions)++;
   return result;
}
#pragma managed(pop)
Now, the new results are:
Function call: 2946M executions with result 495340439997054 in 5000ms
std::function: 160M executions with result 26679519999840 in 5018ms
std::function is a bit disappointing.
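To compare a raw function pointer with std::function outside the CLR, both can be driven through the same loop in plain C++. The sketch below is an illustration under stated assumptions, not the code that produced the numbers above: DoItNativeSketch stands in for DoItNative (whose definition is not shown in the question), and Timing and TimeBatches are hypothetical helpers. It runs a fixed number of batches instead of a 5-second window, so that the result values are directly comparable between call paths.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>

// Assumed body for the native function; it mirrors DoIt from the question.
std::int64_t DoItNativeSketch(int n, std::int64_t sum)
{
    return (n % 3 == 0) ? sum + n : sum + 1;
}

struct Timing
{
    std::int64_t result;              // sum of all return values
    std::chrono::nanoseconds elapsed; // wall time spent in the loops
};

// Hypothetical helper: runs `batches` batches of `count` calls each and
// measures the elapsed time with a monotonic clock.
template <typename Callable>
Timing TimeBatches(Callable&& call, int batches, int count)
{
    assert(batches >= 0 && count >= 0);
    const auto start = std::chrono::steady_clock::now();
    std::int64_t result = 0;
    for (int executions = 0; executions < batches; ++executions)
        for (int i = 0; i < count; ++i)
            result += call(i, executions);
    const auto stop = std::chrono::steady_clock::now();
    return {result,
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

Passing a function pointer and a std::function that wrap the same function should yield identical result values. Only elapsed should differ, and by how much depends on the compiler, the optimization level and the hardware.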

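You are seeing the cost of "double thunking". The core problem with your DoIt() function is that it is being compiled as managed code. The delegate call is very fast, because going from managed code to managed code through a delegate is uncomplicated. The function pointer is slow, however: the compiler automatically generates code that first switches from managed to unmanaged code and then makes the call through the function pointer. That call ends up in a stub that switches from unmanaged code back to managed code and calls DoIt().
Presumably what you really meant to measure was a call to native code. Use a #pragma to force DoIt() to be generated as machine code, like this: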
#pragma managed(push, off)
__int64 DoIt(int n, __int64 sum)
{
   if ((n % 3) == 0)
       return sum + n;
   else
       return sum + 1;
}
#pragma managed(pop)
You will now see the function pointer being faster than the delegate.

Source: https://stackoverflow.com/questions/13443250
