IOCP again
-
Hi guys. I am stuck with IOCP again. The problem is that I just cannot create more than one worker thread. If I create more than one worker thread, for example 10, and there are 10 client connections on the other side, everything crashes in random places: it crashes because the buffer for WSARecv is not allocated, although in the debugger I can see a perfectly allocated buffer, or it crashes because the socket is invalid. I just don't get it. With only one thread it works perfectly; with more threads it becomes unstable, and with 10 or more it crashes after 1 second. Is there some magic trick I am missing again? Thanks
011011010110000101100011011010000110100101101110 0110010101110011
-
It sounds like there are more issues than can easily be solved via a forum post. I'd suggest you look at some of the IOCP posts here. In particular, Len Holgate wrote a good series:
A reusable, high performance, socket server class - Part 1
A reusable, high performance, socket server class - Part 2
A reusable, high performance, socket server class - Part 3
Handling multiple pending socket read and write operations
...cmk The idea that I can be presented with a problem, set out to logically solve it with the tools at hand, and wind up with a program that could not be legally used because someone else followed the same logical steps some years ago and filed for a patent on it is horrifying. - John Carmack
-
I could only guess without seeing the code. Some comments and random thoughts though...

The IOCP knows nothing about how many threads there are beyond how many are waiting on ::GetQueuedCompletionStatus(). There's no correlation between the number of sockets (one for each client) associated with the IOCP and the number of worker threads. In fact, one thread per client is wrong and exactly what IOCPs are meant to eliminate. So something is up in your code (obviously? :)).

You can't share completion packets (OVERLAPPED structs) between sockets. If the threads are accessing any common data objects then synchronization must be used.

For what it's worth, here's an example of an OVERLAPPED struct from one of my apps. Maybe it will spark something on your end...

Notes:
CClientConnection is a class that holds client information and also holds the socket handle and associated information.
pData is the I/O buffer, and is reallocated for each I/O operation. In my protocol I send a header of a known size first. That header has the number of following data bytes to expect.
dwDataLength and dwDataLengthProcessed are used to handle situations where a single WSASend/WSARecv call doesn't process the total number of bytes requested. This MUST be done! There's no guarantee send and recv operations will send or receive all the bytes in one call. Successful completion could mean just a single byte!
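The partial-transfer rule in the last note can be sketched in portable C++. This is a simulation, not Winsock code: `TransferState`, `FakeSocket`, `on_completion` and `send_all` are hypothetical names, and `FakeSocket` merely caps how many bytes move per call to mimic a completion that reports fewer bytes than requested.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-operation bookkeeping, loosely modeled on the
// dwDataLength / dwDataLengthProcessed fields described above.
struct TransferState {
    std::vector<unsigned char> data;  // bytes the operation must move in total
    std::size_t processed = 0;        // bytes moved so far

    std::size_t remaining() const { return data.size() - processed; }
    bool done() const { return processed == data.size(); }
};

// Stand-in for a socket that moves at most `chunk` bytes per call, the way a
// real send may complete with fewer bytes than were requested.
struct FakeSocket {
    std::size_t chunk;
    std::vector<unsigned char> wire;  // everything "sent" so far
    int calls = 0;

    explicit FakeSocket(std::size_t c) : chunk(c) {}

    std::size_t send_some(const unsigned char* p, std::size_t n) {
        std::size_t k = std::min(n, chunk);
        wire.insert(wire.end(), p, p + k);
        ++calls;
        return k;
    }
};

// Called once per completion: record the bytes that moved and report whether
// the operation has to be issued again for the rest.
bool on_completion(TransferState& st, std::size_t bytes_transferred) {
    assert(bytes_transferred <= st.remaining());
    st.processed += bytes_transferred;
    return !st.done();
}

// Drive one logical send until every byte has moved. Returns false if the
// socket stops making progress (zero bytes transferred).
bool send_all(FakeSocket& sock, TransferState& st) {
    while (!st.done()) {
        std::size_t n = sock.send_some(st.data.data() + st.processed,
                                       st.remaining());
        if (n == 0) return false;
        on_completion(st, n);
    }
    return true;
}
```

In a real IOCP worker the same bookkeeping would run on each completion, reissuing the WSASend or WSARecv for the remaining bytes instead of looping.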
// OVERLAPPEDOP_xxx used for overlapped IO
#define OVERLAPPEDOP_NOOP 0
#define OVERLAPPEDOP_SOCKACCEPT 1 // CompletionKey = SOCKET, pData = 0
#define OVERLAPPEDOP_SOCKSEND 2 // CompletionKey = SOCKET, pData = NETCOMMPACKET*
#define OVERLAPPEDOP_SOCKRECVHEADER 3 // CompletionKey = SOCKET, pData = NETPACKETHEADER*
#define OVERLAPPEDOP_SOCKRECVPACKET 4 // CompletionKey = SOCKET, pData = NETCOMMPACKET*
#define OVERLAPPEDOP_EXITTHREAD 5 // CompletionKey = 0
// MANAGEROP_xxx used for server manager thread operations
#define MANAGEROP_ADDCLIENTCONNECTION 1 // pData = NEWUSERSOCKETINFO*
#define MANAGEROP_REMOVECLIENTCONNECTION 2 // lParam1 = CClientConnection*
#define MANAGEROP_REMOVEIOCPTHREAD 3 // lParam1 = CIOCPHandlerThread*
#define MANAGEROP_BROADCASTMESSAGE 4 // lParam1 = NETCOMMPACKETLITE* (alloc'd as BYTE*), lParam2 = CClientConnection* client to exclude
//-------------------------------------------------------------------------
#pragma pack( push, SRVRMGR_OVERLAPPEDpack, 1 )
//-------------------------------------------------------------------------
struct
-
Hey Mark. The worst thing is, I am sure that I am doing everything exactly the same way, but something is always wrong. I can only give you the code if I can send it to you: it is a solution with 5 projects and a LOT of lines. If you could take a look at it and point out the mistakes I would appreciate it so much!!! (because I desperately need help with this :( ) Can I ask you for this, please? My email: info[at]machinized[dot]com, or maybe you can post yours and then I can send you the code. I am in pain :(
011011010110000101100011011010000110100101101110 0110010101110011
-
Here is the code, in short:
1. Extended overlapped:
struct IOContext
{
WSAOVERLAPPED m_Overlapped;
enum IOOperation { IOAccept, IORead, IOWrite, IOConnect, IODisconnect, IODefault, IOTerminate, IOTerminateStopService };
// some more data here
//................
IOOperation m_IOOperation;
IOSocket * m_pSocket; // this is wrapper class for WSA functions
LPBYTE m_pbIoRecvBuffer; // recv buffer
DWORD m_cbIoRecvBuffer; // currently received bytes
DWORD m_sbIoRecvBuffer; // bytes to receive
LPBYTE m_pbIoSendBuffer; // send buffer
DWORD m_cbIoSendBuffer; // currently sent bytes
DWORD m_sbIoSendBuffer; // bytes to send
bool GrowRecvBuffer();
bool FlushRecvBuffer();
bool GrowSendBuffer();
bool FlushSendBuffer();
bool AllocSendBuffer(DWORD dwSize);
};
bool IOContext::GrowRecvBuffer()
{
m_sbIoRecvBuffer += 2048;
m_pbIoRecvBuffer = (LPBYTE)::realloc(m_pbIoRecvBuffer,
m_sbIoRecvBuffer);
if(m_pbIoRecvBuffer == NULL)
{
return false;
}
return true;
}
bool IOContext::FlushRecvBuffer()
{
m_sbIoRecvBuffer = 0;
m_cbIoRecvBuffer = 0;
if(m_pbIoRecvBuffer)
{
::free(m_pbIoRecvBuffer);
m_pbIoRecvBuffer = NULL;
}
return true;
}
bool IOContext::GrowSendBuffer()
{
return true;
}
bool IOContext::FlushSendBuffer()
{
m_cbIoSendBuffer = 0;
m_sbIoSendBuffer = 0;
if(m_pbIoSendBuffer)
{
::free(m_pbIoSendBuffer);
m_pbIoSendBuffer = NULL;
}
return true;
}
bool IOContext::AllocSendBuffer(DWORD dwSize)
{
m_sbIoSendBuffer = dwSize;
if(m_pbIoSendBuffer)
{
::free(m_pbIoSendBuffer);
m_pbIoSendBuffer = NULL;
}
m_pbIoSendBuffer = new BYTE[m_sbIoSendBuffer];
return m_pbIoSendBuffer ? true : false;
}
2. Creating the worker thread, bind and listen are just the usual. Next is the accept function, which is called when the FD_ACCEPT event occurs:
int myclass::StreamAccept()
{
int WSAStatus = 0;
DWORD dwFlags = 0;
IOContext * pIoContextEx = new IOContext();
try
{
::RtlSecureZeroMemory(pIoContextEx, sizeof(IOContext));
// associate our listen socket with IO port
if(m_bCorePortUpdated == false)
{
this->m_pCorePort->Update(m_pCoreSocket->Socket(), (ULONG_PTR)pIoContextEx);
m_bCorePortUpdated = true;
}
// this will return accepted socket
pIoContextEx->m_pSocket = m_pCoreSocket->WSAAccept(NULL, NULL);
// associate new socket with completion port
this->UpdateCorePort(pIoContextEx->m_pSocket, pIoContextEx);
// se
-
Before I look any further than the I/O buffer code... It looks like you have a mix of ::free() and "new" calls for allocation operations. Those can't be mixed. In C++ I would just use new/delete and forget the realloc stuff. You're free to use the C library functions if you prefer, but then you can't mix in new/delete!
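One way to make the mismatch impossible, sketched here under assumptions, is to let a single container own the buffer. `IoBuffer` is a hypothetical class, not part of the posted code; it stands in for the raw pointer plus its two counters. Note that a buffer handed to an overlapped receive has to stay where it is until that operation completes, so growing should only happen between operations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical replacement for the raw recv/send buffer members shown above.
// std::vector allocates and frees with one matching mechanism, so there is no
// way to pair free() with new[] by accident.
class IoBuffer {
public:
    // Enlarge the buffer by a fixed step, keeping bytes already stored.
    // Must not be called while an I/O operation is pending on this buffer,
    // because resizing may move the storage.
    void grow(std::size_t step = 2048) { bytes_.resize(bytes_.size() + step); }

    // Release the storage and reset the counters.
    void flush() {
        std::vector<unsigned char>().swap(bytes_);
        used_ = 0;
    }

    // Replace the contents with a zeroed buffer of exactly `size` bytes.
    void alloc(std::size_t size) {
        bytes_.assign(size, 0);
        used_ = 0;
    }

    // Where the next I/O operation should write, and how much room it has.
    unsigned char* write_ptr() { return bytes_.data() + used_; }
    std::size_t free_space() const { return bytes_.size() - used_; }

    // Record that `n` more bytes of the buffer now hold valid data.
    void commit(std::size_t n) {
        assert(n <= free_space());
        used_ += n;
    }

    std::size_t size() const { return bytes_.size(); }
    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> bytes_;  // owns the memory
    std::size_t used_ = 0;              // bytes filled so far
};
```

Staying with malloc/realloc/free throughout, or with new[]/delete[] throughout, would work too; the point is only that each allocation is released by the function that matches it.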
Mark Salsbery Microsoft MVP - Visual C++ :java:
-
That basically doesn't matter, you can mix it. What you cannot mix is HeapAlloc and free, or malloc and HeapFree, for example. It seems to me that it's kind of not possible to have memory allocations with IOCP. No matter what I do, the app always crashes randomly, and there is always heap corruption. With a static buffer, 1000 connections run without a problem; with dynamic memory allocation, even one connection crashes.
011011010110000101100011011010000110100101101110 0110010101110011
-
csrss wrote:
That basically doesn't matter, you can mix it.
What??? "Basically", it does matter. In practice, however, it may not matter... at least maybe currently. It's a horrible programming practice to mix memory allocation function families. There is absolutely NO guarantee that any given library implementation versions will remain compatible. But if you like to live dangerously...:)
csrss wrote:
It seems to me that it's kind of not possible to have memory allocations with IOCP
IOCP doesn't know anything about memory allocations or your buffers or anything. You are responsible for that, and all standard multithreading rules apply. Regardless of all that, it should be relatively easy to debug. Your heap is getting trashed. When it crashes while running in the debugger, you should be able to go to any thread and check the call stacks to see where it's failing.
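As a rough illustration of those rules, the sketch below models a completion port with a mutex-protected queue in standard C++ (no Winsock). `CompletionQueue`, `OpContext`, `SharedStats`, `worker` and `run_demo` are hypothetical names. Each operation owns its own context, and the only data shared between workers is updated under a lock.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical per-operation context: one instance per outstanding I/O,
// never shared between sockets or between operations.
struct OpContext {
    int socket_id;
    std::size_t bytes;
};

// Stand-in for a completion port: a mutex-protected queue of finished ops.
class CompletionQueue {
public:
    void post(std::unique_ptr<OpContext> op) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(op));
        }
        cv_.notify_one();
    }

    // Blocks until an entry is available. A null entry tells a worker to exit.
    std::unique_ptr<OpContext> wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        std::unique_ptr<OpContext> op = std::move(q_.front());
        q_.pop();
        return op;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::unique_ptr<OpContext>> q_;
};

// Data touched by every worker, so it carries its own mutex.
struct SharedStats {
    std::mutex m;
    std::size_t total_bytes = 0;
    std::size_t ops = 0;
};

void worker(CompletionQueue& cq, SharedStats& stats) {
    for (;;) {
        std::unique_ptr<OpContext> op = cq.wait();
        if (!op) return;  // exit marker
        std::lock_guard<std::mutex> lock(stats.m);
        stats.total_bytes += op->bytes;
        ++stats.ops;
    }  // op is destroyed here, by the only thread that owns it
}

// Runs `threads` workers over `ops` simulated 10-byte completions and
// returns the total number of bytes the workers accounted for.
std::size_t run_demo(int threads, int ops) {
    CompletionQueue cq;
    SharedStats stats;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i)
        pool.emplace_back(worker, std::ref(cq), std::ref(stats));
    for (int i = 0; i < ops; ++i)
        cq.post(std::unique_ptr<OpContext>(new OpContext{i % 10, 10}));
    for (int i = 0; i < threads; ++i)
        cq.post(nullptr);  // one exit marker per worker
    for (std::thread& t : pool) t.join();
    assert(stats.ops == static_cast<std::size_t>(ops));
    return stats.total_bytes;
}
```

Calling `run_demo(10, 1000)` pushes 1000 simulated completions through 10 workers and returns 10000. Removing the lock around `SharedStats` would make the counters unreliable, and letting two operations share one `OpContext` would reproduce the kind of random crash described in this thread.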
Mark Salsbery Microsoft MVP - Visual C++ :java: