Captury Markerless Mocap Setup and Plugin Integration Issues

Markerless Mocap Setup and Plugin Integration

CapturyLiveLink_5.2
API Docs
Official Tutorial

Captury Testing and Usage

Open CapturyReplay

image-20251212111952293

Import lanqiu Mocap Data

The data storage path must not contain any Chinese characters, otherwise the import will fail.

image-20251212111946140

Open the ShotInfo and retarget views.

image-20251212111936943

image-20251212111930610

Load Target

image-20251212111921791

Select Skeleton

image-20251212111913119

Set Data Source

image-20251212111904207

unknown and unknown-2 correspond to the two characters respectively.

image-20251212111717722

Blocking Socket Issue in the Original Captury Plugin

Problem Description

During plugin integration (the plugin was already connected and data was flowing through), I ran into an issue:

Users reported that switching away from the Captury signal source to another source caused a noticeable freeze.

The freeze consistently lasted around 20 seconds.

image-20251205150209762 image-20251205150637519

This problem only occurred when the Captury signal source was not connected. If the signal was already connected, switching sources caused no freeze at all.

Root Cause Analysis

Here’s the logic behind that operation: when the user switches to another signal source, I destroy the existing CapturyLiveLink to keep the LiveLink signal pool clean.

The “refresh signal” button also triggers this issue, because under the hood, refreshing means deleting the old CapturyLiveLink and then creating a new connection.

To prevent data race conditions at the lower level, I enforce a strict rule: only one CapturyLiveLink source is allowed at any time.

So whenever the user switches to another source, I always destroy the old one. The problem occurs during that destruction — specifically, destroying an old source that was never successfully connected triggers this ~20-second freeze.
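The single-source rule can be sketched in plain C++ as a slot that owns at most one source. Source and SingleSourceSlot are hypothetical stand-ins, not the plugin's real classes; in the real plugin, the destructor is where the disconnect (and the wait for the receive thread) would happen.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a CapturyLiveLink source.
struct Source {
    explicit Source(std::string address) : ip(std::move(address)) { ++liveCount; }
    ~Source() { --liveCount; }  // real teardown would disconnect here
    Source(const Source&) = delete;
    Source& operator=(const Source&) = delete;

    std::string ip;
    static int liveCount;       // number of sources currently alive
};
int Source::liveCount = 0;

// Owns at most one Source at any time.
class SingleSourceSlot {
public:
    void switchTo(const std::string& ip) {
        current.reset();                         // destroy the old source first
        current = std::make_unique<Source>(ip);  // only then create the new one
    }
    void clear() { current.reset(); }

private:
    std::unique_ptr<Source> current;
};
```

The explicit reset() before make_unique matters: a plain assignment would construct the new source while the old one is still alive, so two sources would briefly coexist.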

But why does it only freeze when the source is “not connected” and not under normal circumstances? At this point I wasn’t sure, so my initial hunch was that the issue was somewhere in the underlying disconnect logic.

Using Unreal Insights, I confirmed that after Captury_disconnect was called, the entire system stalled for over thirty seconds.

image-20251205153031066

So it was confirmed: the disconnect operation was causing the freeze.

Then, while the game was hanging (during those tens of seconds), I paused the IDE to see where the game thread was stuck. It was sitting at:

WaitForSingleObject(receiveThread, INFINITE);

This line was something I had added a few days earlier. I was concerned about multi-threading issues if two CapturyLiveLink sources were running simultaneously (lower-level contention could cause crashes), so I wrote it to wait for the background worker thread to fully exit before considering the shutdown complete.

That’s when I realized: the problem must be inside the receiveThread loop.

So I started adding logs throughout the call chain to track down exactly which interface was causing the hang. I also added more exit mechanisms (stopReceiving, bExternalShutdownRequested) to see if I could eliminate the freeze.

#ifdef WIN32
DWORD RemoteCaptury::receiveLoop()
#else
void* RemoteCaptury::receiveLoop()
#endif
{
    bool handshaking = !handshakeFinished;
    // UE_LOG(LogCapturyRemote , Warning, TEXT("01 starting receive loop %d %s"), testcount++, *(FDateTime::Now().ToString()));
    //log("starting receive loop\n");
    while (!stopReceiving
        && !bExternalShutdownRequested
        && (!handshaking || !handshakeFinished)) {

        // UE_LOG(LogCapturyRemote , Warning, TEXT("02 receive loop iteration %d %s"), testcount++, *(FDateTime::Now().ToString()));

        if (!receive(sock)) {
            if (sock == -1
                && !stopReceiving
                && !bExternalShutdownRequested)
            {
                deleteActors();
                cameras.clear();
                numCameras = -1;

                // UE_LOG(LogCapturyRemote , Warning, TEXT("03 socket closed, reconnecting %d %s"), testcount++, *(FDateTime::Now().ToString()));
                if (isStreamThreadRunning) {
                    stopStreamThread = 1;

#ifdef WIN32
                    WaitForSingleObject(streamThread, 1000);
#else
                    void* retVal;
                    pthread_join(streamThread, &retVal);
#endif
                }
                // UE_LOG(LogCapturyRemote , Warning, TEXT("03 a socket closed, reconnecting %d %s"), testcount++, *(FDateTime::Now().ToString()));

                // UE_LOG(LogCapturyRemote , Warning, TEXT("04-1 %d %s"), testcount++, *(FDateTime::Now().ToString()));
                while (!stopReceiving) {
                    // stop reconnecting
                    sock = openTcpSocket();
                    //UE_LOG(LogCapturyRemote , Warning, TEXT("04-2 %d %s"), testcount++, *(FDateTime::Now().ToString()));
                    if (sock != -1)
                        break;

                    sleepMicroSeconds(1000);
                }
                // UE_LOG(LogCapturyRemote , Warning, TEXT("04-3 %d %s"), testcount++, *(FDateTime::Now().ToString()));

                if (streamWhat != CAPTURY_STREAM_NOTHING)
                    Captury_startStreamingImagesAndAngles(this, streamWhat, streamCamera, (int)streamAngles.size(), streamAngles.data());
                //UE_LOG(LogCapturyRemote , Warning, TEXT("05 %d %s"), testcount++, *(FDateTime::Now().ToString()));

                handshaking = false; // this is a lie but makes it go into the normal loop
            }
        }
    }
    //log("stopping receive loop\n");
    // UE_LOG(LogCapturyRemote , Warning, TEXT("06 %d %s"), testcount++, *(FDateTime::Now().ToString()));
    return 0;
}
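One detail worth noting in the loop above: the inner reconnect loop tests only stopReceiving, so a shutdown requested through bExternalShutdownRequested alone would not break out of it. Below is a minimal sketch of a reconnect loop that checks both flags on every retry. reconnectUntilStopped and the tryConnect callback are hypothetical names (tryConnect stands in for openTcpSocket()), and std::atomic<bool> is an assumption about the flag types, which the excerpt does not show.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// Retries tryConnect until it succeeds or either stop flag is set.
// tryConnect is assumed to return a handle >= 0 on success and -1 on failure.
int reconnectUntilStopped(const std::function<int()>& tryConnect,
                          const std::atomic<bool>& stopReceiving,
                          const std::atomic<bool>& externalShutdownRequested,
                          std::chrono::milliseconds retryDelay)
{
    while (!stopReceiving.load() && !externalShutdownRequested.load()) {
        int sock = tryConnect();
        if (sock != -1)
            return sock;                          // connected
        std::this_thread::sleep_for(retryDelay);  // brief pause before retrying
    }
    return -1;  // shutdown was requested before a connection succeeded
}
```

Checking the flags between attempts only helps if each attempt returns promptly; a single attempt that blocks for tens of seconds still delays the exit by that long.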

The logs revealed the culprit:

Display LogCaptury CapturyLiveLink: request shutdown
Warning LogCapturyRemote Captury_stopStreaming() called
Warning LogCapturyRemote Captury_disconnect() called
Warning LogCapturyRemote RemoteCaptury::disconnect() called

Warning LogCapturyRemote 01 starting receive loop 267 2025.12.04-18.57.03
Warning LogCapturyRemote 02 receive loop iteration 268 2025.12.04-18.57.03
Warning LogCapturyRemote 03 socket closed, reconnecting 269 2025.12.04-18.57.03
Warning LogCapturyRemote 03 a socket closed, reconnecting 270 2025.12.04-18.57.03
Warning LogCapturyRemote Captury_disconnect() called
Warning LogCapturyRemote 04-1 271 2025.12.04-18.57.03
Warning LogCapturyRemote RemoteCaptury::disconnect() called
Warning LogCapturyRemote openTcpSocket() 01 2025.12.04-18.57.03
Warning LogCapturyRemote openTcpSocket() 02 2025.12.04-18.57.03
Warning LogCapturyRemote openTcpSocket() 03 2025.12.04-18.57.03
Display LogCaptury CapturyLiveLink: request shutdown
Warning LogCapturyRemote Captury_stopStreaming() called
Warning LogCapturyRemote Captury_disconnect() called
Warning LogCapturyRemote RemoteCaptury::disconnect() called
Warning LogCapturyRemote RemoteCaptury::disconnect() **: waiting for receiveThread 2025.12.04-18.57.05
Warning LogCapturyRemote 04-2 272 2025.12.04-18.57.24
Warning LogCapturyRemote 04-3 273 2025.12.04-18.57.24
Warning LogCapturyRemote 06 274 2025.12.04-18.57.24
Warning LogCapturyRemote RemoteCaptury::disconnect() **: receiveThread finished 2025.12.04-18.57.24

Display LogCaptury CapturyLiveLink: request shutdown
Warning LogCapturyRemote Captury_stopStreaming() called
Warning LogCapturyRemote Captury_disconnect() called
Warning LogCapturyRemote RemoteCaptury::disconnect() called

The openTcpSocket() + number entries and RemoteCaptury::disconnect() lines are from the game thread.

When I issued the shutdown command, the remote thread was stuck between 04-1 and 04-2, and it was still alive. So everyone was waiting for the remote thread.
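The size of the gap can be read straight off the timestamps: 04-1 is logged at 18.57.03 and 04-2 at 18.57.24, and the game thread starts waiting for receiveThread at 18.57.05. A small sketch of that arithmetic follows; secondsOfDay and secondsBetween are hypothetical helpers, and the code assumes both stamps fall on the same calendar day.

```cpp
#include <cstdio>

// Parses a stamp like "2025.12.04-18.57.03" (the format seen in the log)
// and returns the number of seconds since midnight, or -1 if it is malformed.
int secondsOfDay(const char* stamp)
{
    int year, month, day, hour, minute, second;
    if (std::sscanf(stamp, "%d.%d.%d-%d.%d.%d",
                    &year, &month, &day, &hour, &minute, &second) != 6)
        return -1;
    return hour * 3600 + minute * 60 + second;
}

// Difference in seconds between two stamps on the same day (assumption).
int secondsBetween(const char* from, const char* to)
{
    return secondsOfDay(to) - secondsOfDay(from);
}
```

With the stamps from the log, secondsBetween("2025.12.04-18.57.03", "2025.12.04-18.57.24") gives 21 for the time spent between 04-1 and 04-2, and secondsBetween("2025.12.04-18.57.05", "2025.12.04-18.57.24") gives 19 for the time disconnect() spent waiting.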

The function sandwiched between those two log entries is sock = openTcpSocket();.

What this function does: it creates a TCP socket and attempts to establish a connection with the remote address.

Here is the original code:

SOCKET RemoteCaptury::openTcpSocket()
{
    log("opening TCP socket\n");

    SOCKET sok = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sok == -1)
        return (SOCKET)-1;

    if (localAddress.sin_port != 0 && bind(sok, (sockaddr*) &localAddress, sizeof(localAddress)) != 0) {
        closesocket(sok);
        return (SOCKET)-1;
    }

    if (::connect(sok, (sockaddr*) &remoteAddress, sizeof(remoteAddress)) != 0) {
        closesocket(sok);
        return (SOCKET)-1;
    }

    // set read timeout
    setSocketTimeout(sok, 500);

#ifndef WIN32
    char buf[100];
    log("connected to %s:%d\n", inet_ntop(AF_INET, &remoteAddress.sin_addr, buf, 100), ntohs(remoteAddress.sin_port));
#endif

    return sok;
}

After adding logs, I confirmed the function was hanging at:

if (::connect(sok, (sockaddr*) &remoteAddress, sizeof(remoteAddress)) != 0)

This if check calls into the Windows socket API. At this point, the culprit was identified.

Deep Dive into the Underlying Logic

image-20251205123239125

This section explains two things clearly:

  1. Why a blocking connect hangs for tens of seconds when the connection fails;
  2. How this behavior relates to blocking/non-blocking I/O, buffers, and I/O multiplexing.

1. Why Blocking connect Can Stall for a Long Time

The original openTcpSocket uses a blocking socket with a blocking connect:

SOCKET sok = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
...
if (::connect(sok, (sockaddr*)&remoteAddress, sizeof(remoteAddress)) != 0) {
    closesocket(sok);
    return (SOCKET)-1;
}

In blocking mode:

  • connect enters the kernel and waits for the TCP three-way handshake to complete or fail;
  • If the remote server is not running / the IP is unreachable / packets are dropped, the system keeps sending SYN packets and waiting for a response according to its own retry policy;
  • During this time, the thread calling connect is suspended until the kernel decides the connection has failed and returns a timeout;
  • This timeout is typically on the order of tens of seconds (which matches the ~20s I actually observed).
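As a rough illustration of where "tens of seconds" comes from: TCP retransmits an unanswered SYN with exponential backoff, so if the stack waits an initial timeout and doubles it for each retransmission, the total wait is the sum of that series. The sketch below just does the arithmetic. connectGiveUpSeconds is a hypothetical helper, and the parameter values used afterwards are illustrative assumptions, not values read from the machine in question; the real ones depend on OS version and configuration.

```cpp
// Total time before a connect attempt is abandoned, assuming the stack waits
// initialTimeoutSeconds after the first SYN and doubles the wait after each
// of the given number of retransmissions.
int connectGiveUpSeconds(int initialTimeoutSeconds, int retransmissions)
{
    int total = 0;
    int wait = initialTimeoutSeconds;
    for (int attempt = 0; attempt <= retransmissions; ++attempt) {
        total += wait;  // wait for a reply to this SYN
        wait *= 2;      // exponential backoff for the next one
    }
    return total;
}
```

For example, connectGiveUpSeconds(3, 2) is 3 + 6 + 12 = 21, and connectGiveUpSeconds(1, 4) is 1 + 2 + 4 + 8 + 16 = 31. Both land in the "tens of seconds" range described above.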

Looking at receiveLoop in context:

  • receiveLoop keeps calling sock = openTcpSocket() in a reconnect loop;
  • When the Captury server isn’t running, every call to openTcpSocket() blocks inside connect until the system timeout fires;
  • My disconnect() is sitting at WaitForSingleObject(receiveThread, INFINITE), waiting for this thread to exit naturally;
  • So the game thread gets indirectly dragged down for tens of seconds by a single blocking connect.

In other words, this isn’t a logic deadlock — it’s purely the kernel’s blocking connection wait doing its job.
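The effect can be reproduced without any networking. In the sketch below, sleep_for is a stand-in for the blocking connect and join() for WaitForSingleObject(receiveThread, INFINITE); measureJoinMillis is a hypothetical helper. The stop flag is set immediately, yet join() cannot return until the blocking call has finished.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Returns how long, in milliseconds, the caller spent from starting the worker
// until join() returned, when the worker is inside a call that blocks for blockFor.
long long measureJoinMillis(std::chrono::milliseconds blockFor)
{
    using Clock = std::chrono::steady_clock;
    std::atomic<bool> stopRequested{false};

    const auto start = Clock::now();
    std::thread worker([&] {
        do {
            // Stand-in for a blocking connect(): the flag cannot be checked in here.
            std::this_thread::sleep_for(blockFor);
        } while (!stopRequested.load());
    });

    stopRequested = true;  // shutdown is requested right away...
    worker.join();         // ...but the caller still waits for the blocking call

    return std::chrono::duration_cast<std::chrono::milliseconds>(Clock::now() - start).count();
}
```

Calling measureJoinMillis(std::chrono::milliseconds(150)) returns roughly 150 or more, no matter how early the flag was set. Scale the blocking call up to a connect timeout and you get the freeze described above.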

2. Blocking I/O, Buffers, and I/O Multiplexing (Extended Context)

The blocking issue doesn’t only happen with connect — it also applies to read / write (or recv / send in WinSock):

  • In blocking mode:
    • recv waits when there’s no data in the receive buffer (indefinitely, unless a receive timeout such as SO_RCVTIMEO has been set);
    • send waits when the send buffer is full (again, unless a send timeout has been set);
    • The thread hangs in these system calls until data arrives or space opens up.

The diagrams below illustrate what’s happening when an application thread accesses a kernel buffer:

image-20251205142827661
  • On read: if there’s temporarily no data in the buffer, blocking read/recv stalls the thread;
  • On write: if the buffer is full, blocking write/send stalls and waits for the kernel to flush old data.
image-20251205144248299 image-20251205144319439
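The original code calls setSocketTimeout(sok, 500) to set a read timeout, but its implementation is not shown. One common way to bound a blocking recv is the SO_RCVTIMEO socket option. The sketch below is an assumption about what such a helper could look like, not the plugin's actual code; setReceiveTimeout and SocketHandle are hypothetical names. WinSock takes the timeout as a DWORD in milliseconds, while POSIX takes a timeval.

```cpp
#ifdef _WIN32
#include <winsock2.h>
using SocketHandle = SOCKET;
#else
#include <sys/socket.h>
#include <sys/time.h>
using SocketHandle = int;
#endif

// Makes blocking recv() calls on sock give up after the given number of
// milliseconds instead of waiting indefinitely. Returns true on success.
bool setReceiveTimeout(SocketHandle sock, int millis)
{
#ifdef _WIN32
    DWORD timeout = static_cast<DWORD>(millis);
    return setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO,
                      reinterpret_cast<const char*>(&timeout), sizeof(timeout)) == 0;
#else
    timeval tv;
    tv.tv_sec = millis / 1000;
    tv.tv_usec = (millis % 1000) * 1000;
    return setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0;
#endif
}
```

A receive timeout bounds recv only. It has no effect on how long a blocking connect can take, which is why the read timeout in the original code did not prevent the freeze.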

Under the “blocking model” (the original code), if the network thread is directly blocked on these calls:

  • It can’t respond to exit signals in time (stopReceiving / bExternalShutdownRequested, etc.);

  • Logic like disconnect() that needs to “wait for the thread to exit cleanly” gets dragged along with it.

The diagram below illustrates the multi-threaded blocking model: when there are many concurrent connections, one thread is spawned per connection, and a thread is released only when its connection drops. This is a very old-school approach, because threads are a precious resource and a single machine only has so many of them. (In my project specifically, since I only allow one Captury thread at a time and the game thread has to wait for it, the entire game freezes. I had other reasons for this design decision, though.)

    image-20251205191914919

I/O multiplexing (select / poll / epoll / WSAPoll) is essentially doing one thing:

  • Instead of hanging the thread on read/write/connect, it:
    • Waits on a set of file descriptors for “readable / writable / error” events;
    • Only when the kernel reports “this fd is ready” does it actually call read/recv or write/send;
    • This wait supports custom timeouts (e.g. 10ms, 100ms, 1s), and you can check exit conditions every time it wakes up.
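The pattern described in the list above can be sketched as a loop that waits in short slices with select and re-checks a stop flag each time it wakes up. waitReadable, WaitResult and SocketHandle are hypothetical names, and the std::atomic<bool> flag is an assumption; on Windows the caller is assumed to have already called WSAStartup.

```cpp
#include <atomic>
#ifdef _WIN32
#include <winsock2.h>
using SocketHandle = SOCKET;
#else
#include <sys/select.h>
#include <sys/socket.h>
using SocketHandle = int;
#endif

enum class WaitResult { Readable, Stopped, TimedOut, Error };

// Waits until sock is readable, checking the stop flag before every slice.
// The total wait is bounded by sliceMillis * maxSlices.
WaitResult waitReadable(SocketHandle sock, const std::atomic<bool>& stop,
                        int sliceMillis, int maxSlices)
{
    for (int i = 0; i < maxSlices; ++i) {
        if (stop.load())
            return WaitResult::Stopped;

        fd_set readSet;
        FD_ZERO(&readSet);
        FD_SET(sock, &readSet);

        timeval tv;  // select may modify it, so rebuild it for every slice
        tv.tv_sec = sliceMillis / 1000;
        tv.tv_usec = (sliceMillis % 1000) * 1000;

        int ready = select(static_cast<int>(sock) + 1, &readSet, nullptr, nullptr, &tv);
        if (ready < 0)
            return WaitResult::Error;
        if (ready > 0)
            return WaitResult::Readable;  // recv() will not block now
    }
    return stop.load() ? WaitResult::Stopped : WaitResult::TimedOut;
}
```

With a 10 ms slice, a shutdown request is noticed within about 10 ms, instead of whenever the kernel decides to return from a blocking call.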

3. The Role of Non-Blocking + ioctlsocket Here

ioctlsocket(sok, FIONBIO, &NonBlocking) switches the socket from “blocking mode” to “non-blocking mode”.

In non-blocking mode:

  • connect no longer hangs the thread. Instead:
    • It returns immediately with an error code meaning “connection is in progress” (WSAEWOULDBLOCK on WinSock; EINPROGRESS on POSIX sockets);
    • You then use select / WSAPoll to poll whether the connection succeeded or failed, and you control the maximum wait time yourself;
  • read/recv and write/send also return immediately with WSAEWOULDBLOCK (EWOULDBLOCK / EAGAIN on POSIX) when there’s no data or the buffer is full, letting your own logic decide what to do next instead of freezing the thread.

To summarize this section:

  • Root cause: A blocking connect to an unreachable target triggers a system-level long timeout, suspending the thread for tens of seconds in the kernel;
  • Amplifier: My receiveLoop frequently calls openTcpSocket() in the reconnect path, and disconnect() must wait for the thread to exit;
  • Why switching to non-blocking: Use ioctlsocket(FIONBIO) + select to control the wait duration and exit timing yourself — turning “an uncontrollable 20-second timeout” into “a controllable sub-1-second failure”, allowing disconnect() to return quickly.

image-20251205123239125

Non-blocking is implemented by “modifying the socket mode” — the code sets the socket to non-blocking after socket() but before connect(), so connect() executes in non-blocking mode. Once the connection completes, the socket is restored to blocking mode.

  • So my modified code isn’t fully non-blocking — it’s a hybrid approach.

Solution

The issue here is that we’re calling the low-level WinSock connect.
It’s a Windows system call — there’s no way to inject conditional checks inside it.

Our options are:

  • Switch to a different calling strategy (non-blocking + manually controlling timeout via select / WSAPoll);
  • Or add logic around the connect call, like checking stopReceiving before calling it, or forcefully calling closesocket from another thread after the call.

The second option isn’t great — it requires maintaining an extra thread just to kill this thread. That sounds like a mess. So I went with the first approach: rework openTcpSocket to use a non-blocking socket.

SOCKET RemoteCaptury::openTcpSocket()
{
    // UE_LOG(LogCapturyRemote , Warning, TEXT("openTcpSocket() 01 %s"), *(FDateTime::Now().ToString()));
    // Create a socket. If creation fails (returns INVALID_SOCKET), return immediately with an error state
    SOCKET sok = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sok == INVALID_SOCKET)
        return INVALID_SOCKET;

    // UE_LOG(LogCapturyRemote , Warning, TEXT("openTcpSocket() 02 %s"), *(FDateTime::Now().ToString()));
    // If localAddress port is non-zero, try binding to the specified local address. If binding fails, close the socket and return error
    if (localAddress.sin_port != 0 && bind(sok, (sockaddr*)&localAddress, sizeof(localAddress)) != 0) {
        closesocket(sok);
        return INVALID_SOCKET;
    }

    // UE_LOG(LogCapturyRemote , Warning, TEXT("openTcpSocket() 03 %s"), *(FDateTime::Now().ToString()));
    // 1. Switch to non-blocking mode first
    u_long NonBlocking = 1;
    if (ioctlsocket(sok, FIONBIO, &NonBlocking) == SOCKET_ERROR) {
        closesocket(sok);
        return INVALID_SOCKET;
    }

    // 2. Initiate connect
    int ConnectResult = ::connect(sok, (sockaddr*)&remoteAddress, sizeof(remoteAddress));
    if (ConnectResult == SOCKET_ERROR) {
        int Err = WSAGetLastError();
        if (Err != WSAEWOULDBLOCK && Err != WSAEINPROGRESS && Err != WSAEALREADY) {
            // Real error: fail immediately
            closesocket(sok);
            return INVALID_SOCKET;
        }

        // 3. Connection is in progress: use select to wait for a short period
        const int MaxWaitMillis = 1000; // 1-second limit
        fd_set WriteSet;
        FD_ZERO(&WriteSet);
        FD_SET(sok, &WriteSet);

        struct timeval Tv;
        Tv.tv_sec = MaxWaitMillis / 1000;
        Tv.tv_usec = (MaxWaitMillis % 1000) * 1000;

        int Sel = select((int)(sok + 1), nullptr, &WriteSet, nullptr, &Tv);
        if (Sel <= 0) {
            // Timeout or select error: treat as failure
            closesocket(sok);
            return INVALID_SOCKET;
        }

        // 4. select says the socket is writable: double-check with SO_ERROR
        int so_error = 0;
        int optLen = sizeof(so_error);
        if (getsockopt(sok, SOL_SOCKET, SO_ERROR, (char*)&so_error, &optLen) == SOCKET_ERROR || so_error != 0) {
            closesocket(sok);
            return INVALID_SOCKET;
        }
    }

    // If we reach here, connect has completed: restore blocking mode
    u_long Blocking = 0;
    ioctlsocket(sok, FIONBIO, &Blocking);

    // UE_LOG(LogCapturyRemote , Warning, TEXT("openTcpSocket() 04 %s"), *(FDateTime::Now().ToString()));

    setSocketTimeout(sok, 500);
    // UE_LOG(LogCapturyRemote , Warning, TEXT("openTcpSocket() 05 %s"), *(FDateTime::Now().ToString()));

    return sok;
}

This function creates a client socket, performs the connect in non-blocking mode, then restores blocking mode once the connection is established.
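The version above uses WinSock-only calls (ioctlsocket, WSAGetLastError), whereas the original function also had a non-Windows branch. For comparison, here is a minimal sketch of the same hybrid technique with POSIX sockets. It is my own illustration under stated assumptions, not code from the plugin: connectWithTimeout is a hypothetical free function, and it takes the remote address and the wait limit as parameters instead of reading class members.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Returns a connected socket in blocking mode, or -1 on failure or timeout.
int connectWithTimeout(const sockaddr_in& remote, int maxWaitMillis)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0)
        return -1;
    if (fd >= FD_SETSIZE) {  // select() cannot watch descriptors this large
        close(fd);
        return -1;
    }

    // 1. Switch to non-blocking mode
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    // 2. Initiate connect; EINPROGRESS means the handshake is under way
    if (connect(fd, reinterpret_cast<const sockaddr*>(&remote), sizeof(remote)) != 0) {
        if (errno != EINPROGRESS) {
            close(fd);
            return -1;
        }

        // 3. Wait for writability, but no longer than maxWaitMillis
        fd_set writeSet;
        FD_ZERO(&writeSet);
        FD_SET(fd, &writeSet);
        timeval tv;
        tv.tv_sec = maxWaitMillis / 1000;
        tv.tv_usec = (maxWaitMillis % 1000) * 1000;
        if (select(fd + 1, nullptr, &writeSet, nullptr, &tv) <= 0) {
            close(fd);  // timeout or select error
            return -1;
        }

        // 4. Writable does not imply success: read the result from SO_ERROR
        int soError = 0;
        socklen_t len = sizeof(soError);
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &soError, &len) != 0 || soError != 0) {
            close(fd);
            return -1;
        }
    }

    // 5. Restore the original (blocking) flags
    if (fcntl(fd, F_SETFL, flags) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

One platform difference to keep in mind: WinSock reports a failed non-blocking connect through select's exceptfds set rather than writefds, so a WinSock version that wants to fail fast on a refused connection (instead of waiting out the full select timeout) would pass an except set as well.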

References

  1. Multi-threaded Network Details: Socket Blocking vs Non-Blocking

Think of it as two categories: a Message Bus Provider discoverable by the Finder, and a custom/dedicated Source that connects directly via IP and port.

Both ultimately feed data into the LiveLink Client, but they differ significantly in how they’re discovered, connected, and operated at the network level.

Comparison: Finder + Message Bus vs. Direct IP Connection

Discovery and Connection
  • Finder + Message Bus: Uses UE’s UDP Messaging / Message Bus for broadcast/multicast Ping/Pong. ULiveLinkMessageBusFinder::GetAvailableProviders() can only list “Providers that respond with FLiveLinkPongMessage”. Advantage: auto-discovery, no manual IP entry.
  • Direct IP Connection: The Source implements its own network protocol (TCP/UDP/custom port); typically you manually enter an IP (and possibly port) in the UI or Blueprint. Captury in my project falls into this category: ConnectCapturyLiveLinkSource(const FString& IpAddress, bool bUseTCP, ...) takes an IP as its entry point and supports transport options like TCP/compression/tags. Advantage: doesn’t depend on the Message Bus ecosystem; protocol is fully under your control.

Network Dependencies and Reachability
  • Finder + Message Bus: Depends on UDP Messaging, correct NIC binding, multicast/broadcast being allowed, and Windows Firewall rules. Common issues: wrong NIC selected, cross-subnet/VLAN/AP isolation causing discovery to fail.
  • Direct IP Connection: Depends only on the target IP/port being reachable (more like traditional client/server). Easier to route across subnets/NAT (as long as network policy allows), but you need to handle port exposure and firewall/NAT rules.

Data Source Form and Protocol Semantics
  • Finder + Message Bus: The “data source” is typically a Provider service that exposes “discoverable Provider + a set of Subjects”. Communication semantics include “discovery/handshake/versioning” (e.g. Pong carries version, machine name, etc.).
  • Direct IP Connection: The “data source” is just a data stream from a device/application on a given port: connect and you receive data. Semantics are entirely defined by the plugin (e.g. Captury’s bStreamCompressed, bStreamARTags as capability flags).

Lifecycle and Stability
  • Finder + Message Bus: Provider online/offline changes surface through the discovery mechanism, but “discovered” doesn’t mean “stably connected”; it is still subject to Message Bus environment fluctuations.
  • Direct IP Connection: Connection state is more straightforward (connected/disconnected/reconnecting); reconnect strategy and timeouts are generally handled by the plugin itself.

Blueprint/Code Usage Differences (in my project)
  • Finder + Message Bus: Typical flow: Finder retrieves Provider list (== discoverable) → select Provider → create corresponding Source.
  • Direct IP Connection: Typical flow: directly call ConnectCapturyLiveLinkSource(IpAddress, bUseTCP, ...) to create the Source.
The meta=(Latent, ... Duration="0.2") tag indicates this is a deferred/awaited Blueprint call; Duration acts as a wait/timeout window (exact behavior is implementation-defined), but it is not “LAN discovery”.