Packet encryption in multiplayer games – part 2

The following is a continuation of my recent article Packet encryption in multiplayer games – part 1.

In this article I’m going to present a sample implementation of an encryption helper that can be used to establish a connection using the Diffie-Hellman algorithm and then send and receive messages encrypted with the Blowfish algorithm. All of it is based on the OpenSSL implementations of these algorithms.

Please note that this code is written for educational purposes only and is probably far from the quality needed for exchanging data that must be really secure. It is also worth remembering that the algorithms used here, i.e. Blowfish and Diffie-Hellman, are generally considered relatively easy to break (nowhere near RSA). Using them is more of a “no trespassing” sign to hackers than actual protection.

Below is how you’re supposed to use the code.

Step 1
First off, let’s declare some buffers and initialize the OpenSSL library.

Note: We’re going to test sending messages of both 8-byte-aligned and non-8-byte-aligned sizes below. This is particularly interesting because the Blowfish algorithm operates on 8-byte blocks.
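Why the distinction matters: since Blowfish encrypts in 8-byte blocks, a message whose size isn’t a multiple of 8 has to be padded up before encryption. The padded size can be computed like this (a hypothetical helper just to illustrate the point, not part of the library):

```cpp
#include <cassert>

// Rounds a message size up to the next multiple of the 8-byte
// Blowfish block size.
static int RoundUpToBlockSize(int size)
{
    const int BLOCK_SIZE = 8;
    return (size + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;
}
```

So the 248-byte message below needs no padding, while the 247-byte one gets padded to 248 bytes before encryption.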

#include "EncryptionHelper.h"

void EncryptionHelper_UnitTest()
{

const int UNIT_TEST_BUFFER_SIZE = 256;

// Make it 8-byte aligned

const int UNIT_TEST_MSG0_SIZE = UNIT_TEST_BUFFER_SIZE - 8;

// Make it non 8-byte aligned

const int UNIT_TEST_MSG1_SIZE = UNIT_TEST_BUFFER_SIZE - 9;

int length;
unsigned char buffer[UNIT_TEST_BUFFER_SIZE];

// Initialize OpenSSL

EncryptionHelper_StartupOpenSSL();

Step 2
Next, initialize the encryption helper for Alice and Bob (why Alice and Bob?). If used in a multiplayer game, Alice would only be created on one machine and Bob only on the other. It’s up to the user to decide who is Alice and who is Bob, but for this library to work it is necessary that one of them is Alice and the other one is Bob.

Note: the ‘check’ macro is similar to an assertion macro; the difference is that it evaluates its expression regardless of whether assertions are enabled or not.
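The check macro itself isn’t shown in the article; a minimal version might look like this (my assumption about its implementation, not the actual library code):

```cpp
#include <cassert>

// Unlike plain assert, whose expression disappears entirely under
// NDEBUG, 'check' always evaluates its expression - it only skips
// the verification part when assertions are disabled.
#ifdef NDEBUG
    #define check(expr) ((void)(expr))
#else
    #define check(expr) assert(expr)
#endif
```

This matters because the calls below have side effects (they actually send, receive and encrypt data), so they must run in Release builds too.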

// Initialize Alice and Bob

EncryptionHelper* alice = EncryptionHelper_Create(true);
EncryptionHelper* bob = EncryptionHelper_Create(false);
assert(alice && bob);
check(EncryptionHelper_IsAlice(alice));
check(!EncryptionHelper_IsAlice(bob));

Step 3
We’re now going to generate “exchange data” on Alice’s side. This “exchange data” contains Alice’s public key as well as the two parameters, P and G, needed by the Diffie-Hellman algorithm. Once the message has been successfully sent (from Alice to Bob), we can mark it as sent.

// Alice generates exchange data and sends it to Bob

check(EncryptionHelper_GetExchangeData(alice, buffer, UNIT_TEST_BUFFER_SIZE, &length));

// [ sending… ]

EncryptionHelper_MarkExchangeDataSent(alice);
check(EncryptionHelper_IsExchangeDataSent(alice));

Step 4
Once Bob receives the “exchange data” from Alice and verifies its correctness, he can generate his own “exchange data” (containing Bob’s public key) and send it to Alice. Once sent, Bob marks the message as sent.

// Bob receives exchange data and sends another exchange data to Alice

check(EncryptionHelper_ReceiveExchangeData(bob, buffer, length));
check(EncryptionHelper_IsExchangeDataReceived(bob));
check(EncryptionHelper_GetExchangeData(bob, buffer, UNIT_TEST_BUFFER_SIZE, &length));

// [ sending… ]

EncryptionHelper_MarkExchangeDataSent(bob);
check(EncryptionHelper_IsExchangeDataSent(bob));

Step 5
Once Alice receives the “exchange data” from Bob, the communication can start. We’re now ready to send and receive messages between Alice and Bob.

// Alice receives exchange data from Bob

check(EncryptionHelper_ReceiveExchangeData(alice, buffer, length));

// Authentication done! Communication begins…

Step 6
Below, Bob sends a test message to Alice; Alice receives it and verifies that the message is correct.

// Bob encrypts data for Alice; Alice decrypts the data

for (int i = 0; i < UNIT_TEST_MSG0_SIZE; i++)
	buffer[i] = i;
check(EncryptionHelper_Encrypt(bob, buffer, UNIT_TEST_BUFFER_SIZE, UNIT_TEST_MSG0_SIZE, &length));
check(EncryptionHelper_Decrypt(alice, buffer, length, &length));
for (int i = 0; i < length; i++)
	check(buffer[i] == i);

Step 7
And now communication going the other way – Alice sends a message to Bob.

// Alice encrypts data for Bob; Bob decrypts the data

for (int i = 0; i < UNIT_TEST_MSG1_SIZE; i++)
	buffer[i] = i;
check(EncryptionHelper_Encrypt(alice, buffer, UNIT_TEST_BUFFER_SIZE, UNIT_TEST_MSG1_SIZE, &length));
check(EncryptionHelper_Decrypt(bob, buffer, length, &length));
for (int i = 0; i < length; i++)
	check(buffer[i] == i);

Step 8
When done, deinitialize the encryption helpers for both Alice and Bob.

// Shut down Alice and Bob

EncryptionHelper_Destroy(alice);
EncryptionHelper_Destroy(bob);

}

Get the source code from here: EncryptionHelper.cpp and EncryptionHelper.h – and have fun!

Note: The code has only been tested on Windows. Obviously, you’ll need the OpenSSL library to compile it.

Posted in network programming | Leave a comment

Packet encryption in multiplayer games – part 1

Sending unencrypted data between peers in a multiplayer game can make your game vulnerable to many kinds of hacker attacks.

However, as surprising as it may sound, not all multiplayer games require data encryption to stay secure. In some cases only part of the transmitted data requires it, and sometimes encryption isn’t needed at all. Instead of encrypting, it’s often enough to verify that a packet received from a remote peer isn’t “hacked”. What that means is a very game-specific thing. For example, in first-person shooters like Quake or Unreal such packet verification might include checking whether the player’s velocity is within a reasonable range. If the player was attempting to move faster than the game allows, we could simply consider his or her game hacked.

It’s great if you’re able to validate data packets in this way and, if possible, it’s probably the best way of detecting whether your multiplayer game gets hacked. However, when full packet validation can’t easily be done for some reason, or when you don’t want packet content to be easily readable by anyone (e.g. it contains private text chat), your next option is to encrypt the data using one of the popular encryption algorithms.

There’s a large number of methods available, but they can roughly be split into two groups: symmetric-key and asymmetric-key methods. The first group uses one key to both encrypt and decrypt the data, while the second requires two keys – one (public) key for encryption and one (private) key for decryption. For more explanation see this article on Tech-FAQ.

Since asymmetric-key methods are more secure, ideally we’d just use the full-blown asymmetric RSA algorithm for all of our communication. It certainly is a very safe method given that:
(a) the keys are large enough (e.g. 2048+ bits)
(b) we are able to verify the public key of the remote peer.

While (a) might only pose a performance challenge, (b) is difficult to solve in general. The question to ask here is: how can I verify that the public key I have received from a remote peer hasn’t been tampered with halfway through? This problem is well known as the MITM (“man in the middle”) attack, and the way the modern world deals with it is via Public Key Infrastructure (PKI). The key elements of PKI are Certificate Authorities (CAs), which issue reliable certificates to others. The whole PKI is organized in a tree-like structure: some CAs issue root certificates to other CAs, and CAs in turn issue certificates to everyone else. To be able to make use of PKI, one needs a root certificate that lets them verify the certificate of any other entity. Such root certificates are typically stored hidden in hardware and/or software (e.g. in your web browser).

One might say all this isn’t a very good solution because it is based on the assumption that we have some non-hacked root certificate. Indeed, in the very darkest scenario your hardware and software could be totally compromised, and so all your root certificates could be modified! That is true, but no matter how much you complain about this solution, it seems to be the best one humanity has come up with so far when it comes to reliable encrypted communication over the internet.

So… ideally, when making a very secure multiplayer game (that is, one that encrypts all of its data), you should first get a certificate issued by a CA. Then you should use it to have individual peers exchange public keys via your certified “master server”. Once that’s done, you can start secure communication directly between peers.

Getting a certificate from a CA sounds scary if you’re a small developer wanting to release a multiplayer PC game (assuming you’re not using any 3rd-party services offering secure authentication). So, instead of going hardcore, you can stick with just using some asymmetric-key method and assume (somewhat naively) that the public keys won’t be tampered with while being exchanged between peers. Additionally, to somewhat improve security, you could at least apply some cheap and simple encryption when exchanging public keys, just so they’re not plainly readable. However, no matter what, keep in mind that the most common MITM attack is done on a game running locally. And honestly, when the game is running locally there’s no way of preventing it from being hacked – simply because the hacker can see your application’s process memory and so can do whatever they want with the messages you send or receive. There is a general solution to these problems, which is running your game on dedicated servers, but that is beyond the scope of this post.

Note also that asymmetric-key encryption might be overkill for your game due to unacceptable performance. For this reason some multiplayer games, in particular fast-paced “real real-time” games, use a different approach – a combination of symmetric and asymmetric methods. An asymmetric method is typically only used to initially exchange (or establish) a symmetric key, which is then used for regular data encryption with a symmetric algorithm. Note: while this might be okay for computer games, it is most likely not okay for businesses where money is involved.

A common combination of asymmetric and symmetric methods used in multiplayer games is, for example, the Diffie-Hellman and Blowfish algorithms. Fortunately, implementations of both are freely available, for example in the OpenSSL library.
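To illustrate the key exchange itself, here’s a toy Diffie-Hellman run with deliberately tiny numbers (real implementations such as OpenSSL’s use primes hundreds of digits long; this sketch only shows why both sides arrive at the same shared secret):

```cpp
#include <cstdint>

// Computes (base ^ exp) mod m using square-and-multiply.
static uint64_t PowMod(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1;
    base %= m;
    while (exp > 0)
    {
        if (exp & 1)
            result = result * base % m;
        base = base * base % m;
        exp >>= 1;
    }
    return result;
}

// Toy Diffie-Hellman: returns true if both peers derive the same key.
static bool ToyDiffieHellman()
{
    const uint64_t p = 23, g = 5;   // public parameters P and G
    const uint64_t a = 6, b = 15;   // private keys (normally random)
    uint64_t A = PowMod(g, a, p);   // Alice's public key, sent to Bob
    uint64_t B = PowMod(g, b, p);   // Bob's public key, sent to Alice
    // Alice computes B^a mod p, Bob computes A^b mod p - both equal
    // g^(a*b) mod p, which becomes the shared symmetric key.
    return PowMod(B, a, p) == PowMod(A, b, p);
}
```

The shared secret would then be fed to Blowfish as the symmetric key for the actual traffic.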

In the next part of this article I’m going to present a small C++ library that can be used to facilitate both the key exchange and the encryption of data using the mentioned algorithms.

Update: part 2 available here.

Posted in network programming | Leave a comment

Good bye Blue Tongue (and time for indie game development)

It was big news in the game development industry when, 3 days ago, two Australian studios, Blue Tongue and Studio Oz, and one studio from the US (Phoenix) were shut down by THQ. As part of these events around 200 people were made redundant. My wife and I happened to be two of them.

I had been working for the Melbourne-based studio Blue Tongue for the last 4 years. I relocated from Poland in 2007 and started working as an engine programmer. A year later my wife joined Blue Tongue as a gameplay programmer. Yes, if you’re wondering, we had a lot of topics in common 😉

Now that Blue Tongue is gone, let me say a few words about the time spent there. It was a wonderful time. I happened to be part of the “shared technology” team, primarily working on brand new tech for PS3, Xbox 360 and PC. I learned a lot of new tech stuff during this time, which isn’t a surprise given I spent 4 years there. But the main thing for me was learning that it’s actually possible for a game development studio to be extremely employee-friendly – something which, based on many articles on the internet, you might think can’t be real.

Blue Tongue, from my point of view, was an unbelievably motivated bunch of people. Excited about what they were doing, always friendly and willing to help – whether they were your co-worker or your boss made no difference. It is just my guess, but I think morale at the Blue Tongue studio was much higher than in the majority of other studios around the world. Also, from a programmer’s point of view, I loved the fact that major architectural decisions were always the result of thorough research and discussion by the whole team, where everyone had their say. In short, the studio culture was great and I’m sure I’ll be missing it wherever my next place is going to be.

The last thing I want to say here is that I hope many of the ex-BTE people are going to create their own small game development businesses and be successful. As for myself, I definitely want to take this “opportunity” (if you can call it that) and try some indie game development for a while. Let’s see how “easy” indie game dev really is in terms of earning enough money for your bills and some food. Whether it works out you never know, but surely these are going to be exciting times… at least for the next couple of months 🙂

Now, again, good luck to all ex-Blue Tongue people, and I hope our ways cross again sometime in the future!

Posted in Uncategorized | 5 Comments

Simple 2D soft bodies in Box2D

Box2D is an excellent, widely used and completely free 2D physics engine. It has support for a variety of 2D shapes and joints, but there’s no out-of-the-box support for soft bodies, so if you want them you have to implement them yourself.

I’ve made an experiment and implemented simple round soft bodies using a couple of circles linked with distance joints, and then thought: why not share it? Here’s how you use the code:

b2ExSoftCircleBodyDef def;
def.numParts = 10; // Number of linked internal circles
def.radius = 10.0f;
def.center = b2Vec2(0.0f, 15.0f);
def.softness = 0.5f; // Softness within 0..1 range
b2ExSoftCircleBody* body = b2ExSoftCircleBody_Create(world, &def);
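Under the hood, the numParts internal circles presumably sit evenly spaced on a ring around the body center. A sketch of how such positions could be computed (this is my assumption about the internals, not the actual b2ExSoftCircleBody code):

```cpp
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

// Lays out 'numParts' small circle centers evenly spaced on a ring of
// the given radius around 'center'; neighboring parts would then be
// linked with distance joints to form the soft body.
static std::vector<Vec2> LayOutParts(Vec2 center, float radius, int numParts)
{
    std::vector<Vec2> parts(numParts);
    const float TWO_PI = 6.28318530718f;
    for (int i = 0; i < numParts; i++)
    {
        float angle = TWO_PI * i / numParts;
        parts[i].x = center.x + radius * std::cos(angle);
        parts[i].y = center.y + radius * std::sin(angle);
    }
    return parts;
}
```

The softness parameter would then map to how stiff those distance joints are.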

The result can be seen in the slideshow in the original post.


Get the demo and source code from GitHub and use it freely 🙂

Posted in physics | Leave a comment

C-like singletons

It is my impression that template-based singleton patterns for C++ like this one are very popular and widely used. Why? I believe it’s mostly because OOP design patterns in people’s minds often take precedence over code simplicity and prevent them from stopping to think about which approach is best as a whole.

Having read this post on gamedev.net yesterday I decided to quickly present my preferred way of implementing singletons in C/C++.

For a start, let me say that I’m a big fan of a simple and minimalistic coding style, which, when I’m coding in C++, often results in code that looks almost like pure C, i.e. mostly just functions and C structures. This is funny, by the way, because if you had asked me a couple of years ago I’d probably have said something different. Well, people change, and I did change too in that matter (largely thanks to great coworkers at Blue Tongue). Having experienced a variety of approaches, from heavy object-oriented designs to pure C-like data-oriented designs, I now choose the latter whenever possible. It’s just a lot simpler and easier to maintain, and it pays off when implementing large systems. And there’s the added bonus that it’s often a significantly more efficient approach, especially on consoles.

Now, back to the main topic. Let’s assume we need some text localization manager. Here’s my way of doing it:

LocalizationMgr.h:

//! Starts up localization manager

void LocalizationMgr_Startup(const char* language = "EN");

//! Shuts down localization manager

void LocalizationMgr_Shutdown();

//! Loads translation set

bool LocalizationMgr_LoadTranslationSet(const char* setName);

//! Localizes ‘sourceText’ phrase; returns NULL on failure

const char* LocalizationMgr_Translate(const char* sourceText);

As you can see, the header file contains only the very minimum interface. As a user of the system you don’t even get to see the internals of the manager – it’s all hidden in the source file, as can be seen below.

LocalizationMgr.cpp:

struct LocalizationMgrData
{
  bool m_isInitialized;
  char m_language[16];

  LocalizationMgrData() : m_isInitialized(false) {}
};

static LocalizationMgrData s_data;

void LocalizationMgr_Startup(const char* language)
{
  assert(!s_data.m_isInitialized);
  strcpy(s_data.m_language, language);
  s_data.m_isInitialized = true;
}

void LocalizationMgr_Shutdown()
{
  assert(s_data.m_isInitialized);

// TODO: Unload all translation sets – irrelevant for this sample

  s_data.m_isInitialized = false;
}

bool LocalizationMgr_LoadTranslationSet(const char* setName)
{
  assert(s_data.m_isInitialized);

// TODO: Load translation set – irrelevant for this sample

  return false;
}

const char* LocalizationMgr_Translate(const char* sourceText)
{
  assert(s_data.m_isInitialized);

// TODO: Return translation of the ‘sourceText’ phrase – irrelevant for this sample

  return NULL;
}

As you can see, the source file contains a data structure of type LocalizationMgrData declared as a static variable. This way it’s only visible from that single cpp file. If it had to be visible from multiple source or header files, I’d simply put it into LocalizationMgr_Private.h. An important thing is that this data does not have to be parsed by the compiler when you #include LocalizationMgr.h – hence compilation is much faster. This really pays off when you design all of your game engine systems like that.

[Update] As per Arseny‘s comment, it’s worth pointing out that there’s a slightly safer approach to declaring your data than what’s presented above. By making s_data a pointer (i.e. static LocalizationMgrData* s_data) you benefit in 2 ways:
(a) you avoid static initialization order problems
(b) your code will most likely just crash if you use a function while the system isn’t initialized; that is especially useful in Release builds where asserts won’t fire, but it’s also useful if you simply forget to put an assert in
The downside is that you need to dynamically allocate / deallocate memory in your Startup and Shutdown functions, but this is a must anyway if you’re dealing with more complex data. In my example I was only using primitive types (char, bool etc.), which is why it was “okay”.
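A minimal sketch of that pointer-based variant (the same hypothetical LocalizationMgr, trimmed down to just a language getter):

```cpp
#include <cassert>
#include <cstring>

struct LocalizationMgrData
{
    char m_language[16];
};

// A pointer instead of a value: no static initialization order issues,
// and any use before Startup / after Shutdown dereferences NULL.
static LocalizationMgrData* s_data = NULL;

void LocalizationMgr_Startup(const char* language)
{
    assert(!s_data);
    s_data = new LocalizationMgrData();
    strncpy(s_data->m_language, language, sizeof(s_data->m_language) - 1);
    s_data->m_language[sizeof(s_data->m_language) - 1] = '\0';
}

void LocalizationMgr_Shutdown()
{
    assert(s_data);
    delete s_data;
    s_data = NULL;
}

const char* LocalizationMgr_GetLanguage()
{
    assert(s_data);
    return s_data->m_language; // crashes in Release too if not started up
}
```

Note how a forgotten Startup now fails loudly even with asserts compiled out.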

Now, here’s summary of my approach:

Pros:

  • Header file contains the very minimum code from the user’s point of view
    • easy to read for a programmer
    • fast to compile for a compiler
  • Explicit rather than implicit singleton initialization (must call Startup / Shutdown “manually”)

Cons:

  • All singleton “attributes” must be accessed via s_data
  • Have to put assert(s_data.m_isInitialized) in each singleton function in order to make sure it’s initialized when used
  • No easy way to support singleton polymorphism (in my opinion not a problem in 99% cases)
  • Code documentation generation tools like Doxygen won’t be able to figure out that all functions with the LocalizationMgr_ prefix belong to the same logical module; my solution is to use the grouping feature (\defgroup) around my header file, but this obviously isn’t perfect. Then again, I’m also a big fan of self-documenting code, so ideally one wouldn’t need the docs at all – just looking at the header file should be enough 🙂
  • A similar problem exists with IntelliSense: you won’t be able to type in a “module” name prefix (e.g. LocalizationMgr), press enter, and then start typing the remainder of the function name (e.g. Translate)

[Update] The 2 issues above can be solved in C++ by putting all of your functions into namespaces (e.g. the namespace being LocalizationMgr and the function being Translate), as suggested by Arseny in the comments.

As you can see, there are still some disadvantages to my approach, but the first two items on the pros list are still more important to me than all of the cons. Which approach you choose is up to you, but before you make that decision ask yourself what your requirements are and which solution meets them best. For me, as with the majority of programming-related topics, the KISS principle is a major one!

Posted in game engine, general programming | 2 Comments

Light Pre-Pass vs Deferred Renderer – Part 1

My questions

I have recently decided to spend some time experimenting with the various approaches to renderers used across PC and console games these days. In particular I wanted to look more closely at the pros and cons of the “traditional” deferred renderer with respect to the light pre-pass renderer, which is slowly becoming more and more popular.

(1) What are the differences between the 2 approaches?
(2) What are the main limitations of each approach and what tricks do people do to workaround these?

These were my key questions, which I can hopefully answer now, at least to some extent. If you just want to learn more about both techniques, you may still find some useful information here.

Quick intro

Before I start, here’s a quick reminder of how each method works in a nutshell:

“Standard” deferred shading is a 2-stage process:
(1) draw (opaque) geometry storing its attributes (i.e. position as depth, normals, albedo color, specular color and other material properties) in a number of full screen buffers (typically 3 or 4)
(2) for each light source, draw its volume and accumulate lit surface color into final render target

I recommend taking a look at these great Killzone 2 slides for a more detailed overview of a sample deferred renderer implementation.

Light pre-pass is a 3-stage process:
(1) draw (opaque) geometry storing its attributes necessary for lighting (i.e. position as depth, normals and specular power)
(2) for each light source, draw its volume and accumulate just the lighting into an additional render target
(3) draw (opaque) geometry again; evaluate full material shader and apply lighting sampled from a buffer generated in step 2; this step is de facto forward shading

Read more on light pre-pass on Wolfgang Engel’s blog or in his presentation. Also, take a look at this highly informative presentation by Insomniac from GDC’09.

My answers

Now that we know the basics, let’s go over each of my initial questions and try to give answers to each:

(1) What is the key difference between the 2 approaches?

Number and cost of drawing stages
The first obvious difference is that light pre-pass requires 3 stages, not 2. On one hand this sounds more expensive because you have 2 geometry passes in light pre-pass instead of 1 in deferred shading, but on the other hand both of these geometry passes are cheaper. In light pre-pass, the first geometry pass only outputs what’s necessary for the lighting phase: depth, normals and optionally specular power. The second geometry pass (the 3rd stage) typically doesn’t need normals, and it can take advantage of early z-cull, including hierarchical z-cull on some hardware, since the depth has already been written. Unfortunately the first geometry pass isn’t going to be as fast as a pure depth pre-pass, especially on hardware with support for double-speed depth-only rendering – this is because we also output normals and specular power. For this reason, with light pre-pass, it might even be worthwhile to use a custom CPU/SPU-based geometry occlusion system.

On the other hand, light pre-pass has considerably lower bandwidth and memory requirements – no heavyweight 4-render-target output and sampling needed any more. In fact, light pre-pass might even be doable without MRT (but only if you can read depth straight from the depth buffer, i.e. on X360, PS3 or DX10/11).

There’s an interesting post-mortem by a developer who implemented both light pre-pass and deferred renderers in their engine on 3 platforms: X360, PS3 and PC. He says he got better performance with the deferred renderer, mostly thanks to having a single geometry pass.

Material variety
One of the benefits of light pre-pass is that it gives you a bit more freedom in terms of material variety (compared to deferred shading). You can run a different shader for each piece of geometry and so calculate the final lit surface color differently. Instead of a single pass per light as in deferred shading, here we have a single pass per mesh. Having said that, you’re still quite limited in terms of lighting model variety, because the available lighting attributes have already been calculated (during the 2nd pass) – these typically only include accumulated Phong shading factors, i.e. N dot L and (R dot V) ^ SpecularPower (usually with light color and attenuation factors applied).
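The accumulated Phong factors mentioned above can be sketched like this (a CPU-side illustration of what the light pass evaluates per pixel; light color and attenuation are omitted, and all vectors are assumed normalized):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static float Dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Diffuse factor: N.L, clamped to zero for surfaces facing away.
static float PhongDiffuse(Vec3 N, Vec3 L)
{
    float d = Dot(N, L);
    return d > 0.0f ? d : 0.0f;
}

// Specular factor: (R.V) ^ specPower, clamped to zero.
static float PhongSpecular(Vec3 R, Vec3 V, float specPower)
{
    float s = Dot(R, V);
    return s > 0.0f ? std::pow(s, specPower) : 0.0f;
}
```

These two scalars (times light color and attenuation) are all the light buffer carries, which is exactly why the final material pass can’t swap in a different lighting model.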

MSAA
Another nice feature of light pre-pass is that it works much better with MSAA (again, compared to deferred shading). It’s still not going to be 100% correct MSAA (unless DX10/11-level features are used), because the lighting buffer used in the last drawing pass won’t be sampled at MSAA resolution. However, it’s fairly trivial to get MSAA’ed geometry with non-MSAA lighting, which may yield pretty good results.

(2) What are the main limitations of each approach and what tricks do people do to workaround these?

Transparency
With both deferred shading and light pre-pass, transparency is still typically handled with old-school forward shading after the opaque geometry. There’s no efficient way around this, at least none that I know of.

MSAA
With deferred shading it has become standard to use some kind of post-processing to smooth the edges. This can be as simple as finding edges based on depth and normals and blurring them – see the GPU Gems 2 chapter on how they did it in STALKER. Another popular, purely color-based approach to antialiasing is called Morphological Antialiasing (or just MLAA) – see the Real-Time Rendering blog for more info. It has even been added as an on/off feature to Radeon HD 6000-series cards, and it’s also widely used on PS3 via its SPU processors. Other interesting color-based antialiasing methods include FXAA and SRAA. Now, as mentioned before, with light pre-pass we may just be able to stick with geometry-only MSAA. Otherwise, any of the mentioned post-processing AA techniques still apply.

Specular color in light pre-pass
If you’re aiming at using just a single light buffer, you have 4 channels, 3 of which are used for the RGB components of dot(N, L) * LightColor * LightAttenuation, so you only have 1 left for specular lighting. With 1 channel instead of 3, the usual choice is to store a single specular lighting luminance value (for example, luminance = 0.3 * R + 0.59 * G + 0.11 * B). But that means we lose the specular lighting color, which, if there are non-white lights in the scene, may result in highly undesirable looks. Fortunately there’s a cool “trick” that lets you reconstruct the light color from the diffuse lighting components. It’s not perfect, as it only works correctly for a single light, but it’s definitely better than nothing – see Max’s Programming Blog for a nice explanation of it.
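The luminance formula and the reconstruction trick can be sketched like this (a simplified take on the idea: reuse the chromaticity of the accumulated diffuse lighting to re-colorize the stored specular luminance; the exact fallback handling is my assumption, not the code from Max’s blog):

```cpp
struct Color { float r, g, b; };

// Luminance weights as used in the article.
static float Luminance(Color c)
{
    return 0.3f * c.r + 0.59f * c.g + 0.11f * c.b;
}

// Reconstructs an approximate specular color from the single stored
// specular luminance by scaling the diffuse lighting color so that
// its luminance matches. Only exact when one light dominates.
static Color ReconstructSpecular(Color diffuse, float specLuminance)
{
    float diffLum = Luminance(diffuse);
    if (diffLum <= 0.0001f) // no diffuse light - fall back to white
        return Color{ specLuminance, specLuminance, specLuminance };
    float scale = specLuminance / diffLum;
    return Color{ diffuse.r * scale, diffuse.g * scale, diffuse.b * scale };
}
```

With two lights of different colors hitting the same pixel, the reconstructed specular picks up a blend of both, which is where the approximation shows.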

Materials / lighting models variety
Both approaches are very limited in terms of material / lighting model variety. In a traditional forward renderer, the shader had access to both light and material properties and so could implement any kind of fancy material. In a deferred renderer, material properties are read first, and the lighting pass then only gets a fixed set of generic attributes (e.g. albedo color, specular color). With light pre-pass the situation is a bit different. One could say it’s even less flexible, because we’re pretty much locked into the Phong diffuse / specular shading model. Implementing something more complex like Oren-Nayar isn’t easily doable with light pre-pass, as is the case with any other “non-standard” material / lighting model.

The only way I can think of to support multiple material / lighting models in a deferred renderer (this doesn’t apply to light pre-pass) is to store a material id in the G-buffer during the first pass and then branch in the shader code based on its value. It may not be too bad performance-wise due to coherency among samples, but it means your shader code gets more complex (messy!) too. Nevertheless, I think this could be quite a good approach to multiple material / lighting models.

That’s it for now. I feel like the more you investigate the topic, the more difficult it is to choose the best approach for specific needs. I plan to discuss such dilemmas in Part 2 – stay tuned!

Posted in deferred shading, light pre-pass, rendering | 9 Comments

Non-blocking packet-oriented sending and receiving data via TCP

Let’s get straight into the topic; here’s our goal for today:
We want to be able to send (and receive) whole data packets over TCP without blocking and without a separate “send thread” (or “receive thread”). Similar to UDP, a packet should either be fully sent (or fully received) or not at all – except we want that to work with TCP.

The first suggestion here is to use non-blocking sockets. That’s obviously a good choice; however, there’s a larger problem to consider: the send socket function might not send the whole buffer it’s given.

Now, let’s imagine the following scenario. An application wants to send a message that is 20 bytes long (assume it has successfully created a TCP socket and set it up to be non-blocking beforehand). It calls send, but only 15 bytes get sent. What are we going to do with the remaining 5 bytes?

We could keep calling send until we eventually send the whole message, but you never know how much time you could spend doing that. And remember that our primary goal was for everything to be non-blocking.

Or we could leave it up to the user. The user would then have to remember how many bytes were successfully sent and thus how much data still needs to be sent. The user would also need to store the unsent part of the message somewhere. Having to do all that would make it highly inconvenient for the user, wouldn’t it?

Here’s one solution to the problem. The main idea is to encapsulate the sending process in a helper that either fully succeeds (all bytes sent or buffered) or fully fails (zero bytes sent). To be able to do that, our helper needs its own buffer at least as large as the largest message we’re planning to send. This way, if part of a message fails to send, we can still store the rest of it in the buffer and try to send it later. The only extra requirement is that the user frequently calls some Tick() function to make sure any pending data eventually gets sent.

And that’s it. The main idea has been described; here’s what the mentioned helper class might look like in C/C++:

class TCPHelper
{
public:
  // Creates helper for a given non-blocking TCP socket
  TCPHelper(int socket, int sendBufferSize);
  // Sends the data
  bool Send(const void* data, int dataSize, int& socketError);
  // Tries to send pending data
  void Tick(int& socketError);
private:
  bool Buffer(const void* data, int dataSize);

  int m_socket;

  byte* m_buffer;
  int m_bufferSize;
  int m_bufferCapacity;
};

The constructor just sets the socket and allocates internal send buffer:

TCPHelper::TCPHelper(int socket, int sendBufferSize) :
  m_socket(socket), m_bufferSize(0), m_bufferCapacity(sendBufferSize)
{
  m_buffer = (byte*) malloc(sendBufferSize);
}

The TCPHelper::Send() function first tries to send any previously pending data, then buffers the new data and tries to send it:

bool TCPHelper::Send(const void* data, int dataSize, int& socketError)
{
  assert(dataSize <= m_bufferCapacity);

  Tick(socketError);
  if (socketError < 0) return false;

  if (!Buffer(data, dataSize)) return false;

  Tick(socketError);
  if (socketError < 0) return false;

  return true;
}

The TCPHelper::Buffer() just copies the data into our buffer:

bool TCPHelper::Buffer(const void* data, int dataSize)
{
  if (m_bufferSize + dataSize > m_bufferCapacity) return false;
  memcpy(m_buffer + m_bufferSize, data, dataSize);
  m_bufferSize += dataSize;
  return true;
}

And finally the TCPHelper::Tick() function that sends any pending data stored in a buffer:

void TCPHelper::Tick(int& socketError)
{
  // Nothing to send?

  if (m_bufferSize == 0)
  {
    socketError = 0;
    return;
  }

  // Send the data

  socketError = send(m_socket, m_buffer, m_bufferSize, 0);
  if (socketError < 0)
  {
    // send() returns -1 on error and puts the actual error code in
    // errno (on Windows: WSAGetLastError() / WSAEWOULDBLOCK);
    // EAGAIN / EWOULDBLOCK just mean "try again later" - not critical
    if (errno == EAGAIN || errno == EWOULDBLOCK)
      socketError = 0;
    return;
  }

  // Pop the remaining data to the front of the buffer

  m_bufferSize -= socketError;
  if (m_bufferSize > 0)
    memmove(m_buffer, m_buffer + socketError, m_bufferSize);

  // Indicate no error

  socketError = 0;
}

In all of the presented functions, socketError always receives either 0 (no critical socket error) or a negative number representing a regular socket error. In the case of the TCPHelper::Send() function, when it returns false and socketError is 0, it means we failed to buffer the message at either the socket or the custom buffer level.

Note that I have omitted the parts of the code that I would normally implement but that are irrelevant to the main topic – for example the TCPHelper class destructor. We also don’t properly handle an attempt to send again after a critical socket error (i.e. after socketError received a negative value), and there’s a significant inefficiency in always buffering the data first and then sending it. In real life, an immediate send without buffering is possible most of the time.

Now that the sending part is done, we can apply a similar approach to receiving data. To do that, we’d add a separate buffer for received data to TCPHelper and implement one additional method

bool TCPHelper::Receive(void* data, int dataSize, int& socketError)

that either receives all dataSize bytes and returns true, or receives none and returns false.

You can get the complete C/C++ code implementing both sending and receiving here: TCPHelper.h, TCPHelper.cpp. The only major improvement I can think of is avoiding memmove and instead using some kind of ring buffer / FIFO queue to push and pop the data.

Posted in network programming | Tagged , | Leave a comment