Wednesday, January 9, 2013

Cross-platform communication using Google's Protocol Buffers Revisited

In my previous blog entry regarding cross-platform communication using Protobuf, I made some very elementary mistakes, the most glaring of which was my improper use of TCP.
Because recv() on a TCP socket may return fewer bytes than were requested, we must loop and recv() until the entire message has arrived. This is TCP sockets 101 and it is kind of embarrassing that I missed it. Note: send() can also return after writing only part of the buffer, but I have not bothered handling this, even though it is trivial.
The other mistake that I made was that I always assumed that a varint32 was 4 bytes in size. This is incorrect, since it is, in fact, a VARINT: its encoded size varies from 1 to 5 bytes depending on the value.
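
To make the variable size concrete, here is a minimal sketch of the base-128 varint encoding that Protobuf uses for the length prefix. The names encodeVarint32 and decodeVarint32 are hypothetical helpers written for illustration, not part of the Protobuf API; in real code CodedOutputStream and CodedInputStream do this for you.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode a 32-bit value as a base-128 varint.
// Each byte carries 7 payload bits, least significant group first;
// the high bit is set on every byte except the last.
std::vector<std::uint8_t> encodeVarint32(std::uint32_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}

// Hypothetical helper: decode a varint from data[0..size).
// Returns the number of bytes consumed, or 0 if the input is
// truncated or the varint runs past 5 bytes.
std::size_t decodeVarint32(const std::uint8_t* data, std::size_t size,
                           std::uint32_t* value) {
    std::uint32_t result = 0;
    for (std::size_t i = 0; i < size && i < 5; ++i) {
        result |= static_cast<std::uint32_t>(data[i] & 0x7F) << (7 * i);
        if ((data[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}
```

With this encoding a length of 1 takes one byte, a length of 300 takes two bytes (0xAC 0x02), and only lengths of 2^28 and above need all five.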

These two mistakes necessitated changes in the C++ sending and receiving.
First, I wrote a function to parse out the delimiting varint and to fill a buffer with the bytes specified by that varint:
#include <sys/socket.h>
#include <climits>
#include <cstdlib>

/* 
   reads a varint delimited protocol buffers message from a TCP socket
   returns message in buffer (caller must free() it), and returns number of bytes read (not including delimiter)
   returns a negative value on error, or if the peer closes the connection before the whole message arrives
*/
int recvDelimProtobuf(int sock, unsigned char **buffer){
    //read the delimiting varint byte by byte
    unsigned int length=0;
    int shift=0;
    unsigned char bite;
    *buffer=NULL;
    do{
        int received=recv(sock, &bite, 1, 0);
        if(received<0)
            return received;
        //connection closed, or varint longer than the 5 bytes a varint32 can take
        if(received==0 || shift>28)
            return -1;
        length|= (unsigned int)(bite & 0x7F) << shift;
        shift+=7;
    }while(bite & 0x80);

    //the byte count is returned as an int, so reject lengths that do not fit
    if(length > (unsigned int)INT_MAX)
        return -1;

    //receive remainder of message
    unsigned int recv_bytes=0;
    *buffer=(unsigned char *)malloc(length ? length : 1);
    if(*buffer==NULL)
        return -1;
    while(recv_bytes < length){
        int received=recv(sock, *buffer + recv_bytes, length-recv_bytes, 0);
        if(received<=0){
            //error, or connection closed mid-message: do not leak the buffer
            free(*buffer);
            *buffer=NULL;
            return received<0 ? received : -1;
        }
        recv_bytes+=received;
    }
    return (int)recv_bytes;
}

The rest of the code now looks like this:
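    //receive one delimited message; the buffer is allocated by recvDelimProtobuf
    unsigned char *buffer=NULL;
    int received=recvDelimProtobuf(clientSock, &buffer);

    //parse only if the receive succeeded (a negative value signals an error)
    if(received>=0){
        //parse the protobuf object out of the buffer (the delimiter has already been consumed)
        google::protobuf::io::ArrayInputStream arrayIn(buffer, received);
        google::protobuf::io::CodedInputStream codedIn(&arrayIn);
        google::protobuf::io::CodedInputStream::Limit msgLimit = codedIn.PushLimit(received);
        //ParseFromCodedStream returns false if the bytes are not a valid message
        client.ParseFromCodedStream(&codedIn);
        codedIn.PopLimit(msgLimit);
    }

    //purge buffer (free(NULL) is a no-op)
    free(buffer);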

This fixes the receiving side. To fix the sending side, we need to change the message size to actually reflect the size of the varint + message.
	int varintsize = google::protobuf::io::CodedOutputStream::VarintSize32(serverAck.ByteSize());
	int ackSize=serverAck.ByteSize()+varintsize;
	char* ackBuf=new char[ackSize];

	//write varint delimiter to buffer
	google::protobuf::io::ArrayOutputStream arrayOut(ackBuf, ackSize);
	google::protobuf::io::CodedOutputStream codedOut(&arrayOut);
	codedOut.WriteVarint32(serverAck.ByteSize());

	//write protobuf ack to buffer
	serverAck.SerializeToCodedStream(&codedOut);
	send(clientSock, ackBuf, ackSize, 0);
	delete[] ackBuf; //memory from new[] must be released with delete[]

Thanks to Johan Anderholm for helping me with this solution. As he mentions in a comment on my original blog entry, there is another elegant solution that uses boost::asio. I have not tried this but I am sure it works just as well!