Message Queues — MessagePack vs JSON for Serialization
How messages are stored in a message queue directly affects the efficiency and performance of the system. The JSON we use in APIs is easy for humans to read and write. In a message queue, however, the data is exchanged between systems, or between small parts of one system, and nobody needs to read it by eye. In distributed systems where throughput, latency and cost are critical, MessagePack's compact binary format can therefore be an excellent choice: payloads are smaller on the network, and serialisation and deserialisation are faster.
MessagePack stores data in a compact binary format, so the same data takes fewer bytes than in a text-based format such as JSON. The binary structure is also cheaper to write and parse, and the smaller payloads mean less data has to travel over the network. Both effects contribute to efficiency and performance.
Observation
I created a definition for a financial instrument with three fields, Symbol, Price and Volume, and keyed the fields to 0, 1 and 2. With integer keys, MessagePack writes each object as a compact array instead of a map that repeats the property names, and the JSON property names are shortened to "0", "1" and "2" so that JSON is not penalised for long names. Based on this definition, I generated data sets ranging from 1 to 1,000,000 items to observe serialisation performance and data size.
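using MessagePack;
using System.Text.Json.Serialization;

// Integer keys keep the MessagePack payload compact;
// the matching one-character JSON names keep the JSON payload short.
[MessagePackObject]
public record Stock
{
    [Key(0)] [JsonPropertyName("0")] public string Symbol { get; init; }
    [Key(1)] [JsonPropertyName("1")] public double Price { get; init; }
    [Key(2)] [JsonPropertyName("2")] public double Volume { get; init; }

    public Stock(string symbol, double price, double volume)
    {
        Symbol = symbol;
        Price = price;
        Volume = volume;
    }

    // MessagePack creates instances through this constructor
    // and then sets the keyed properties.
    [SerializationConstructor]
    public Stock()
    {
    }
}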
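To make the comparison concrete, here is a minimal sketch of serialising the same items with both formats. It assumes the MessagePack-CSharp NuGet package (`MessagePack`) is referenced and uses `System.Text.Json` for the JSON side. The `Demo` class, the `SerializeBoth` helper and the sample values are hypothetical and are not taken from the benchmark code.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;
using MessagePack; // assumption: the MessagePack-CSharp NuGet package is referenced

// Same shape as the Stock record above, repeated so the sketch compiles on its own.
[MessagePackObject]
public record Stock
{
    [Key(0)] [JsonPropertyName("0")] public string Symbol { get; init; } = "";
    [Key(1)] [JsonPropertyName("1")] public double Price { get; init; }
    [Key(2)] [JsonPropertyName("2")] public double Volume { get; init; }

    public Stock(string symbol, double price, double volume)
    {
        Symbol = symbol;
        Price = price;
        Volume = volume;
    }

    [SerializationConstructor]
    public Stock()
    {
    }
}

public static class Demo
{
    // Hypothetical helper: serialises the same items with both formats.
    public static (byte[] MsgPack, byte[] Json) SerializeBoth(Stock[] stocks)
    {
        byte[] msgPack = MessagePackSerializer.Serialize(stocks);
        byte[] json = JsonSerializer.SerializeToUtf8Bytes(stocks);
        return (msgPack, json);
    }

    public static void Main()
    {
        // Made-up sample values, not the data used in the benchmarks.
        var stocks = new[]
        {
            new Stock("ACME", 101.25, 1500),
            new Stock("GLOBX", 12.5, 250000),
        };

        var (msgPack, json) = SerializeBoth(stocks);
        Console.WriteLine($"MessagePack: {msgPack.Length} bytes");
        Console.WriteLine($"JSON:        {json.Length} bytes");

        // Both payloads decode back into Stock objects.
        var fromMsgPack = MessagePackSerializer.Deserialize<Stock[]>(msgPack);
        var fromJson = JsonSerializer.Deserialize<Stock[]>(json);
        Console.WriteLine(fromMsgPack[0]);
        Console.WriteLine(fromJson?[0]);

        // The binary payload can still be rendered as JSON text for debugging.
        Console.WriteLine(MessagePackSerializer.ConvertToJson(msgPack));
    }
}
```

The byte arrays returned by `SerializeBoth` are what would be handed to the queue client as the message body. `MessagePackSerializer.ConvertToJson` is useful when a binary message has to be inspected by hand.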
How much does it affect performance?
Below are the benchmark results for combined serialisation and deserialisation of data sets containing from 1 to 1,000,000 objects. Averaged over the seven data set sizes, MessagePack is about 5.1 times as fast as JSON, which is roughly 411% faster.
| Method | Mean | Error | StdDev | Median | Gen0 | Gen1 | Gen2 | Allocated |
|------------------------------------------------------ |-----------------:|-----------------:|-----------------:|-----------------:|-----------:|----------:|----------:|------------:|
| MessagePackSerializationDeserialization_Count_1 | 113.6 ns | 0.08 ns | 0.07 ns | 113.6 ns | 0.0216 | - | - | 136 B |
| MessagePackSerializationDeserialization_Count_10 | 838.8 ns | 1.66 ns | 1.39 ns | 838.9 ns | 0.1984 | - | - | 1248 B |
| MessagePackSerializationDeserialization_Count_100 | 7,989.5 ns | 158.02 ns | 375.54 ns | 7,854.9 ns | 1.8768 | 0.0076 | - | 11776 B |
| MessagePackSerializationDeserialization_Count_1000 | 76,766.5 ns | 1,213.33 ns | 1,134.95 ns | 76,330.1 ns | 19.8975 | 3.2959 | - | 125176 B |
| MessagePackSerializationDeserialization_Count_10000 | 917,663.5 ns | 17,815.88 ns | 16,664.99 ns | 910,570.7 ns | 155.2734 | 80.0781 | 2.9297 | 1268194 B |
| MessagePackSerializationDeserialization_Count_100000 | 25,214,373.4 ns | 494,744.89 ns | 740,510.85 ns | 25,320,850.2 ns | 1531.2500 | 687.5000 | 312.5000 | 12788717 B |
| MessagePackSerializationDeserialization_Count_1000000 | 275,701,080.9 ns | 5,461,375.35 ns | 8,819,103.19 ns | 276,414,093.8 ns | 20000.0000 | 9000.0000 | 3000.0000 | 152010116 B |
| JSONSerializationDeserialization_Count_1 | 495.5 ns | 0.36 ns | 0.32 ns | 495.4 ns | 0.0238 | - | - | 152 B |
| JSONSerializationDeserialization_Count_10 | 5,410.7 ns | 107.95 ns | 225.33 ns | 5,262.1 ns | 0.3738 | - | - | 2368 B |
| JSONSerializationDeserialization_Count_100 | 48,960.2 ns | 175.93 ns | 155.96 ns | 48,957.9 ns | 2.4414 | 0.0610 | - | 15649 B |
| JSONSerializationDeserialization_Count_1000 | 484,821.9 ns | 1,608.85 ns | 1,343.46 ns | 484,129.2 ns | 23.4375 | 0.9766 | - | 152529 B |
| JSONSerializationDeserialization_Count_10000 | 6,817,696.4 ns | 126,110.83 ns | 117,964.15 ns | 6,769,114.7 ns | 179.6875 | 93.7500 | 46.8750 | 1694997 B |
| JSONSerializationDeserialization_Count_100000 | 68,462,841.2 ns | 1,162,845.12 ns | 1,428,078.24 ns | 68,705,116.1 ns | 1571.4286 | 571.4286 | 142.8571 | 20672606 B |
| JSONSerializationDeserialization_Count_1000000 | 664,231,691.7 ns | 12,880,939.80 ns | 12,048,839.08 ns | 665,760,375.0 ns | 18000.0000 | 9000.0000 | 4000.0000 | 289836648 B |
In summary, MessagePack is significantly faster than JSON for serialisation and deserialisation at every Count value. The gap is widest for small and medium data sets, where MessagePack is roughly 4.4 to 7.4 times as fast for up to 10,000 items. It narrows for the largest data sets, but MessagePack is still about 2.4 to 2.7 times as fast at 100,000 and 1,000,000 items. MessagePack also allocates less memory at every size.
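As a worked check, the speed-up at each size can be computed from the Mean column of the table by dividing the JSON mean by the MessagePack mean. The symbols $s_n$ (speed-up at $n$ items) and $\bar{s}$ (their plain average) are notation introduced here, and the values are rounded to two decimals.

```latex
\begin{aligned}
s_{1}       &= \frac{495.5}{113.6} \approx 4.36 &
s_{10}      &= \frac{5410.7}{838.8} \approx 6.45 \\
s_{100}     &= \frac{48960.2}{7989.5} \approx 6.13 &
s_{1000}    &= \frac{484821.9}{76766.5} \approx 6.32 \\
s_{10000}   &= \frac{6817696.4}{917663.5} \approx 7.43 &
s_{100000}  &= \frac{68462841.2}{25214373.4} \approx 2.72 \\
s_{1000000} &= \frac{664231691.7}{275701080.9} \approx 2.41 \\[4pt]
\bar{s} &= \frac{1}{7}\sum_{n} s_n \approx 5.12
\end{aligned}
```

An average ratio of about 5.1 means MessagePack needs roughly a fifth of the time JSON needs, or $(\bar{s} - 1) \times 100\% \approx 410\%$ faster, which is in line with the roughly 411% quoted above.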
How much does it affect data size?
As I mentioned earlier, MessagePack produces more compact data through binary serialisation. Below you can see how much space the serialised data takes up for data sets containing 1 to 1,000,000 objects. On average, the MessagePack output is about 37% smaller than the JSON output, even though the JSON property names were already shortened to single characters.
+-----------+----------------+----------------+
| Count | MessagePack | JSON |
+-----------+----------------+----------------+
|1 | 26 bytes | 43 bytes |
|10 | 281 bytes | 458 bytes |
|100 | 2893 bytes | 4690 bytes |
|1,000 | 29893 bytes | 47808 bytes |
|10,000 | 308893 bytes | 488719 bytes |
|100,000 | 3188895 bytes | 4985072 bytes |
|1,000,000 | 32888895 bytes | 50854409 bytes |
+-----------+----------------+----------------+
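The size saving can be worked out from the table in the same way. Here $M_n$ and $J_n$ denote the MessagePack and JSON sizes in bytes for $n$ items, and $r_n$ is the relative reduction. This notation is introduced for illustration only.

```latex
\begin{aligned}
r_n &= 1 - \frac{M_n}{J_n} \\[4pt]
r_{1}       &= 1 - \frac{26}{43} \approx 39.5\% &
r_{10}      &= 1 - \frac{281}{458} \approx 38.6\% \\
r_{100}     &= 1 - \frac{2893}{4690} \approx 38.3\% &
r_{1000}    &= 1 - \frac{29893}{47808} \approx 37.5\% \\
r_{10000}   &= 1 - \frac{308893}{488719} \approx 36.8\% &
r_{100000}  &= 1 - \frac{3188895}{4985072} \approx 36.0\% \\
r_{1000000} &= 1 - \frac{32888895}{50854409} \approx 35.3\% \\[4pt]
\bar{r} &= \frac{1}{7}\sum_{n} r_n \approx 37.4\%
\end{aligned}
```

Put differently, at 1,000,000 items each object costs about 32.9 bytes in MessagePack and about 50.9 bytes in JSON.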
In conclusion, using MessagePack for serialisation in message queues helps build high-performance, efficient systems with faster processing times and smaller payloads. The trade-off is that humans can no longer read and manually edit the data. But if you already have a properly functioning system, why would you need people to manually read and manipulate the data?
Benchmark code: https://gist.github.com/halilkocaoz/a2c2f345fc7f9a3370652d54ebcd11ad
Data size observation code: https://gist.github.com/halilkocaoz/70559ae6720001a5052aca50c8c99056