Question

How to compute the byte size of a List<T> in C#

I want to send a large list of objects to another web service for intake. The web service has a limit of 6 MB per request, so I want to send my list of 7,000+ objects in 5 MB "batches."

To do this, I need to compute the byte size of the objects and send a batch when the total reaches 5 MB. The issue is that each object can have multiple child objects, and each child object can have multiple child objects too.

The batch of objects is first serialized to JSON before being sent in the body of the request.

Is there a way of computing the byte size of each object at runtime and then adding the value to a running total?

The code snippet below shows what I have to send the list:

jsonList = JsonConvert.SerializeObject(list, Formatting.None);
request = new HttpRequestMessage(HttpMethod.Post, url);
request.Content = new StringContent(jsonList, Encoding.UTF8, "application/json");
request.Headers.Add("Authorization", "Bearer " + authToken);
response = await client.SendAsync(request);
jsonResponse = await response.Content.ReadAsStringAsync();
            
1 Jan 1970

Solution

The easiest method is to estimate a batch size that is conservatively small and split the list with LINQ's Chunk method (available in .NET 6 and later).
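As a rough sketch of this fixed-size approach, assuming .NET 6 or later: the class name, the method name, and the serialize delegate below are hypothetical, and the batch size is a guess the caller has to pick.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FixedSizeBatching
{
    // Splits the source into groups of at most batchSize items and serializes
    // each group. serialize is supplied by the caller, for example
    // batch => JsonConvert.SerializeObject(batch).
    public static IEnumerable<string> ToJsonBatches<T>(
        IEnumerable<T> source, int batchSize, Func<T[], string> serialize)
    {
        foreach (T[] batch in source.Chunk(batchSize))
            yield return serialize(batch);
    }
}
```

Because the batch size is only an estimate, it is worth checking Encoding.UTF8.GetByteCount on each resulting string against the limit, at least while tuning the number.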

If that is not accurate enough, you can loop inside an iterator method that serializes items into a MemoryStream until the stream is just under the size you want, yields that stream, and then starts a new one.

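using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public static IEnumerable<Stream> SerializeInBatches(IEnumerable<YourClass> source, long maxLength)
{
    var serializer = JsonSerializer.Create();  // add settings?
    var (ms, writer) = GetNewStream();
    var count = 0;  // items in the current batch
    foreach (var item in source)
    {
        var previousLength = ms.Length;  // rollback point if this item does not fit
        if (count > 0)
            writer.Write(',');  // separator between array elements
        serializer.Serialize(writer, item);
        writer.Flush();  // push buffered text into ms so ms.Length is accurate

        // + 1 reserves room for the closing ']'. A single item larger than
        // maxLength is still sent, alone in its own batch.
        if (count > 0 && ms.Length + 1 > maxLength)
        {
            writer.Dispose();
            ms.SetLength(previousLength);  // truncate to drop the item that did not fit
            yield return Finish(ms);
            (ms, writer) = GetNewStream();
            serializer.Serialize(writer, item);  // the item opens the next batch
            writer.Flush();
            count = 0;
        }
        count++;
    }

    writer.Dispose();
    if (count > 0)  // was not an empty list
        yield return Finish(ms);
}

private static (MemoryStream, StreamWriter) GetNewStream()
{
    var ms = new MemoryStream();
    ms.WriteByte((byte)'[');
    var writer = new StreamWriter(ms, leaveOpen: true);  // UTF-8 without BOM by default
    return (ms, writer);
}

private static MemoryStream Finish(MemoryStream ms)
{
    ms.Seek(0, SeekOrigin.End);
    ms.WriteByte((byte)']');
    ms.Position = 0;  // reset ready for reading
    return ms;
}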
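
One way the yielded streams might be consumed is sketched below. It assumes each stream holds one complete JSON array and is positioned at 0, and it reuses the url, authToken, and client names from the question. BatchSender, BuildRequest, and SendAllAsync are hypothetical names introduced for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class BatchSender
{
    // Wraps one serialized batch in a POST request. The stream must be positioned at 0.
    public static HttpRequestMessage BuildRequest(string url, string authToken, Stream batch)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, url);
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", authToken);
        request.Content = new StreamContent(batch);
        request.Content.Headers.ContentType =
            new MediaTypeHeaderValue("application/json") { CharSet = "utf-8" };
        return request;
    }

    // Sends the batches one at a time and throws on the first non-success status.
    public static async Task SendAllAsync(
        HttpClient client, string url, string authToken, IEnumerable<Stream> batches)
    {
        foreach (var batch in batches)
        {
            // Disposing the request also disposes its content and the underlying stream.
            using var request = BuildRequest(url, authToken, batch);
            using var response = await client.SendAsync(request);
            response.EnsureSuccessStatusCode();
        }
    }
}
```

Using StreamContent avoids copying each batch into a string first, and because the batches are produced lazily only one of them is held in memory at a time.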
2024-06-28
Charlieface

Solution

Start a StringBuilder with "[", serialize the list objects one by one in a loop, and keep appending to the builder (don't forget the commas between objects).

Shortly before reaching the limit, close with "]", convert to UTF-8 bytes, fire the request, then reset the builder with a new "[".

As long as your content is mainly in the ASCII range, builder.Length should be equal to, or only slightly less than, the byte length of the content.

If you have lots of special characters, UTF-8 needs multiple bytes to represent each of them, so your content bytes will grow faster than the string length. In that case, you need to measure the UTF-8 byte count of each JSON string and use that to keep track of the byte total.
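
To illustrate the gap between string length and UTF-8 size, here is a small sketch built on Encoding.UTF8.GetByteCount, which returns the encoded size without allocating the byte array. The Utf8Size helper is a hypothetical name.

```csharp
using System.Text;

public static class Utf8Size
{
    // Number of bytes the string occupies once encoded as UTF-8.
    public static int Of(string json) => Encoding.UTF8.GetByteCount(json);
}

// "abc"          Length 3, Utf8Size.Of returns 3  (ASCII, one byte per char)
// "\u00e9"       Length 1, Utf8Size.Of returns 2  (e with acute accent)
// "\u20ac"       Length 1, Utf8Size.Of returns 3  (euro sign)
// "\ud83d\ude00" Length 2, Utf8Size.Of returns 4  (emoji stored as a surrogate pair)
```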

Leave yourself a few KB of leeway below the limit to account for headers and other overhead, and you should be good.
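
The approach described in this answer could be sketched roughly as follows. JsonBatcher and Batch are hypothetical names, the serialize delegate stands in for whatever serializer you use (for example item => JsonConvert.SerializeObject(item)), and maxBytes is assumed to already include the leeway.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class JsonBatcher
{
    // Yields JSON array strings whose UTF-8 size stays at or below maxBytes.
    // A single item larger than maxBytes is still emitted, alone in its own batch.
    public static IEnumerable<string> Batch<T>(
        IEnumerable<T> source, Func<T, string> serialize, int maxBytes)
    {
        var builder = new StringBuilder("[");
        int bytes = 2;  // the opening "[" plus the closing "]" added at the end
        int count = 0;  // items in the current batch

        foreach (T item in source)
        {
            string json = serialize(item);
            int itemBytes = Encoding.UTF8.GetByteCount(json);
            int separator = count > 0 ? 1 : 0;  // comma before every item but the first

            // Close the current batch if this item would push it past the limit.
            if (count > 0 && bytes + separator + itemBytes > maxBytes)
            {
                yield return builder.Append(']').ToString();
                builder.Clear().Append('[');
                bytes = 2;
                count = 0;
                separator = 0;
            }

            if (separator == 1) builder.Append(',');
            builder.Append(json);
            bytes += separator + itemBytes;
            count++;
        }

        if (count > 0)  // skip the final batch when the source was empty
            yield return builder.Append(']').ToString();
    }
}
```

Each yielded string can be passed to StringContent as in the question. With a 6 MB service limit, a maxBytes of about 5 * 1024 * 1024 leaves generous room.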

2024-06-28
TToni