Question
Polly based retry logs intermittently not showing up in App Insights for Windows Azure Function
We have a Windows Azure function where we have Polly implemented as -
public async Task<T> ExecuteTaskWithRetryAsync<T>(int maxRetryAttempts, int delayBetweenFirstRetryInSeconds, string invocationIdentifier, Func<Task<T>> function)
{
logger.LogInformation($"Executing {invocationIdentifier} with retries.");
var retryPolicy = Policy
.Handle<HttpRequestException>(ex => ex.StatusCode != HttpStatusCode.BadRequest && ex.StatusCode != HttpStatusCode.Unauthorized)
.WaitAndRetryAsync(
maxRetryAttempts,
retryAttempt => TimeSpan.FromSeconds(delayBetweenFirstRetryInSeconds * Math.Pow(2, retryAttempt - 1)),
(exception, timeSpan, retryCount, context) =>
{
logger.LogWarning("Execution of {identifier} failed. Waiting {sec} seconds before retry. Attempt number {retryCount}. Exception - {exceptionMessage}", invocationIdentifier, timeSpan.Seconds, retryCount, exception.Message);
});
return await retryPolicy.ExecuteAsync<T>(async () =>
{
T result = await function();
logger.LogInformation($"Successfully executed {invocationIdentifier}");
return result;
});
And being used as
var response = await retryService.ExecuteTaskWithRetryAsync(retryOptions.MaxRetryAttempts, retryOptions.DelayBetweenRetriesInSeconds, nameof(GetFromMetaRPAsync), async () =>
{
return await httpClient.GetAsync(endpoint).ConfigureAwait(false);
});
if (!response.IsSuccessStatusCode)
{
throw new HttpRequestException($"Failed to retrieve any results. Response with status code {response?.StatusCode} - {response?.ReasonPhrase}.");
}
But in App Insights, we can only see the initial log
"Executing {invocationIdentifier} with retries"
And then the HttpRequestException
is thrown with status code 504 (GatewayTimeout).
I can see an open issue which talks about Polly retries not being logged as separate dependencies, but they are able to see the retry log. We are not even able to see the retry logs of
"Waiting {sec} seconds before retry."
when there's an issue on the server, such as BadGateway, GatewayTimeout, etc.
However, we are able to intermittently see this log line
"Waiting {sec} seconds before retry... Exception - There was error sending the request".
In cases where the retries are not being captured, we are getting the response of the call after approx. 20 seconds. Is it possible that the retries are happening internally but are just not being captured by App Insights? Shouldn't the retry logs still be captured, even if dependencies are not? How can we verify that the retries are actually taking place or not?
We have run the code in our local and also have unit tests in place, and in both cases, we can see the retries taking place. Can the behaviour be different on the cloud?