Forge async events function time out error

We are developing a Forge app that syncs CMDB data into Insight, and we are using async events to do so. During testing, we push 200-300 events per minute, which is below the 500 events per minute stated in the platform limits doc. Each event makes about 5 API calls, which complete within the 25-second limit. We haven’t added any delay, so these events are executed in parallel.
Most of our events are being retried with the reason FUNCTION_TIME_OUT. Is this the expected behavior?
What steps can we take to avoid such retries?

That’s very curious. If everything completes within 25 seconds, I can only guess that the 5 API calls are hitting rate limits. Is that possible?

It’s expected that a function will only send the FUNCTION_TIME_OUT message when the lambda exceeds 25 seconds (barring very very specific circumstances which I don’t believe you are in).

We are not facing rate limit errors from any API. We even tried calling only one API per event, but many events were still being retried with the reason FUNCTION_TIME_OUT.

I’ll have a chat with the team responsible. They might know, or at least provide further debugging tips.

Do you know if the functions time out before, during or after the API calls? If you add some logging statements, you can pinpoint where the problem happens.
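
To illustrate the logging suggestion, here is a minimal TypeScript sketch. `traced` is a hypothetical helper, not part of Forge: it logs when a step starts and when it ends, so a `start` line with no matching `done` or `failed` line points at the step that was still running when the function timed out. The step names and the `log` parameter are assumptions for illustration.

```typescript
// Hypothetical helper (not a Forge API): logs the start and end of one step.
type Log = (line: string) => void;

async function traced<T>(
  step: string,
  work: () => Promise<T>,
  log: Log = console.log,
): Promise<T> {
  const start = Date.now();
  // Logged before the work begins, so it is emitted even if the function
  // is terminated while the step is in flight.
  log(`[${step}] start`);
  try {
    const result = await work();
    log(`[${step}] done in ${Date.now() - start} ms`);
    return result;
  } catch (err) {
    log(`[${step}] failed after ${Date.now() - start} ms: ${String(err)}`);
    throw err;
  }
}
```

Inside the event handler, each API call could then be wrapped as `await traced("api-call-1", () => someApiCall())`, where `someApiCall` stands in for whatever request the app actually makes.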

We added logs to identify where the problem happens and observed that the API is called successfully for the initial 60 to 70 events. If we push more events after that, the function timeout retries occur.
If we keep the number of events under 50 per minute, we do not face any issues. But as per the documentation, we should be able to push 500 events per minute, each having a runtime of up to 25 seconds.
Let us know if our understanding of the limits is correct or not. We will further investigate and share our findings.

Thanks.
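
Since keeping the rate under 50 events per minute avoided the retries, one possible stopgap is to spread the pushes out instead of sending them all at once. The sketch below is only an illustration: `staggerDelays` is a hypothetical helper that computes a per-event delay in seconds. It assumes the delay would then be passed to the queue's push call (Forge's async events API documents a `delayInSeconds` option; check the current docs for its upper bound before relying on it).

```typescript
// Hypothetical helper: returns one delay (in seconds) per event so that the
// events are spaced evenly at roughly `maxPerMinute` events per minute.
function staggerDelays(eventCount: number, maxPerMinute: number): number[] {
  if (maxPerMinute <= 0) {
    throw new Error("maxPerMinute must be positive");
  }
  const delays: number[] = [];
  for (let i = 0; i < eventCount; i++) {
    // Event i becomes due i * (60 / maxPerMinute) seconds after the first one,
    // rounded down to whole seconds.
    delays.push(Math.floor((i * 60) / maxPerMinute));
  }
  return delays;
}
```

With `maxPerMinute` set to 50, the 51st event (index 50) gets a delay of 60 seconds and the 101st gets 120 seconds.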

According to my investigation, it looks like you are getting timeouts because the endpoint you are calling starts responding with a 503 status code, which prevents your function from completing.
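
If the 503 responses are the cause, one way to keep the function from running into the 25-second limit is to give each call a time budget and stop retrying before that budget runs out. The following is a minimal sketch under stated assumptions, not a description of how Forge or the endpoint behaves: `callWithinBudget` and `withTimeout` are hypothetical helpers, `call` stands in for whatever request the app makes, and the attempt count and backoff values are placeholders. Note that `withTimeout` only stops waiting for the response; it does not cancel the underlying request.

```typescript
// Minimal response shape so the sketch does not depend on a specific HTTP client.
type HttpResponse = { status: number };

// Hypothetical helper: rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Hypothetical helper: retries on 503 with exponential backoff, but never
// waits past `budgetMs` in total. Returns the last response it received.
async function callWithinBudget(
  call: () => Promise<HttpResponse>,
  budgetMs: number,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<HttpResponse> {
  const deadline = Date.now() + budgetMs;
  let last: HttpResponse | undefined;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const remaining = deadline - Date.now();
    if (remaining <= 0) break;
    const response = await withTimeout(call(), remaining);
    if (response.status !== 503) return response;
    last = response;
    // Back off before the next attempt, unless that would exceed the budget.
    const backoff = baseDelayMs * Math.pow(2, attempt);
    if (Date.now() + backoff >= deadline) break;
    await new Promise<void>((resolve) => setTimeout(resolve, backoff));
  }
  if (last === undefined) {
    throw new Error("time budget exhausted before any response");
  }
  return last;
}
```

A budget comfortably below 25 seconds (for example 20000 ms across all calls in one event) leaves room to log the failure and return, so the event fails with a visible 503 rather than a FUNCTION_TIME_OUT retry.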