Questions about Forge limits - logs and memory

  • Log lines per invocation: 100
    • Per invocation of what? A function? For example, if a trigger calls function 1, then this puts items on the queue processed by function 2, is each invocation of function 2 given 100 lines?
    • What is a line? Is it a row with a date when viewing the Logs UI in the dev console? What does it mean if I open one of those rows and it consists of multiple lines?
    • What happens when this is breached?
    • Why is it this low? When troubleshooting issues it’s forcing me to comment out portions of standard logging just to reach specific parts of the codebase. I think it is also really easy to fall afoul of this limit and have it seem like the function is “dying”. If it has to be this low, then please at least add a message saying “This function hit the invocation log line limit” or something.
  • What happens if limits like memory or execution time are breached? Is anything put in the logs?
  • The new native Node.js runtime has 512MB of memory available per invocation.
    • Is this RSS, heap total, heap used, or external memory as defined in process.memoryUsage()?
    • Is there a supported way of profiling the code? Maybe with forge tunnel?
    • What is the expected memory utilization of a freshly started function before we have done anything?
      • Self answer: a clean Node process with only node-fetch is ~48 MB RSS.
    • With the new runtime, if I have a const that lives within a resolver, it seems the function may still be holding this memory on the next run. The code below shows memory usage varying from 63.71 MB to 450.72 MB RSS with identical inputs, depending on what I assume is warm lambda state. The screenshot below also shows that nothing is stored in the global scope:
resolver.define("processHar", async ({ payload, context }) => {
    console.log('processHar.js consumer function invoked');

    const {issueIdOrKey, fileName, attachmentId} = payload;

    try {

        logMemory();

        // ... rest of the handler elided in the original post ...
    } catch (e) {
        console.error(e);
    }
});

Thanks for the questions, they’ll definitely help us to improve the documentation 🙂

Is this RSS, heap total, (…)

The limit is applied to the container running the function. Since you can start external processes on the Node.js runtime, all of them count toward the limit (though any memory they share is counted once).

profiling the code

We don’t have any guidance at the moment. You can profile with forge tunnel but for best results make sure your own environment matches the one used by Forge functions (current version of Amazon Linux 2, x86, Node.js 18).
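In the absence of an attachable profiler, one workaround (not official Forge guidance) is to sample memory from inside the function on a timer and log the peak observed value; `startMemorySampler` and its parameters are hypothetical names for illustration:

```javascript
// Hedged sketch: sample heapUsed on an interval while the function runs,
// then report the peak when stopped. Useful when DevTools can't attach.
function startMemorySampler(intervalMs = 250) {
  const samples = [];
  const timer = setInterval(() => {
    samples.push(process.memoryUsage().heapUsed);
  }, intervalMs);
  // Returns a stop function that clears the timer and logs the peak.
  return function stop() {
    clearInterval(timer);
    const peakMb = Math.max(0, ...samples) / 1048576;
    console.log(`peak heapUsed over ${samples.length} samples: ${peakMb.toFixed(1)} MB`);
    return samples;
  };
}
```

Usage would be calling `startMemorySampler()` at the top of the resolver and invoking the returned stop function before returning.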

If I have a const within a resolver… function may still be using this memory on the next run

This is expected because of warm starts. Garbage collector will eventually kick in and free up unused memory, but if you, for example, append to a global variable on every invocation, the memory will eventually be exhausted.
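A small sketch of the warm-start behavior described above: values reachable from module scope survive between invocations, while handler-scoped values become collectable when the call returns (`requestLog` and `handler` are hypothetical names):

```javascript
// Module scope: survives warm starts, shared across invocations.
const requestLog = [];

function handler(payload) {
  // Handler-scoped allocation: eligible for GC as soon as the call returns.
  const scratch = new Array(1000).fill(payload);
  // Module-scoped accumulation: grows on EVERY warm invocation, so this
  // pattern will eventually exhaust memory.
  requestLog.push(payload);
  return { scratchLength: scratch.length, retained: requestLog.length };
}
```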

For the rest of the questions, I’ve talked to the team responsible for log handling, and someone from their team will comment as well.


Thanks for the questions!
Please find the answers below:

Question: Per invocation of what? A function? For example, if a trigger calls function 1, then this puts items on the queue processed by function 2, is each invocation of function 2 given 100 lines?
Ans: Yes. Each function invocation is given 100 log lines.

Question: What is a log line?
Ans: A log line corresponds to a single console.log() call. You can write 100 console.log() calls in one function invocation.

Question: What does it mean if I open one of those rows and it consists of multiple lines?
Ans: In that case the multi-line output is treated as a single object, i.e. the output of one console.log(object) call.

Question: What happens when this is breached?
Ans: We enforce limits that prevent further logs from being ingested once the limit is reached.
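Given the answers above (a 100-line budget per invocation, with further lines dropped), one client-side workaround is a thin wrapper that announces when the budget is hit, along the lines the original poster requested. This is a sketch under the assumptions in this thread; `limitedLog` and the limit constant are not a Forge API:

```javascript
// Assumed per-invocation limit, taken from this thread, not from an API.
const LOG_LINE_LIMIT = 100;
let linesUsed = 0;

function limitedLog(...args) {
  linesUsed += 1;
  if (linesUsed < LOG_LINE_LIMIT) {
    console.log(...args);
  } else if (linesUsed === LOG_LINE_LIMIT) {
    // Spend the final line on an explicit marker instead of silence.
    console.log('WARNING: invocation hit the log line limit; further lines suppressed');
  }
  return linesUsed;
}
```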

Question: What happens if limits like memory or execution time are breached? Is anything put in the logs?
Ans: No, nothing is added to the logs today; this is on the roadmap. However, the metrics will show the error status code when memory or timeout breaches happen.

Hope this was helpful!


Let’s assume we’re not spawning anything external and have a single JS function. For those of us operating near the limit, how should we be monitoring this? process.memoryUsage().rss seems to be the best thing I have so far.
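A hedged sketch of that monitoring approach, comparing `process.memoryUsage().rss` against the documented 512 MB limit (the 80% warning threshold is an arbitrary choice, not a Forge recommendation):

```javascript
// Documented Forge limit for the native Node.js runtime.
const MEMORY_LIMIT_MB = 512;

function rssPercentOfLimit(warnAtPct = 80) {
  const rssMb = process.memoryUsage().rss / 1048576;
  const pct = (rssMb / MEMORY_LIMIT_MB) * 100;
  if (pct >= warnAtPct) {
    console.warn(`RSS ${rssMb.toFixed(1)} MB is ${pct.toFixed(0)}% of the ${MEMORY_LIMIT_MB} MB limit`);
  }
  return pct;
}
```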

Can you share any guidance on how we can do this with the new runtime? I am not sure how to hook up Chrome Devtools or another profiler to Forge.

Sure, but suppose I have a memory-intensive task (in my case, loading JSON into memory) and I do my best to isolate it into its own function so that I have as much memory headroom as possible. The result is that when the function starts, how much memory I can actually use before the process dies is unpredictable. I essentially have to hope that the GC runs quickly enough as the object is loaded into memory. Techniques like:

import { setFlagsFromString } from 'v8';
import { runInNewContext } from 'vm';

// Enable the normally CLI-only --expose-gc flag at runtime
setFlagsFromString('--expose_gc');
// Grab a reference to the now-exposed gc() from a fresh VM context
const gc = runInNewContext('gc');
gc(); // force a collection

Don’t seem to help here.

Has this been tested? Most of this thread and my exploration into limits has come from functions going quiet in logs and my not knowing why. My frequent dev experience has been:

  • Build things in tunnel and everything mostly works
  • Deploy to dev and things are broken
  • Go to dev console logs and see functions missing log entries mid-function (working theory: log lines are prioritized based on call stack depth) or just going silent (current working theory: memory causes this one)

Has this been tested? Most of this thread and my exploration into limits has come from functions going quiet in logs and my not knowing why

Metrics will be emitted whether a function completes successfully or not. Metrics and logs are emitted at different points in the execution lifecycle. We have functionality in place to allow you to filter metrics by the type of error. If you aren’t seeing metrics when timeouts or OOM errors are happening, please raise a ticket with the app and environment IDs and we’ll take a closer look.

[FRGE-1326] - Ecosystem Jira - can be used to track the logs issue @BPB


I guess the challenge for me is that I can’t determine when timeouts or OOM errors are happening, since I don’t see anything in metrics or logs. I am assuming that one of these two is the cause.

Anyways, I guess I gotta write a PoC for flooding logs and then causing an OOM and see what it will do.

Hi @BPB,

I just created a naive app that generates an OOM on render and deployed that to my test site. I’m sharing a few screenshots with you in case that helps. The app just allocates data in a loop until it fails.
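For reference, here is a hedged reconstruction of such a naive OOM reproducer (the actual app isn’t shown; `allocateUntil` is a hypothetical name). Keeping references in an array prevents collection, so an unbounded call eventually exceeds the container limit:

```javascript
// Allocate fixed-size chunks in a loop, keeping references so nothing is
// collectable. With no bound, the loop runs until the invocation dies OOM.
function allocateUntil(maxChunks) {
  const hoard = [];
  while (hoard.length < maxChunks) {
    // ~8 MB of pointers per chunk (1M slots referencing the same string).
    hoard.push(new Array(1_000_000).fill('x'));
  }
  return hoard.length;
}

// allocateUntil(Infinity) would reproduce the OOM on render.
```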


Hope this helps!


Interesting to see a wildly different experience than mine. Can you try doing it in an async function? Also, maybe there is an edge case with what I am doing with JSON files? Can you try having the code fetch a large JSON file, say a 150 MB HAR file? I can reliably get it to fail (the logs go dead and there are no metrics errors) with ~115 MB.

I dug into this again and I can 100% consistently reproduce nothing in the logs when I OOM my async process.