Forge storage problems and limitations regarding filters and queries

Using Forge storage has been a nightmare.

Recently, I migrated to @forge/kvs to stay up to date with the latest improvements, but I ran into an issue with filters.
Previously, I was able to chain multiple filters like this:

// Legacy custom entities API: filters chained directly onto the query
let query = storage.entity('someEntity').query().index('someField');
query = query.andFilter('someField', FilterConditions.equalTo('someValue'));

Now, I can’t find a way to achieve the same thing with the new approach. The documentation claims that filters can be chained, but I haven’t been able to get it to work. Could you provide a concrete example of how to do this?

Additionally, the inability to use combined AND/OR conditions in queries is a massive limitation. I need to be able to construct a query like this:

SELECT ITEM WHERE KEY IN ('key1', 'key2') AND FIELD_X = 'value1' AND FIELD_Y = 'value2'

Right now, Forge storage doesn’t seem to support this, making complex queries unnecessarily difficult. Are there any workarounds or planned improvements to address this limitation?


Hi @HristoBurgazliev,

You can still chain filters. The difference is that you now build them on a Filter instance, which you can assign to a variable first and then pass to the query whenever you want:

import { kvs, Filter, FilterConditions } from "@forge/kvs";

// Build the filter separately, then pass it to the query
const filterByFields = new Filter<SomeEntity>()
  .and("fieldX", FilterConditions.equalTo("value1"))
  .and("fieldY", FilterConditions.equalTo("value2"));

const items = await kvs.entity<SomeEntity>("someEntity")
  .query()
  .index("someIndex")
  .filters(filterByFields)
  .getMany();

To your second point, you’re right: we don’t support combining “or” and “and” conditions due to limitations of the underlying database.

One possible workaround, if you often query on field_x and field_y, is to create a custom index whose partition is defined by those fields, with “key” as the range.

storage:
  entities:
    - name: someEntity
      attributes:
        key:
          type: string
        field_x:
          type: string
        field_y:
          type: string
      indexes:
        - name: field_x-and-field_y-by-key
          partition:
            - field_x
            - field_y
          range:
            - key

You can then query this index like this:

const itemsForKey1 = await kvs.entity<SomeEntity>("someEntity")
  .query()
  .index("field_x-and-field_y-by-key", {
    partition: ['value1', 'value2']
  })
  .where(WhereConditions.equalTo("key1"))
  .getMany();

Unfortunately, you would still have to perform two queries (one for key1 and another for key2), as we don’t support “In” conditions.
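For example, here’s a rough sketch of running both queries in parallel and merging the results (this assumes getMany() resolves to a page object exposing its matches on a results array; adjust to the actual response shape):

// Sketch: one query per key, merged client-side, since "In" is not supported
const keys = ["key1", "key2"];
const pages = await Promise.all(
  keys.map((key) =>
    kvs.entity<SomeEntity>("someEntity")
      .query()
      .index("field_x-and-field_y-by-key", { partition: ["value1", "value2"] })
      .where(WhereConditions.equalTo(key))
      .getMany()
  )
);
// Assumption: each page exposes its matches on a results array
const items = pages.flatMap((page) => page.results);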

Please let me know if that helps.


Thank you for your reply, @RodolfoCandido! This works, but I’ve found that in order to dynamically append AND/OR conditions to a filter instance, you must append at least one filter when initializing; otherwise, additional filters won’t chain properly.

For example, I want to optionally append a name filter:

if (queryFilters.name) {
    filter.and('name', FilterConditions.contains(queryFilters.name));
}

However, to make this work, I need to initialize the filter with a “useless” condition, or else chaining doesn’t work later:

const filter = new Filter().and('reportId', FilterConditions.beginsWith('report_'));

This feels like an unnecessary workaround compared to the legacy way of chaining filters.

Regarding the combination of AND and OR filters, I hope Forge SQL will eventually solve these issues - whenever it becomes available.

One workaround I found is to keep track of all entity IDs and chain multiple “not equal to” filters to indirectly match the entities I need, while still being able to use other AND filters.
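Roughly, it looks like this (a sketch; it assumes FilterConditions exposes a notEqualTo condition and that my app tracks the full set of entity IDs itself):

// Sketch: exclude every ID we don't want, so only the wanted entities match.
// allKnownIds and wantedIds are assumed to be tracked by the app itself.
const wantedIds = new Set(['id1', 'id2']);
let filter = new Filter<SomeEntity>();
for (const id of allKnownIds) {
  if (!wantedIds.has(id)) {
    filter = filter.and('id', FilterConditions.notEqualTo(id));
  }
}
// Other AND conditions can still be appended to the same filter afterwards.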

However, this approach introduces an absurd amount of unnecessary logic, contributing to the overall frustrating developer experience with Forge. This shouldn’t be the expected way to handle filtering.

Are there any plans to improve this in future releases?

Hi @HristoBurgazliev,

I noticed that we don’t export AndFilter and OrFilter directly from the package, so we will release a patch version to fix that. But you don’t have to create this “useless” condition; you can just keep the filter in a variable and use it later with the query. Providing an empty filter will not cause an error.

import { kvs, Filter, FilterConditions } from "@forge/kvs";

// AndFilter is the type returned by .and(); it will be exported directly once the patch lands
let baseFilter: AndFilter<Employee> = new Filter<Employee>();

if (Math.random() > 0.5) {
  // Conditions can be appended dynamically; reassign, since .and() returns the filter
  baseFilter = baseFilter.and("surname", FilterConditions.equalTo("bob"));
}

const items = await kvs.entity<Employee>("employee")
  .query()
  .index("surname")
  .filters(baseFilter)
  .getMany();

Unfortunately, we don’t have anything on our immediate roadmap to address this limitation with filters. I will bring attention to this requirement internally and see what we can do. Just a reminder that this filtering only happens after the initial index-based matching and exists to reduce network bandwidth, so technically it’s the same as running a filtering function at the application level. See the “info” highlighted section in this part of the docs.
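In other words, it’s conceptually equivalent to filtering the results yourself after the query (a sketch; the response shape here is an assumption):

// Sketch: application-level filtering, conceptually what the filter does in-memory.
// Assumes getMany() resolves to { results: [{ key, value }] }.
const page = await kvs.entity<Employee>("employee")
  .query()
  .index("surname")
  .getMany();
const filtered = page.results.filter(({ value }) => value.surname === "bob");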

Thank you for your reply, @RodolfoCandido!

Your explanation makes sense, and I appreciate the workaround—but the fact that filters are applied only locally raises a few important questions.

You mentioned the intent is to reduce network bandwidth, but doesn’t this mean that more data is still being pulled initially than what ends up being used? For example, if I have 10 items in storage and only 2 match the filters, are all 10 items still being retrieved before the filters are applied?

If that’s the case, does the total size of all retrieved items count toward Forge’s storage quotas and limits, even if I only need a small subset of them? That would make the efficiency argument a bit questionable, especially for larger datasets.

Overall, I still find Forge storage to be surprisingly limited for real-world use cases. I truly hope Forge SQL or future improvements to KVS can address these gaps.

Thanks again for raising this internally!

Apologies for the delay @HristoBurgazliev.

The filter mechanism works the same way as FilterExpressions in DynamoDB: filtering only happens after matching a particular partition for a given index, and the filter is then applied in-memory.

In your example, because the base query would have matched 10 items, that would be the read capacity utilised for the query, and it would count towards your quotas. However, the filter just reduces the amount of data transferred to you, since you would discard that information anyway.

For KVS storage, we highly recommend defining indexes based on your access patterns, reducing the need to use filters given their limitations.