com.atlassian.cache.memory.MemoryCacheManager does not release references?

Hi! We are analyzing our plugin for possible memory leaks and have found one of the causes of duplicated class objects from our packages. Those classes are business-logic components.
The memory leak is reproducible with JIRA 7.3, but not with JIRA 7.0.2.
We think the leak appears because com.atlassian.cache.memory.MemoryCacheManager keeps a reference to ProjectMappingManagerImpl.MappingCacheLoader (one of our classes) after the plugin is disabled. As a result, the full set of the plugin’s objects is duplicated each time another version of the plugin is installed on JIRA 7.3.

  1. Has anyone ever run into a similar case?
  2. Is it a bug in com.atlassian.cache.memory.MemoryCacheManager?
  3. If it is a bug in com.atlassian.cache.memory.MemoryCacheManager, what is the workaround for now?
    • Should we reset or remove some entries from it explicitly on the plugin’s disable event?
    • Or should we explicitly “nullify” references to our classes?

How are you retrieving the MemoryCacheManager? Is it plain injection or something else? Also, since you mentioned the issue of interacting with other add-ons: are you using OSGi declarations? Is MemoryCacheManager optional there?

Yes, it’s plain injection. The ProjectMappingManagerImpl constructor takes a CacheManager cacheFactory (com.atlassian.cache.CacheManager) as an input parameter (see below).

import com.atlassian.cache.Cache;
import com.atlassian.cache.CacheManager;
import com.atlassian.cache.CacheSettingsBuilder;
import com.atlassian.jira.project.ProjectManager;

import java.util.Optional;
import java.util.Set;
import java.util.concurrent.TimeUnit;

public class ProjectMappingManagerImpl implements ProjectMappingManager {

    private final ProjectMappingDao dao;
    private final ProjectManager projectManager;
    private final Cache<Set<Long>, Optional<Set<Integer>>> cache;

    public ProjectMappingManagerImpl(ProjectManager projectManager,
                                     CacheManager cacheFactory,
                                     ProjectMappingDao dao) {
        this.projectManager = projectManager;
        this.dao = dao;
        this.cache = cacheFactory.getCache(ProjectMappingManagerImpl.class.getName() + ".cache",
                new MappingCacheLoader(),
                new CacheSettingsBuilder()
                        .expireAfterAccess(7, TimeUnit.DAYS)
                        .flushable()
                        .build());
    }

“Also since you mentioned the issue of interacting with other add-ons - are you using osgi declarations? Is MemoryCacheManager optional there?”
What is an OSGi declaration?
We declare ProjectMappingManagerImpl in atlassian-plugin.xml:

    <component key="projectMappingManager" name="projectMappingManager" class="com.myplagin.ProjectMappingManagerImpl" public="false">
        <interface>com.myplagin.ProjectMappingManager</interface>
    </component>

but atlassian-plugin.xml doesn’t contain lines like

<component-import key="cacheManager" interface="com.atlassian.cache.CacheManager" />
<component-import key="projectManager" interface="com.atlassian.jira.project.ProjectManager" />

Should we add them?

Also since you mentioned the issue of interacting with other add-ons

Sorry for the confusing description. It is all about one and the same add-on. I was just describing the scenario of updating it from one version to another (so plugin events occur, such as disable plugin, enable plugin, and so on).

Ahhh you’re using the <component-import/> stuff so never mind about that part.

That all looks good. Seven days is a really long cache expiry, though, so that might explain why you’re seeing more than other folks.

That said, there look to be a couple of memory-related bugs in the issue tracker; [CACHE-210] - Ecosystem Jira in particular.

Maybe @cfuller can shed some light if it’s the same thing?

You rang? :wink: Yes, I think I can help here.

Short answer: Since you aren’t using a CacheLoader, you are getting the exact same cache back every time, and it will still contain objects from your previous lifecycle. This difference is perhaps not called out as clearly as it could be, but the CacheFactory JavaDocs do talk about it. There is no way to make such a cache behave correctly in JIRA Data Center, so we strongly encourage you to use the lazy-loading approach instead. If for some reason you can’t do that, the only alternative is to clear the cache yourself when your plugin starts or stops.
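
To make that concrete, here is a rough sketch of the two acquisition styles (the cache names and the loadFromDatabase helper are made up for illustration, and the exact generics of the no-loader getCache overload may differ between atlassian-cache versions):

import com.atlassian.cache.Cache;
import com.atlassian.cache.CacheLoader;
import com.atlassian.cache.CacheManager;
import com.atlassian.cache.CacheSettingsBuilder;

import java.util.concurrent.TimeUnit;

public class CacheAcquisitionSketch {

    // Preferred: supply the CacheLoader when the cache is created. The library can then
    // hold the cache (and the loader inside it) weakly, so both can be reclaimed when
    // the plugin is unloaded.
    static Cache<String, String> lazyLoadingCache(CacheManager cacheManager) {
        return cacheManager.getCache("com.example.myplugin:lazy-cache", // hypothetical name
                new CacheLoader<String, String>() {
                    @Override
                    public String load(String key) {
                        return loadFromDatabase(key); // hypothetical database read
                    }
                },
                new CacheSettingsBuilder()
                        .expireAfterAccess(30, TimeUnit.MINUTES)
                        .build());
    }

    // Legacy style: no loader at creation time. The cache is held via a strong reference,
    // so it (and its contents) survive a plugin unload; clearing it explicitly is the
    // only workaround.
    static Cache<?, ?> legacyCache(CacheManager cacheManager) {
        Cache<?, ?> cache = cacheManager.getCache("com.example.myplugin:legacy-cache"); // hypothetical name
        cache.removeAll(); // drop anything left over from a previous plugin lifecycle
        return cache;
    }

    private static String loadFromDatabase(String key) {
        return "value-for-" + key;
    }
}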

Longer answer:

To understand why these behave so differently requires a bit of a history lesson. The atlassian-cache library already existed in both JIRA and Confluence when we began working on JIRA Data Center in the 6.2 development cycle. JIRA had never made much use of it, preferring Guava caches throughout its code, while Confluence used atlassian-cache heavily; but the library didn’t at that stage have any of the lazy-load patterns that we determined were going to be the safest approach for caches to use in the Data Center offering. The main reason for preferring lazy-load caches is that their correctness is much easier to reason about – you don’t have to worry about two nodes making changes to the same thing at once and what state the cache ends up in as a result, because all caches treat the database as the final authority and the only information you replicate is an invalidation message – the nodes reload from the database, and redundant invalidations don’t cause any permanent harm.
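
As a rough illustration of that write path (ProjectMappingDao.saveMapping and the key/value types here are stand-ins, not the plugin’s actual code): update the database first, then replicate nothing but the invalidation.

import com.atlassian.cache.Cache;

import java.util.Optional;
import java.util.Set;

class MappingUpdater {

    private final ProjectMappingDao dao;
    private final Cache<Set<Long>, Optional<Set<Integer>>> cache;

    MappingUpdater(ProjectMappingDao dao, Cache<Set<Long>, Optional<Set<Integer>>> cache) {
        this.dao = dao;
        this.cache = cache;
    }

    void updateMapping(Set<Long> projectIds, Set<Integer> mappedValues) {
        dao.saveMapping(projectIds, mappedValues); // the database is the final authority
        cache.remove(projectIds);                  // only this invalidation is replicated; other
                                                   // nodes lazily reload from the database via the loader
    }

    interface ProjectMappingDao {
        void saveMapping(Set<Long> projectIds, Set<Integer> mappedValues); // hypothetical DAO method
    }
}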

But caches from plugins cause a couple of ClassLoader issues that we have to contend with:

  1. The CacheLoader that the cache uses for lazy-loading is provided by the plugin. This has to be behind a WeakReference or it will prevent the cache data from being reclaimed.
  2. The Cache itself contains data classes that were likely provided by the plugin. This should be behind a WeakReference as well so that the cache data can be freed when the plugin is unloaded. Unfortunately…

JIRA saw that this was going to be a problem for non-lazy-load caches too and initially made all of the caches behave this way, where the cache data itself was always behind a weak reference so that it could be reclaimed when the plugin was unloaded. But this change of behaviour was too much for Confluence to accept. They had the (terrible, in my opinion) pattern of calling cacheFactory.getCache("foo") multiple times throughout the classes that were using them instead of calling it once and holding onto it. Putting the cache behind a WeakReference broke this access pattern because the cache contents kept disappearing on them, and they weren’t willing to change all the access points in Confluence to stop doing this (plus, there were compatibility concerns for third-party Confluence plugins).

So we had to put it back the way that it was. On the new APIs with a CacheLoader, we did the right thing, but on the old ones that don’t take a loader, the cache is held via a strong reference, so it survives a plugin unload.

Summary:

  1. If you can, switch to using a CacheLoader. That is the safest model to use and all of these concerns are addressed automatically.
  2. If you can’t do that, then explicitly clear the cache immediately after you obtain it and again as you shut down, to guarantee that it isn’t polluted with data from before your plugin loaded and doesn’t stick around indefinitely (there’s a rough sketch of this below). This isn’t ideal for a clustered environment, as it will negatively impact other nodes, but it’s better than doing nothing.
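
For option 2, a rough sketch (assuming the component is a Spring-managed <component>, so DisposableBean.destroy() runs when the plugin is disabled; the cache name is made up):

import com.atlassian.cache.Cache;
import com.atlassian.cache.CacheManager;
import org.springframework.beans.factory.DisposableBean;

public class LoaderlessCacheHolder implements DisposableBean {

    private final Cache<?, ?> cache;

    public LoaderlessCacheHolder(CacheManager cacheManager) {
        this.cache = cacheManager.getCache("com.example.myplugin:loaderless-cache"); // no CacheLoader supplied
        this.cache.removeAll(); // clear anything left behind by a previous plugin lifecycle
    }

    @Override
    public void destroy() {
        cache.removeAll(); // don't leave stale entries behind for the next lifecycle
    }
}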

Hi, thanks for the explanation and the details you shared.

…But we already use a CacheLoader implementation in our code; see the snippet above. MappingCacheLoader is our implementation of CacheLoader:

    class MappingCacheLoader implements CacheLoader<Set<Long>, Optional<Set<Integer>>> {
        // ...
    }
  1. You mean com.atlassian.cache.CacheLoader, right?
  2. Could you advise whether we should check our MappingCacheLoader, or whether there might be other causes?

Thanks in advance.

You mean com.atlassian.cache.CacheLoader, right?

Yes and no. I do mean com.atlassian.cache.CacheLoader, but that’s just an interface and @PublicSpi. I mean your plugin’s implementation of it, as well. The library will hold onto the CacheLoader you give it and thus onto the plugin’s classes, but only via the cache, which itself is held by a WeakReference. As long as you specified the CacheLoader at the time you created the cache, you are fine. If you were instead passing it as a Supplier on calls to get(K, Supplier<? extends V>), then that wouldn’t count. It’s the cache’s construction that determines whether it will be held weakly or not.
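
A rough sketch of that distinction (the cache names and the loadFromDb helper are made up):

import com.atlassian.cache.Cache;
import com.atlassian.cache.CacheManager;
import com.atlassian.cache.CacheSettingsBuilder;

import java.util.concurrent.TimeUnit;

class LoaderPlacementSketch {

    // Counts: the CacheLoader is fixed at cache construction, so the cache that holds it
    // is itself behind a WeakReference.
    static Cache<Long, String> creationTimeLoader(CacheManager cacheManager) {
        return cacheManager.getCache("com.example.myplugin:with-loader", // hypothetical name
                (Long id) -> loadFromDb(id),
                new CacheSettingsBuilder().expireAfterAccess(1, TimeUnit.HOURS).build());
    }

    // Doesn't count: the Supplier is only passed per call, so the cache itself was created
    // without a loader and is held strongly across plugin lifecycles.
    static String perCallSupplier(Cache<Long, String> cache, long id) {
        return cache.get(id, () -> loadFromDb(id));
    }

    private static String loadFromDb(long id) {
        return "value-for-" + id; // stand-in for a real database read
    }
}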

The only other thing I can think of is if you are using a listener. I’m not as familiar with that area of atlassian-cache, because JIRA never used them. I would avoid them entirely.

Finally, if you are just trying to hunt down why you have duplicate classes and haven’t actually seen any problems, it’s worth noting that WeakReference does not provide strong guarantees about exactly how quickly it gets cleared. Depending on which garbage collector is being used and just how many WeakReferences are passed through on the way to your plugin’s ClassLoader, it may take multiple Full GC cycles for the plugin’s ClassLoader and its corresponding classes to really get collected.