[openib-general] Re: Question about pinning memory

Pete Wyckoff
Wed Jul 27 07:29:14 PDT 2005


rolandd at cisco.com wrote on Mon, 25 Jul 2005 16:36 -0700:
> Hmm, thinking about this some more, it occurs to me that it might be
> possible for the kernel and userspace to work together even more
> efficiently.  Does the following make sense?
> 
> When memory is registered, userspace tells the kernel VM activity
> monitor where its registration cache data structure for that
> registration is.  If the kernel detects some VM activity that affects
> that registration, then it can invalidate it directly in the userspace
> cache data structure.
> 
> I guess there are two issues to resolve here:
> 
>  - Are there any races between the kernel marking something invalid
>    and userspace marking it valid again?
> 
>  - How do we keep track of a process's registrations in the kernel so
>    that we can efficiently find which (if any) registrations are
>    affected by a VM operation?  (Maybe you've solved this already)

I think those are solvable problems and a reasonable approach.
There are a few reasons I didn't make this coupling more intimate
from the start:

    - Userspace may want to actively do something when a forced
      unregistration (due to munmap, etc.) happens, such as allowing
      registration of some other block that was not possible before
      due to system limits on the number or total size of
      registrations.  Although I guess it could instead walk its
      cache when it goes to try to pin more and knows that the
      kernel has freed something.

    - Some find this sort of mixing unpalatable.  It constrains
      independent development of the userspace library and kernel
      module.  Backward compatibility might be a problem some day.

    - The VM event message tells userspace to make the call
      back into the device to perform the deregistration.  This is
      really just because of a constraint in the VAPI libraries,
      which I didn't want to have to change.  Perhaps in OpenIB
      it will be more natural for the kernel to call the device to
      do the actual deregistration.  There is some messy error handling
      to deal with, though, and garbage collection in userspace too,
      plus some amount of code duplication.
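
To make the "walk its cache" alternative from the first point concrete,
here is a rough sketch in C.  Everything in it is hypothetical: the
struct layout, the names (reg_entry, reg_cache_reap, reg_cache_add), and
the idea that the kernel clears a per-entry valid flag are assumptions
for illustration, not the VAPI or OpenIB interface.  The actual device
deregistration is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_SLOTS 8

/* Hypothetical cache entry; field names are illustrative only. */
struct reg_entry {
    void *addr;
    size_t len;
    atomic_int valid;  /* 1 = usable, 0 = invalidated (assumed kernel-written) */
    int in_use;        /* slot holds a registration, valid or stale */
};

struct reg_cache {
    struct reg_entry slot[CACHE_SLOTS];
    size_t pinned_bytes;  /* userspace's count of bytes it believes are pinned */
};

/* Walk the cache and drop entries whose valid flag has been cleared.
 * Returns the number of bytes reclaimed. */
size_t reg_cache_reap(struct reg_cache *c)
{
    size_t freed = 0;

    assert(c != NULL);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct reg_entry *e = &c->slot[i];
        if (e->in_use && atomic_load(&e->valid) == 0) {
            freed += e->len;
            e->in_use = 0;
        }
    }
    c->pinned_bytes -= freed;
    return freed;
}

/* Try to record a new registration under a byte limit.  Stale entries
 * are reaped first, so space freed by forced invalidations is noticed
 * lazily, at the point where more memory is about to be pinned.
 * Returns the slot index, or -1 if the limit or the table is exhausted. */
int reg_cache_add(struct reg_cache *c, void *addr, size_t len, size_t limit)
{
    reg_cache_reap(c);
    if (c->pinned_bytes + len > limit)
        return -1;
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct reg_entry *e = &c->slot[i];
        if (!e->in_use) {
            e->addr = addr;
            e->len = len;
            atomic_store(&e->valid, 1);
            e->in_use = 1;
            c->pinned_bytes += len;
            return (int)i;
        }
    }
    return -1;
}
```

With a limit of 150 bytes, a second 100-byte registration fails until
the first entry's valid flag is cleared, after which the next
reg_cache_add() reaps it and succeeds.  The cost is that userspace only
learns about freed space when it next tries to pin something.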

I expect the deregistration time to dominate this entire affair.
Avoiding the interaction with the library when forced invalidations
happen, as you suggest, has the potential to be faster, but the
complexity is a bit daunting.  I'd like to run some profiles
of apps that generate lots of VM activity and see where the time
goes.
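
On the race question quoted above (kernel marking an entry invalid while
userspace marks it valid again), one possible approach is a generation
count packed into the same word as the valid bit.  The sketch below is
only an assumption about how that might look: the names (reg_publish,
reg_invalidate) are made up, and the "kernel" side is simulated in
userspace.  It presumes the kernel could do an atomic compare-and-swap
on the shared word, which is not something established in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State word: bit 0 is the valid flag, the upper 31 bits are a
 * generation count that changes on every new registration. */
#define REG_VALID 1u

static inline uint32_t reg_pack(uint32_t gen, bool valid)
{
    return (gen << 1) | (valid ? REG_VALID : 0u);
}

/* Userspace side: publish a fresh registration in this slot.
 * Bumps the generation and sets the valid bit in one atomic step.
 * Returns the new generation, which would be handed to the kernel. */
uint32_t reg_publish(_Atomic uint32_t *state)
{
    uint32_t old, gen;

    assert(state != NULL);
    old = atomic_load(state);
    do {
        gen = (old >> 1) + 1;
    } while (!atomic_compare_exchange_weak(state, &old, reg_pack(gen, true)));
    return gen;
}

/* Simulated kernel side: invalidate the registration with generation
 * 'gen'.  If userspace has already reused the slot, the generation no
 * longer matches, the compare-and-swap fails, and the newer
 * registration is left alone.  Returns true if it was invalidated. */
bool reg_invalidate(_Atomic uint32_t *state, uint32_t gen)
{
    uint32_t expected = reg_pack(gen, true);

    return atomic_compare_exchange_strong(state, &expected,
                                          reg_pack(gen, false));
}

bool reg_is_valid(_Atomic uint32_t *state)
{
    return (atomic_load(state) & REG_VALID) != 0;
}
```

A late invalidation aimed at an old generation then becomes a no-op
rather than knocking out a registration that userspace has just
re-established.  The generation wraps after 2^31 registrations of the
same slot, which a real design would have to consider.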

                -- Pete

