See our ICLR '21 paper and talk. Our prototype implementation can be found here.
Dynamic Tensor Rematerialization (DTR) allows deep learning models to be trained in less memory: whenever there is not enough memory for an allocation, DTR uses a heuristic to evict tensors from memory and recomputes them on demand, in effect treating memory as a tensor-level cache. Despite the simplicity of this approach, DTR allows larger models to be trained in the same amount of memory with only modest overhead. We hope to apply our technique to further settings for deep learning applications as well as to other domains.
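To make the evict-and-recompute idea concrete, here is a minimal sketch in Python. It is not our implementation: the `Tensor` and `Pool` classes, the abstract "memory units," and the exact bookkeeping are hypothetical simplifications, though the eviction score follows the spirit of DTR's heuristic (prefer evicting tensors that are cheap to recompute, large, and stale).

```python
# A minimal sketch of a DTR-style evict-and-recompute cache.
# All names here are illustrative, not from the prototype.
import time

class Tensor:
    """A value plus the zero-argument closure that can recompute it."""
    def __init__(self, pool, compute, size):
        self.pool = pool            # memory pool this tensor is cached in
        self.compute = compute      # closure used to (re)materialize the value
        self.size = size            # footprint in abstract memory units
        self.value = None           # None while evicted
        self.cost = 0.0             # measured recompute time (seconds)
        self.last_access = time.perf_counter()

    def materialize(self):
        """Return the value, transparently recomputing it if evicted."""
        if self.value is None:
            self.pool.reserve(self.size)        # make room and count this tensor
            start = time.perf_counter()
            self.value = self.compute()         # may rematerialize parents recursively
            self.cost = time.perf_counter() - start
        self.last_access = time.perf_counter()
        return self.value

class Pool:
    """A fixed-budget, tensor-level cache with a DTR-style eviction score."""
    def __init__(self, budget):
        self.budget = budget
        self.used = 0
        self.tensors = []

    def new_tensor(self, compute, size):
        t = Tensor(self, compute, size)
        self.tensors.append(t)
        t.materialize()
        return t

    def evict_one(self):
        # Evict the resident tensor minimizing cost / (size * staleness):
        # cheap to recompute, large in memory, and not accessed recently.
        resident = [t for t in self.tensors if t.value is not None]
        if not resident:
            raise MemoryError("budget too small for a single allocation")
        now = time.perf_counter()
        victim = min(resident,
                     key=lambda t: t.cost / (t.size * (now - t.last_access + 1e-9)))
        victim.value = None         # dropped from memory, recomputable on demand
        self.used -= victim.size

    def reserve(self, size):
        while self.used + size > self.budget:
            self.evict_one()
        self.used += size
```

A short usage example: with a budget of 4 units and three 2-unit tensors, creating the third forces an eviction, and any later access to the victim recomputes it transparently.

```python
pool = Pool(budget=4)
a = pool.new_tensor(lambda: list(range(1000)), size=2)
b = pool.new_tensor(lambda: [x * 2 for x in a.materialize()], size=2)
c = pool.new_tensor(lambda: [x + 1 for x in b.materialize()], size=2)  # evicts a tensor
print(a.materialize()[:3])  # recomputed on demand if a was the victim
```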
DTR has been adopted by the MegEngine framework.