Efficient Representations for Large Dynamic Sequences in ML (extended abstract)

Abstract

Sequence containers, including stacks, queues, and double-ended queues, are ubiquitous in programming. When the maximal number of elements is not known in advance, containers need to grow dynamically. For this purpose, most ML programs rely on either lists or vectors. These structures are inefficient in terms of both time and space usage. We investigate the use of chunk-based data structures. Such structures save a lot of memory and may deliver better performance than classic container data structures. We observe a 2x speedup compared with vectors, and up to a 3x speedup compared with long lists.
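
To make the idea of a chunk-based container concrete, here is a minimal OCaml sketch of a stack that stores its elements in fixed-capacity arrays (chunks) linked together in a list. It is an illustration only, not the authors' implementation: the names `chunk_capacity`, `chunked_stack`, `create`, `push`, `pop`, and `is_empty`, the chunk size of 256, and the use of a caller-supplied default value to fill unused slots are all assumptions made for this example.

```ocaml
(* Hypothetical sketch of a chunk-based stack; not the paper's code. *)

(* Assumed chunk size, chosen arbitrarily for this example. *)
let chunk_capacity = 256

(* A chunk is a fixed-capacity array together with its fill count. *)
type 'a chunk = { data : 'a array; mutable size : int }

type 'a chunked_stack = {
  default : 'a;                  (* filler value for unused slots *)
  mutable head : 'a chunk;       (* chunk that receives pushes and pops *)
  mutable tail : 'a chunk list;  (* full chunks, most recent first *)
}

let new_chunk default =
  { data = Array.make chunk_capacity default; size = 0 }

let create default =
  { default; head = new_chunk default; tail = [] }

let is_empty s =
  s.head.size = 0 && (match s.tail with [] -> true | _ :: _ -> false)

(* Push in amortized constant time: a new chunk is allocated only
   once every [chunk_capacity] pushes, and no element is ever copied. *)
let push s x =
  if s.head.size = chunk_capacity then begin
    s.tail <- s.head :: s.tail;
    s.head <- new_chunk s.default
  end;
  let c = s.head in
  c.data.(c.size) <- x;
  c.size <- c.size + 1

(* Pop the most recently pushed element; when the head chunk is empty,
   fall back to the most recent full chunk. *)
let pop s =
  if s.head.size = 0 then begin
    match s.tail with
    | [] -> invalid_arg "pop: empty stack"
    | c :: rest -> s.head <- c; s.tail <- rest
  end;
  let c = s.head in
  c.size <- c.size - 1;
  let x = c.data.(c.size) in
  c.data.(c.size) <- s.default;  (* clear the slot so the GC can reclaim x *)
  x
```

In this sketch each element occupies one array slot, and the per-chunk overhead (the record, the array header, and one list cell) is shared by up to `chunk_capacity` elements, whereas an OCaml list spends a three-word block on every element. Unlike a resizable vector, growing the stack never copies existing elements.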

Paper

Arthur Charguéraud and Mike Rainey
ML: ML Family Workshop, September 2017