On the Undecidability of Artificial Intelligence Alignment: Machines that Halt
Published as an arXiv preprint. Submitted for review to the SciRep AI Alignment Collection, 2024
AI alignment is undecidable. Nevertheless, there is an enumerable set of provably aligned AIs that are constructed from a finite set of provably aligned operations. We propose a halting constraint that guarantees that the AI model always reaches a terminal state in a finite number of execution steps.
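As a rough illustration of what a halting constraint can look like in practice, the sketch below runs a transition function under a hard step budget, so the run always stops after a finite number of steps whether or not a terminal state is reached. This is not the construction from the paper: the names `run_bounded`, `step`, `is_terminal`, and `max_steps` are hypothetical, and it assumes that every call to `step` and `is_terminal` itself returns.

```python
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")


def run_bounded(
    step: Callable[[S], S],
    is_terminal: Callable[[S], bool],
    start: S,
    max_steps: int,
) -> Tuple[S, int, bool]:
    """Apply `step` until a terminal state is reached or the budget runs out.

    Returns (final_state, steps_taken, reached_terminal).
    Hypothetical helper: assumes `step` and `is_terminal` always return.
    """
    state = start
    steps = 0
    while not is_terminal(state):
        if steps >= max_steps:
            # Budget exhausted: stop without having reached a terminal state.
            return state, steps, False
        state = step(state)
        steps += 1
    return state, steps, True


# A process that terminates on its own: count down to zero.
countdown = run_bounded(lambda s: s - 1, lambda s: s == 0, start=3, max_steps=10)

# A process that never terminates on its own: the budget forces it to stop.
runaway = run_bounded(lambda s: s + 1, lambda s: False, start=0, max_steps=5)
```

The step budget makes termination a property of the runner rather than something that has to be decided about an arbitrary program, which is the general reason bounded execution sidesteps the halting problem.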
Recommended citation: de Melo, G. A., Maximo, M. R. O. D. A., Soma, N. Y., & de Castro, P. A. L. (2024). On the Undecidability of Artificial Intelligence Alignment: Machines that Halt. arXiv [cs.AI]. Retrieved from http://arxiv.org/abs/2408.08995