Limiting OpenAI’s token usage isn’t desirable merely because it is cheaper and faster; it also frees up more of the overall context window. OpenAI caps this at 8,000 tokens for most models, with a possibility of up to 32,000 tokens (if you are lucky enough to get invited, presumably). I haven’t seen anybody offering 32K-token computations nor showcasing them, so in the meantime optimizing token usage makes even more sense.
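One practical consequence: it helps to estimate a prompt’s token count before sending it. For exact counts you would use OpenAI’s tiktoken tokenizer; the sketch below is a hedged, stdlib-only approximation using the common “roughly 4 characters per token” heuristic for English text, with the limit and reserve values chosen for illustration only.

```python
# Rough token budgeting before an API call. The ~4 chars/token ratio is a
# common heuristic for English prose; for exact counts use OpenAI's tiktoken.
CONTEXT_LIMIT = 8_000      # the cap mentioned above for most models
RESPONSE_RESERVE = 1_000   # illustrative: tokens kept free for the completion

def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str) -> bool:
    """True if the prompt likely leaves room for the reserved response."""
    return estimate_tokens(prompt) + RESPONSE_RESERVE <= CONTEXT_LIMIT

prompt = "Summarize the following document: " + "lorem ipsum " * 200
print(estimate_tokens(prompt), fits_in_context(prompt))
```

A check like this lets you trim or chunk input up front instead of discovering the overflow from an API error.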