Context window
Mostly, but not exactly.
Context window means the model’s total working space for one request/conversation turn:
context window = input tokens + output tokens
So a 128K context window does not mean you can always send 128K tokens of input and still get a long answer. The output has to fit in the same window.
Example:

- Model context window: 128K tokens
- Your input: 100K tokens
- Room left for output: 28K tokens
But there may also be a separate max output token cap:
- Context window: 128K
- Input tokens: 100K
- Remaining room: 28K
- Max output cap: 16K
- Actual max output: 16K (the cap, not the remaining room, is the binding limit here)
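The arithmetic above can be sketched in a few lines. The numbers are the hypothetical ones from the example; real limits vary by model and provider.

```python
# Hypothetical limits from the example above; real values vary by model/provider.
CONTEXT_WINDOW = 128_000   # total tokens shared by input + output
MAX_OUTPUT_CAP = 16_000    # separate provider-imposed cap on output tokens

def max_output_tokens(input_tokens: int) -> int:
    """Tokens actually available for the model's answer."""
    room_left = CONTEXT_WINDOW - input_tokens
    if room_left <= 0:
        raise ValueError("input alone exceeds the context window")
    # Whichever is smaller wins: remaining window space or the output cap.
    return min(room_left, MAX_OUTPUT_CAP)

print(max_output_tokens(100_000))  # 28K remains, but the 16K cap wins -> 16000
print(max_output_tokens(120_000))  # only 8K remains, below the cap -> 8000
```

With a 100K input, the window leaves 28K of room, but the 16K output cap is the tighter constraint; push the input to 120K and the window itself becomes the limit.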
So the terms mean:
| Term | Simple meaning |
| --- | --- |
| Input tokens | What you send in: prompt, code, files, logs, chat history |
| Output tokens | What the model writes back |
| Context | The total text the model can “see” while answering |
| Context window | The maximum token capacity for input + output |
| Token limit | Usually a general term; can mean either the context limit or the output limit |
So: “context window” is not synonymous with input size. It is the total capacity that input and output share.