Context window

From UVOO Tech Wiki

Is the context window just the input limit? Mostly, but not exactly.

Context window means the model’s total working space for one request/conversation turn:

context window = input tokens + output tokens

So if a model has a 128K context window, that does not mean you can always send 128K tokens of input and still get a long answer. The output also has to fit.

Example:

Model context window: 128K tokens
Your input: 100K tokens
Room left for output: 28K tokens
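The budget arithmetic above can be sketched in a few lines. This is a minimal illustration using the example's numbers (128K window, 100K input); the constant and function names are made up for this sketch, not any real API.

```python
# Sketch of the context-window budget: output room = window - input.
# The 128K figure is the example's, not any particular model's.
CONTEXT_WINDOW = 128_000  # total tokens shared by input and output

def output_budget(input_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for the model's answer after the input is counted."""
    return max(context_window - input_tokens, 0)

print(output_budget(100_000))  # 28000 tokens left for the answer
```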

But there may also be a separate max output token cap:

Context window: 128K
Input tokens: 100K
Remaining room: 28K
Max output cap: 16K

Actual max output: 16K (the cap, not the 28K of remaining room)
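With a separate output cap, the effective limit is the smaller of the two numbers: the room left in the window and the cap itself. A minimal sketch, again with the example's values (names are illustrative, not a real SDK):

```python
# Effective output limit = min(room left in the window, max-output cap).
CONTEXT_WINDOW = 128_000   # shared input + output capacity (example value)
MAX_OUTPUT_CAP = 16_000    # separate per-request cap on generated tokens (example value)

def effective_max_output(input_tokens: int) -> int:
    remaining = max(CONTEXT_WINDOW - input_tokens, 0)
    return min(remaining, MAX_OUTPUT_CAP)

print(effective_max_output(100_000))  # 16000: the 16K cap binds, not the 28K remainder
print(effective_max_output(120_000))  # 8000: now the window remainder binds
```

Note the crossover: below 112K input tokens the cap is the binding limit; above it, the window is.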

So the terms mean:

Input tokens: What you send in (prompt, code, files, logs, chat history)
Output tokens: What the model writes back
Context: The total text the model can "see" while answering
Context window: The maximum token capacity for input + output
Token limit: Usually a general term; may mean the context limit or the output limit

So: the context window is not synonymous with input size. It is the total capacity that input and output share.