Enabling Cache on a Deployment
Deployment generations can be cached to reduce processing time and cost. When an incoming input matches one that is already cached within the Deployment, the stored response is returned directly without triggering a new generation. To enable caching, go to a Deployment > Settings and select Enable Caching.
Caching happens Deployment-wide and currently doesn't support image models.
The cache only returns a stored response when the incoming input is an exact match for a previously cached input.
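
To illustrate what exact matching implies in practice, here is a minimal Python sketch of an exact-match cache. It is not the platform's implementation: the names `cache_key`, `ExactMatchCache`, and `respond`, the payload shape, and the use of a SHA-256 hash over the serialized input are all assumptions made for illustration.

```python
import hashlib
import json


def cache_key(deployment_id: str, payload: dict) -> str:
    """Hypothetical key: a hash of the deployment ID plus the full input."""
    # sort_keys makes the key independent of dict ordering, but any change
    # in content (letter case, whitespace, an extra field) changes the key.
    canonical = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    raw = f"{deployment_id}:{canonical}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class ExactMatchCache:
    """Illustrative in-memory cache; assumed design, not the real service."""

    def __init__(self):
        self._store = {}
        self.generations = 0  # counts how often a new generation was needed

    def respond(self, deployment_id: str, payload: dict, generate):
        key = cache_key(deployment_id, payload)
        if key in self._store:
            # Exact match: return the stored response, skip generation.
            return self._store[key]
        # No match: generate once and store the result for next time.
        self.generations += 1
        response = generate(payload)
        self._store[key] = response
        return response
```

In this sketch, sending `{"input": "Hello"}` twice triggers one generation, while `{"input": "hello"}` counts as a different input and triggers another. Under this assumption, inputs that carry varying values such as timestamps or request IDs would never hit the cache.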