How to Ensure Chunk Size Does Not Exceed Model’s Input Token Limit in Oracle AI Vector Search?
Hello everyone,
I am relatively new to semantic search and currently working with Oracle AI Vector Search for Oracle Database 23ai. I need some clarification on how to properly prepare input text for vector generation without exceeding the model's maximum input sequence length.
The AI Vector Search User Guide documents the chunking functions in detail; they are meant to produce semantically coherent chunks that are not too large. However, as far as I can tell, the adjustable chunk size is specified in characters or words, not in input tokens, while embedding models enforce their limits on the number of input tokens rather than on character or word count.
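For concreteness, this is roughly what I am doing now (an illustrative sketch only; the table and column names are made up). The MAX 100 clause caps each chunk at 100 words, but a single word can map to several tokens, so this does not directly bound the token count seen by the model:

```sql
-- Illustrative chunking query; "docs" / "doc_text" are hypothetical names.
-- MAX 100 limits each chunk to 100 words -- not 100 input tokens.
SELECT c.chunk_offset, c.chunk_length, c.chunk_text
FROM   docs d,
       VECTOR_CHUNKS(d.doc_text
                     BY words
                     MAX 100
                     OVERLAP 10
                     SPLIT BY sentence
                     NORMALIZE all) c;
```

So my question is whether there is a recommended way to make the chunker's size limit line up with the embedding model's token limit, rather than guessing at a conservative word count.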