Click the Create new Knowledge Base button.
Upload your file and select your embedding model.
Here you can choose the embedding model, the chunk size, and the overlap between chunks. After you have set all three options, select the “Process File(s)” button.
The configurations are:
Embedding model: Your preferred embedding model. The embedding models we support by default include:
OpenAI (served by Datasaur)
Amazon Bedrock (served by Datasaur)
Vertex AI (served by Datasaur)
Chunk size: The maximum number of characters that a chunk can contain. The larger the number, the bigger each chunk will be, allowing more data to be included within it.
Overlap: The number of characters that should overlap between two adjacent chunks. The larger the overlap, the more information each chunk shares with its neighboring chunks.
Advanced settings: Additional settings that help you organize your data by letting you provide information about the file through the File Properties feature.
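To build intuition for how chunk size and overlap interact, the following is a minimal Python sketch of character-based chunking. It is only an illustration and not necessarily how Datasaur splits files internally; the function name `chunk_text` and its parameters are hypothetical.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Adjacent chunks share `overlap` characters. This is an illustrative
    sketch, not the product's actual chunking logic.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    # Each new chunk starts this many characters after the previous one.
    step = chunk_size - overlap

    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    # Chunk size 4 with overlap 2: each chunk repeats the last
    # two characters of the chunk before it.
    print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

In this sketch, a larger overlap means a smaller step between chunk starts, so the same file produces more chunks, each sharing more text with its neighbors.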