How are Datasources stored?

Helpful information for information security teams on how elvex processes and stores data.

Updated over a week ago

When elvex ingests a Datasource, we launch a background job that looks something like this:

  1. Store the raw file (e.g. PDF or DOCX) in a secure location that is not publicly accessible.

  2. Launch a job that:

    1. Securely downloads the raw file.

    2. Splits the document into "chunks" (each chunk contains a number of sentences).

    3. Processes the chunks further and stores them in elvex's database.
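
The steps above can be pictured with a minimal sketch. This is an illustration only, not elvex's actual implementation: the names `InMemoryStore`, `split_into_chunks`, and `ingest` are hypothetical, the in-memory store stands in for both the private file storage and the database, and the sketch assumes the file's text has already been extracted as UTF-8.

```python
import re
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for private file storage and the chunk database."""
    raw_files: dict = field(default_factory=dict)
    chunks: list = field(default_factory=list)


def split_into_chunks(text, sentences_per_chunk=3):
    """Group sentences into chunks (assumed size; the real chunk size is not documented)."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def ingest(store, file_id, raw_bytes):
    """Run the ingestion steps in order and return the number of chunks stored."""
    # Step 1: store the raw file in the (private) file store.
    store.raw_files[file_id] = raw_bytes
    # Step 2.1: the job reads the raw file back from storage.
    text = store.raw_files[file_id].decode("utf-8")
    # Step 2.2: split the document into chunks of sentences.
    pieces = split_into_chunks(text)
    # Step 2.3: store each chunk alongside a reference to its source file.
    for position, piece in enumerate(pieces):
        store.chunks.append({"file_id": file_id, "position": position, "text": piece})
    return len(pieces)


store = InMemoryStore()
chunk_count = ingest(store, "doc-1", b"One. Two. Three. Four.")
```

In this sketch the raw file and its chunks end up in two separate places, mirroring the raw-file store and the database described in the steps.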

Files provided to elvex are never publicly accessible.

What information is stored in elvex's database?

To support the broad range of use cases elvex has planned, the "chunks" we store in elvex's database include fields like:

  • Metadata about the original file the chunk belonged to.

  • The raw text the chunk was based on (reminder: a chunk is a group of sentences).

  • An embedding for the chunk (a numeric representation of the chunk which captures its meaning).

  • Some additional refined fields which are a bit of our secret sauce to make searching fast and "just work".
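
As a rough picture of what one stored chunk could look like, here is a sketch of a record with the fields listed above. The class name, field names, and example values are all assumptions made for illustration; elvex's real schema is not published here. The `cosine_similarity` helper shows one common way embeddings are compared when searching, and is not necessarily the method elvex uses.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ChunkRecord:
    """Hypothetical shape of one stored chunk (field names are assumptions)."""
    source_metadata: dict            # metadata about the original file
    text: str                        # raw text the chunk was based on
    embedding: list                  # numeric representation of the chunk's meaning
    refined_fields: dict = field(default_factory=dict)  # additional search-related fields


def cosine_similarity(a, b):
    """Compare two embeddings: values near 1.0 mean the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Example values are made up; real embeddings have many more dimensions.
record = ChunkRecord(
    source_metadata={"filename": "handbook.pdf"},
    text="Employees accrue vacation monthly. Unused days roll over.",
    embedding=[1.0, 0.0],
)
same_direction = cosine_similarity(record.embedding, [2.0, 0.0])
unrelated = cosine_similarity(record.embedding, [0.0, 3.0])
```

A search could rank chunks by how similar their embeddings are to the embedding of a query, then use each record's `text` and `source_metadata` to show where the match came from.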

So, in a way, your data is stored twice: once as a raw file and again as chunks in elvex's database.
